Abstract
Aims
Artificial intelligence (AI) methods are increasingly being used for the automated segmentation of cine cardiac magnetic resonance (CMR) imaging. However, these methods have been shown to be subject to race bias, i.e., they exhibit different levels of performance for different races, depending on the (im)balance of the data used to train the AI model. In this paper we investigate the source of this bias, seeking to understand its root cause(s).
Methods and Results
We trained AI models to perform race classification on cine CMR images and/or segmentations from White and Black subjects from the UK Biobank and found that the classification accuracy for images was higher than for segmentations. Interpretability methods showed that the models were primarily looking at non-heart regions. Cropping images tightly around the heart caused classification accuracy to drop to almost chance level. Visualising the latent space of AI segmentation models showed that race information was encoded in the models. Training segmentation models using cropped images reduced but did not eliminate the bias. A number of possible confounders for the bias in segmentation model performance were identified for Black subjects but none for White subjects.
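The tight cropping step described above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it assumes a 2D image and a binary heart segmentation mask of the same shape, and the function name and `margin` parameter are hypothetical.

```python
import numpy as np

def crop_around_heart(image, seg_mask, margin=10):
    """Crop an image tightly around the heart region defined by a
    binary segmentation mask, with an optional pixel margin.

    Non-zero pixels in `seg_mask` are taken to mark the heart."""
    ys, xs = np.nonzero(seg_mask)
    # Bounding box of the mask, expanded by `margin` and clipped
    # to the image borders.
    y0 = max(ys.min() - margin, 0)
    y1 = min(ys.max() + margin + 1, image.shape[0])
    x0 = max(xs.min() - margin, 0)
    x1 = min(xs.max() + margin + 1, image.shape[1])
    return image[y0:y1, x0:x1]
```

Cropping like this removes most non-heart anatomy (e.g. subcutaneous fat), which is what caused race-classification accuracy to fall to near chance in the experiments described above.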
Conclusions
Distributional differences between annotated CMR data of White and Black races, which can lead to bias in trained AI segmentation models, are predominantly image-based, not segmentation-based. Most of the differences occur in areas outside the heart, such as subcutaneous fat. These findings will be important for researchers investigating performance of AI models on different races.
| Original language | English |
|---|---|
| Journal | European Heart Journal - Digital Health |
| DOIs | |
| Publication status | Published - 24 Feb 2025 |