PROPOSAL

Debiasing medical image datasets to improve machine learning robustness and fairness


Supervisors: Amelia Jiménez-Sánchez, Eike Petersen, Veronika Cheplygina
Semester: Fall 2024
Tags: machine learning, data science, medical imaging

It has been observed that deep learning models can identify patient characteristics such as age, sex, and self-reported race with high accuracy from medical images such as chest x-rays, even when medical doctors cannot. This raises the risk that such models learn to (falsely) diagnose patients from different demographic groups differently, even when they present with the same disease characteristics. This may happen because, for example, some groups tend to be misdiagnosed or underdiagnosed in the datasets on which such models are trained; a model would then (undesirably) learn and reinforce these existing healthcare-system biases.

Preventing models from learning such demographic shortcuts is nontrivial, especially since the mechanisms by which they identify, e.g., patient race are still not well understood. In this project, we will explore whether careful and targeted image normalization and preprocessing techniques can help mitigate the impact of such biases in the chest x-ray analysis domain. The project will involve:

  • An exploration and implementation of (standard) image preprocessing techniques for chest x-ray images (see the first sketch after this list)
  • An investigation of the degree to which these techniques reduce the potential for models to identify, e.g., patient race (see the second sketch below)
  • An investigation of how these preprocessing steps affect overall model performance and robustness
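
As a concrete starting point for the first task, the sketch below illustrates the kind of preprocessing pipeline the project might begin from. It is a minimal illustration in Python using scikit-image; the function name, parameter values, and the particular combination of steps are our assumptions, not a fixed project specification.

    import numpy as np
    from skimage import exposure, transform

    def preprocess_cxr(image: np.ndarray, size: int = 224) -> np.ndarray:
        """Resize and normalize a grayscale chest x-ray given as a float array in [0, 1]."""
        # Resize to a common resolution used by most ImageNet-style backbones.
        image = transform.resize(image, (size, size), anti_aliasing=True)
        # Global histogram equalization: removes coarse brightness/contrast
        # differences between acquisitions and devices.
        image = exposure.equalize_hist(image)
        # CLAHE (contrast-limited adaptive histogram equalization): equalizes
        # contrast locally, which may further suppress scanner- or
        # site-specific intensity cues.
        image = exposure.equalize_adapthist(image, clip_limit=0.03)
        # Per-image z-score normalization before feeding the model.
        return (image - image.mean()) / (image.std() + 1e-8)

Whether each of these steps helps or hurts is exactly what the project would evaluate; Burns et al. (reference 4) suggest that even simple pixel-intensity statistics carry demographic signal, which makes intensity normalization a natural first candidate.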

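For the second task, one simple probe, inspired by the pixel-intensity-count approach of Burns et al. (reference 4), is to check how well a linear classifier can recover a demographic label from per-image intensity histograms before versus after preprocessing. The sketch below shows one possible implementation using scikit-learn; the feature choice, classifier, and all names are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def intensity_histogram(image: np.ndarray, bins: int = 64) -> np.ndarray:
        """Normalized pixel-intensity histogram of one image (assumed scaled to [0, 1])."""
        hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
        return hist / max(hist.sum(), 1)

    def attribute_auc(images: list[np.ndarray], labels: np.ndarray) -> float:
        """Cross-validated ROC AUC for predicting a binary demographic label.

        An AUC near 0.5 after preprocessing suggests the intensity-based
        shortcut has been weakened; a high AUC means the signal persists.
        """
        features = np.stack([intensity_histogram(img) for img in images])
        scores = cross_val_score(
            LogisticRegression(max_iter=1000), features, labels,
            cv=5, scoring="roc_auc",
        )
        return float(scores.mean())

The same evaluation machinery carries over to the third task: replacing the demographic label with the disease label, computed overall and per subgroup, gives a direct measure of how the preprocessing trades shortcut removal against diagnostic performance.
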
Groups of two are preferred, but individual projects are also possible. You must have experience with deep learning and with the HPC at ITU. Veronika Cheplygina might join some of the meetings, but the main supervisors will be Amelia and Eike.

References

  1. Gichoya, Banerjee, et al. (2022). AI recognition of patient race in medical imaging: a modelling study. The Lancet Digital Health. https://doi.org/10.1016/S2589-7500(22)00063-2
  2. Glocker, Jones, Bernhardt, Winzeck (2023). Algorithmic encoding of protected characteristics in chest X-ray disease detection models. eBioMedicine. https://doi.org/10.1016/j.ebiom.2023.104467
  3. Wang, Chaudhari, Davatzikos (2023). Bias in machine learning models can be significantly mitigated by careful training: Evidence from neuroimaging studies. PNAS. https://doi.org/10.1073/pnas.2211613120
  4. Burns et al. (2023). Ability of artificial intelligence to identify self-reported race in chest x-ray using pixel intensity counts. Journal of Medical Imaging. https://doi.org/10.1117/1.JMI.10.6.061106
  5. Brown et al. (2023). Detecting shortcut learning for fair medical AI using shortcut testing. Nature Communications. https://doi.org/10.1038/s41467-023-39902-7
  6. Yang, Zhang, Gichoya, Katabi, Ghassemi (2024). The limits of fair medical imaging AI in real-world generalization. Nature Medicine. https://doi.org/10.1038/s41591-024-03113-4
  7. Banerjee et al. (2024). “Shortcuts” Causing Bias in Radiology Artificial Intelligence: Causes, Evaluation, and Mitigation. Journal of the American College of Radiology. https://doi.org/10.1016/j.jacr.2023.06.025