PROPOSAL

Finding hidden features responsible for machine learning failures

Supervisors: Veronika Cheplygina, Amelia Jiménez-Sánchez
Semester: Spring 2025
Tags: machine learning, data science, medical imaging

There have been several cases where machine learning classifiers trained to diagnose a particular disease (for example, lung cancer from chest x-rays) overfit to hidden features in the data. Examples include gridlines, surgical markers, evidence of treatment, and text present in the images (see references for examples). This causes the classifier to fail on other types of images. Although these “hidden features” are often visible, their presence is not documented in the dataset.
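
As a toy illustration of this failure mode (not taken from any of the referenced studies), the sketch below reduces each "image" to two made-up numbers: a noisy disease signal and a marker flag. The data generator, the noise level, and the decision-stump classifier are all assumptions chosen for brevity. In the training set the marker always coincides with the label, so the classifier latches onto it; in the shifted set the marker is unrelated to the label.

```python
import random


def make_dataset(n, marker_follows_label, seed):
    """Hypothetical data: each sample is ((signal, marker), label)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Genuine but noisy evidence of disease.
        signal = label + rng.gauss(0, 1.0)
        # Hidden feature, e.g. a surgical marker present in the image.
        if marker_follows_label:
            marker = float(label)
        else:
            marker = float(rng.randint(0, 1))
        data.append(((signal, marker), label))
    return data


def accuracy(data, feature, threshold):
    """Accuracy of the rule: predict disease if x[feature] > threshold."""
    correct = sum((x[feature] > threshold) == bool(y) for x, y in data)
    return correct / len(data)


def fit_stump(data):
    """Choose the single feature and threshold with the best training accuracy."""
    best_feature, best_threshold, best_acc = None, None, -1.0
    for feature in (0, 1):
        for x, _ in data:
            acc = accuracy(data, feature, x[feature])
            if acc > best_acc:
                best_feature, best_threshold, best_acc = feature, x[feature], acc
    return best_feature, best_threshold, best_acc


# Training data: the marker always coincides with the label (the shortcut).
train = make_dataset(300, marker_follows_label=True, seed=0)
# Shifted data: the marker is independent of the label.
shifted = make_dataset(2000, marker_follows_label=False, seed=1)

feature, threshold, train_acc = fit_stump(train)
shortcut_acc = accuracy(shifted, feature, threshold)
# For comparison: a rule on the genuine signal, thresholded halfway between classes.
signal_acc = accuracy(shifted, 0, 0.5)

print("chosen feature (0 = signal, 1 = marker):", feature)
print("training accuracy:", train_acc)
print("accuracy on shifted data using the shortcut:", shortcut_acc)
print("accuracy on shifted data using the signal:", signal_acc)
```

The stump selects the marker because it separates the training labels perfectly, yet on the shifted data that rule does no better than guessing, while the weaker but genuine signal still carries over. Real shortcuts in medical images are harder to spot because the marker is embedded in pixels rather than given as an explicit column.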

So far, we have had several successful projects on detecting such hidden features in chest x-rays. We are looking to extend this set of projects with:

Other types of imaging data
Unsupervised methods to detect subgroups in the images
Explainability techniques to understand the shortcuts better
Integrating tabular data (for example patient demographics)

Multiple projects are possible; groups of two are preferred. You must have experience with deep learning and with the HPC at ITU.

References

Jiménez-Sánchez, A., Juodelyte, D., Chamberlain, B., & Cheplygina, V. (2022). Detecting Shortcuts in Medical Images - A Case Study in Chest X-rays. arXiv preprint arXiv:2211.04279.

Varoquaux, G., & Cheplygina, V. (2022). Machine learning for medical imaging: methodological failures and recommendations for the future. NPJ digital medicine, 5(1), 48.

Oakden-Rayner, L., Dunnmon, J., Carneiro, G., & Ré, C. (2020). Hidden stratification causes clinically meaningful failures in machine learning for medical imaging. In Proceedings of the ACM conference on health, inference, and learning (pp. 151-159).

Winkler, J. K., Fink, C., Toberer, F., Enk, A., Deinlein, T., Hofmann-Wellenhof, R., … & Haenssle, H. A. (2019). Association between surgical skin markings in dermoscopic images and diagnostic performance of a deep learning convolutional neural network for melanoma recognition. JAMA dermatology, 155(10), 1135-1141.

Pacheco, A. G., Lima, G. R., Salomao, A. S., Krohling, B., Biral, I. P., de Angelo, G. G., … & de Barros, L. F. (2020). PAD-UFES-20: A skin lesion dataset composed of patient data and clinical images collected from smartphones. Data in brief, 32, 106221.