Tagged with: Machine Learning


PROPOSAL

Query optimization is crucial for any data management system to achieve good performance. Recent advancements in AI have led academia and industry to investigate learning-based techniques in query optimization. In particular, many works propose replacing the cost model used during plan enumeration with a machine learning model (typically a regression model) that estimates the runtime of a query …
Supervisors: Zoi Kaoudi
Semester: Fall 2024
Tags: machine learning, database, query optimization, ranking

PROPOSAL

Query optimization is crucial for any data management system to achieve good performance. Recent advancements in AI have led academia and industry to investigate learning-based techniques in query optimization. In particular, many works propose replacing the cost model used during plan enumeration with a machine learning model that estimates the runtime of a plan. However, to build such a model …
Supervisors: Zoi Kaoudi
Semester: Fall 2024
Tags: machine learning, training data, query optimizer

PROPOSAL

Hospitals are under pressure to implement AI systems that promise to improve diagnoses and save doctors time. One use case is the automation of protocoling based on a physician referral. Currently, this requires a referral letter from a physician who has examined a patient and determined that additional imaging studies are needed. In this case, the physician …
Supervisors: Veronika Cheplygina
Semester: Fall 2023
Tags: machine learning, medical imaging, data analysis

PROPOSAL

In medical imaging, multi-task learning can be used to train a model that jointly predicts both a diagnosis and other patient characteristics, such as demographic variables. Among others, this strategy has frequently been used for diagnosis of Alzheimer’s from brain MR scans, with age as an additional variable; see Zhang et al. for an example. The idea is that both the disease, and age, …
Supervisors: Veronika Cheplygina
Semester: Fall 2022
Tags: machine learning, medical imaging, data analysis, fairness

PROPOSAL

Concept Bottleneck Models [1] are designed to leverage high-level concepts. They revisit the classic idea of first predicting concepts that are provided at training time, and then using these concepts to predict the label. By construction, it is possible to intervene on these concept bottleneck models by editing their predicted concept values and propagating these changes to the final prediction. …
Supervisors: Amelia Jiménez-Sánchez
Semester: Fall 2023
Tags: machine learning, data science, medical imaging

PROPOSAL

A medical Visual Question Answering (VQA) system can provide meaningful references for both doctors and patients during the treatment process. Compared with normal images, a learning setting with medical images is more challenging due to limited amounts of data, class imbalance, and the presence of label noise for diagnosis tasks. Moreover, little attention is paid to how the images and meta-data is …
Supervisors: Amelia Jiménez-Sánchez
Semester: Fall 2023
Tags: medical imaging, deep learning, machine learning, transfer learning, meta-learning

PROPOSAL

Machine learning models, especially larger models used on, for example, image or text datasets, can be expensive to train. During development, models are usually trained multiple times, for example to optimize hyperparameters, which can result in a large carbon footprint. This project focuses specifically on medical data. There are some recent efforts, for example by Selvan et …
Supervisors: Veronika Cheplygina
Semester: Fall 2023
Tags: machine learning, medical imaging, data analysis, resource consumption

PROPOSAL

There have been several situations where machine learning classifiers, trained to diagnose a particular disease (for example, lung cancer from chest x-rays), overfit on hidden features within the data. Examples include gridlines, surgical markers, evidence of treatment, or text present in the images (see references for examples). This causes the classifier to fail on other types of images. …
Supervisors: Veronika Cheplygina, Amelia Jiménez-Sánchez
Semester: Fall 2023
Tags: machine learning, data science, medical imaging

PROPOSAL

It is common to process data to clean it, filter it, restructure it, get metadata out of it, etc. before feeding the data into a data analysis or machine learning pipeline. There are many tools and libraries out there to aid this process, with different strengths and functionality (DALI, RAPIDS, HoloClean, DAPHNE, DuckDB, etc.). In this project, we would like to analyze pros/cons of some of …
Supervisors: Pınar Tözün
Semester: Fall 2022
Tags: data preprocessing libraries, heterogeneous hardware, machine learning

PROPOSAL

State-of-the-art machine learning models are known to be compute- and power-hungry. On the other hand, modern servers come equipped with powerful CPU-GPU co-processors. Not all machine learning models are able to use all the available hardware resources on such servers. Workload collocation is a mechanism to increase hardware utilization when a single workload is not able to utilize all the …
Supervisors: Pınar Tözün
Semester: Fall 2022
Tags: benchmarking, workload collocation, machine learning