Learning-to-rank methods for query optimization

Supervisors: Zoi Kaoudi
Semester: Fall 2024
Tags: machine learning, database, query optimization, ranking

Query optimization is crucial for any data management system to achieve good performance. Recent advancements in AI have led academia and industry to investigate learning-based techniques for query optimization. In particular, many works propose replacing the cost model used during plan enumeration with a machine learning model (typically a regression model) that estimates the runtime of a query plan. However, it is well known that what really matters in query optimization is the relative order of the alternative query plans, not their actual cost or runtime.
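The point about relative order can be made concrete with a small, purely illustrative sketch. The plan names, runtimes, and estimates below are made-up numbers, not measurements: one hypothetical estimator has a small absolute error but swaps the two best plans, while the other is off by a factor of ten yet preserves the order and therefore still picks the fastest plan.

```python
# Hypothetical true runtimes (seconds) of three alternative plans for one query.
true_runtime = {"plan_a": 10.0, "plan_b": 12.0, "plan_c": 30.0}

# Estimator 1 (made-up numbers): small absolute errors, but it swaps
# the order of the two fastest plans.
estimates_close = {"plan_a": 11.5, "plan_b": 11.0, "plan_c": 30.0}

# Estimator 2 (made-up numbers): underestimates everything by 10x,
# but the relative order of the plans is preserved.
estimates_scaled = {"plan_a": 1.0, "plan_b": 1.2, "plan_c": 3.0}


def mean_abs_error(estimates, truth):
    """Average absolute difference between estimated and true runtime."""
    return sum(abs(estimates[p] - truth[p]) for p in truth) / len(truth)


def ranking(estimates):
    """Plans ordered from lowest to highest estimated runtime."""
    return sorted(estimates, key=estimates.get)


def chosen_plan(estimates):
    """The plan an optimizer would pick: the one with the lowest estimate."""
    return ranking(estimates)[0]


for name, est in [("close", estimates_close), ("scaled", estimates_scaled)]:
    print(name, "MAE:", round(mean_abs_error(est, true_runtime), 2),
          "picks:", chosen_plan(est))
```

Under these assumed numbers, the estimator with the lower regression error selects the slower plan, whereas the badly calibrated one selects the truly fastest plan.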

This project/thesis will investigate learning-to-rank approaches for query optimization. Concretely, it can entail exploring different query-plan feature representations, different model architectures for ranking query plans, and an evaluation against state-of-the-art models, such as [1, 2].
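As a rough idea of what a learning-to-rank formulation could look like, here is a minimal sketch of one common option, a pairwise logistic (RankNet-style) loss. It is only an illustration under stated assumptions, not the method this project prescribes: the two-dimensional plan features and runtimes are invented, and a linear scorer in plain Python stands in for whatever model architecture and deep learning framework would actually be used.

```python
import math
import random

# Hypothetical training data: (plan feature vector, observed runtime in seconds).
# Both the features and the runtimes are invented for illustration.
plans = [
    ([1.0, 0.2], 5.0),
    ([2.0, 0.1], 9.0),
    ([0.5, 0.9], 3.0),
    ([3.0, 0.5], 14.0),
]


def score(w, x):
    """Linear scoring function; a higher score means 'expected to be faster'."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(plans, epochs=200, lr=0.1, seed=0):
    """Fit weights with SGD on the pairwise logistic loss
    log(1 + exp(-(score(fast) - score(slow)))) over all (fast, slow) pairs."""
    rng = random.Random(seed)
    w = [0.0] * len(plans[0][0])
    # Every ordered pair in which the first plan ran faster than the second.
    pairs = [(a, b) for a, ra in plans for b, rb in plans if ra < rb]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for fast, slow in pairs:
            margin = score(w, fast) - score(w, slow)
            # Derivative of the loss with respect to the margin.
            grad = -1.0 / (1.0 + math.exp(margin))
            for i in range(len(w)):
                w[i] -= lr * grad * (fast[i] - slow[i])
    return w


def rank_plans(w, plans):
    """Order plans from best (highest score) to worst."""
    return sorted(plans, key=lambda p: score(w, p[0]), reverse=True)


weights = train_pairwise(plans)
for features, runtime in rank_plans(weights, plans):
    print(features, runtime)
```

Note that the model never predicts a runtime: it is trained only on which plan of each pair was faster, so its scores are meaningful only relative to one another.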

Prerequisites: programming skills in Python; good knowledge of deep learning frameworks (e.g., PyTorch or TensorFlow); and (preferably) knowledge of the internals of database systems (query optimization).