PROPOSAL
Learning-to-rank methods for query optimization
Query optimization is crucial for any data management system to achieve good performance. Recent advances in AI have led academia and industry to investigate learning-based techniques for query optimization. In particular, many works propose replacing the cost model used during plan enumeration with a machine learning model (typically a regression model) that estimates the runtime of a query plan. However, it is well known that what really matters in query optimization is the relative order of the alternative query plans, not their actual cost or runtime: the optimizer only needs to identify which candidate plan is best, so a model that orders the plans correctly suffices even if its absolute estimates are inaccurate.
This project/thesis will investigate learning-to-rank approaches for query optimization. Concretely, it can entail exploring different query-plan feature representations and different model architectures for ranking query plans, as well as an evaluation against state-of-the-art models, such as [1, 2].
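To make the idea concrete, the following is a minimal sketch of one possible pairwise learning-to-rank formulation (a RankNet-style logistic loss), not the method of any particular system. The plan features, the runtimes, the linear scoring function, and all names in the code are hypothetical and chosen only for illustration; an actual project would use learned plan representations and a deep learning framework.

```python
import math

# Toy data (hypothetical): each plan is (feature vector, observed runtime in seconds).
# The three features stand in for, e.g., scaled join count, scaled cardinality, index use.
plans = [
    ([0.2, 0.1, 1.0], 1.0),
    ([0.5, 0.4, 1.0], 2.5),
    ([0.7, 0.9, 0.0], 6.0),
    ([0.9, 0.6, 0.0], 4.0),
]

def score(weights, features):
    """Linear scoring function; a lower score means 'predicted to be faster'."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, faster, slower):
    """Logistic pairwise loss: small when `faster` is scored below `slower`."""
    margin = score(weights, slower) - score(weights, faster)
    return math.log1p(math.exp(-margin))

def make_pairs(plans):
    """All ordered pairs (faster plan features, slower plan features)."""
    return [(a, b) for a, ta in plans for b, tb in plans if ta < tb]

def train(pairs, dim, lr=0.1, epochs=200):
    """Plain stochastic gradient descent on the pairwise loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for faster, slower in pairs:
            margin = score(weights, slower) - score(weights, faster)
            # Derivative of log(1 + exp(-margin)) with respect to margin.
            grad = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                weights[i] -= lr * grad * (slower[i] - faster[i])
    return weights

pairs = make_pairs(plans)
weights = train(pairs, dim=3)

# Rank plan indices from predicted fastest to predicted slowest.
ranking = sorted(range(len(plans)), key=lambda i: score(weights, plans[i][0]))
print(ranking)
```

Note that the model is trained only on which plan of a pair is faster, never on the runtime values themselves; this is the key difference from a regression-based cost model.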
Prerequisites: programming skills in Python; good knowledge of deep learning frameworks (e.g., PyTorch or TensorFlow); and (preferably) knowledge of the internals of database systems (query optimization).