(MSc Research Project / MSc Thesis)
The goal of this project is to research how AI and large language models generate datasets. Research questions include: Where does the generated data come from? Are sources available on the internet or can they be found? What biases exist in the generated data? And how much of the data is simply wrong? Generated datasets are used in many fields in practice, …
Supervisors:
Martin Hentschel
Semester: Fall 2025
Tags: training data, machine learning, LLMs
Knowledge graphs (KGs) are extensively used in many application domains, such as search engines, product recommendation, and bioinformatics. Knowledge graph completion (a.k.a. link prediction), i.e., the task of inferring missing information from knowledge graphs, is a widely used task in the above applications. This project will investigate how to loosely couple the data-driven power of knowledge …
Supervisors:
Zoi Kaoudi
Semester: Fall 2025
Tags: knowledge graph, LLMs, reasoning
Are you interested in working with an open-source big data project and AI?
You are welcome to conduct your thesis/project in the context of Apache Wayang. Apache Wayang is the first cross-platform framework that allows users to specify their task/query in a system-agnostic manner; Wayang then determines which system or systems are best suited to execute the task, with the goal of optimizing performance. For …
Supervisors:
Zoi Kaoudi
Semester: Fall 2025
Tags: big data, AI, LLMs, cross-platform data processing, open source, Apache