Big data processing with Apache Wayang

Supervisors: Zoi Kaoudi
Semester: Fall 2024
Tags: big data, database, cross-platform data processing, open source, Apache

Are you interested in working with a big data open source project?

You are welcome to conduct your thesis or project on Apache Wayang. Apache Wayang is the first cross-platform framework that lets users specify a task or query in a system-agnostic manner; Wayang then determines which system, or combination of systems, should execute it, with the goal of optimizing performance. For a general overview, see this paper.
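
To make the idea of cost-based platform selection concrete, here is a minimal, self-contained Java sketch. It does not use Wayang's API and is not how Wayang's optimizer is implemented. The class PlatformChoiceSketch, the Platform record, the linear cost model, and all numbers are hypothetical and chosen only for illustration. It requires Java 16 or later.

```java
import java.util.Comparator;
import java.util.List;

public class PlatformChoiceSketch {

    /** A hypothetical execution platform with a toy linear cost model. */
    record Platform(String name, double startupCostMs, double costPerRecordMs) {
        double estimateCostMs(long records) {
            return startupCostMs + costPerRecordMs * records;
        }
    }

    /** Picks the platform with the lowest estimated cost for the given input size. */
    static Platform choose(List<Platform> platforms, long records) {
        return platforms.stream()
                .min(Comparator.comparingDouble(p -> p.estimateCostMs(records)))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // Made-up cost parameters: the cluster is slow to start but cheap per record.
        List<Platform> platforms = List.of(
                new Platform("single-node", 10.0, 0.01),
                new Platform("cluster", 5000.0, 0.0001));

        // The same task description is mapped to different platforms as the input grows.
        for (long n : new long[] {1_000L, 1_000_000_000L}) {
            System.out.println(n + " records -> " + choose(platforms, n).name());
        }
    }
}
```

With these made-up parameters, the small input goes to the single-node platform and the large input goes to the cluster. A real cross-platform optimizer also has to account for things this sketch ignores, such as moving data between platforms and splitting one task across several of them.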

Potential Intended Learning Outcomes, depending on the concrete topic:

  • Ability to contribute to large open source codebases
  • Ability to use new technology
  • Ability to use data management techniques to improve systems performance
  • Ability to leverage machine learning techniques to improve systems performance

Prerequisites: good programming skills in Java; preferably, knowledge of big data systems (e.g., Apache Flink, Apache Spark).