title: "Big data processing with Apache Wayang"
type: "proposal"
date: 2023-10-25
semester: "Spring 2024"
supervisor: [ "Zoi Kaoudi" ]
tags: [ "big data", "database", "cross-platform data processing", "open source", "Apache" ]

Are you interested in working with a big data open source project?

You are welcome to conduct your thesis/project on Apache Wayang. Apache Wayang is the first cross-platform framework that lets users specify a task or query in a system-agnostic manner. Wayang then determines which system, or combination of systems, should execute the task, with the goal of optimizing performance. For a general overview, see this paper.

Potential intended learning outcomes, depending on the concrete topic:

  • Ability to contribute to large open source codebases
  • Ability to use new technology
  • Ability to use data management techniques to improve system performance
  • Ability to leverage machine learning techniques to improve system performance

Prerequisites: good programming skills in Java; preferably, knowledge of big data systems (e.g., Apache Flink, Apache Spark).