PROPOSAL
Big data processing with Apache Wayang
Are you interested in working on an open-source big data project?
You are welcome to conduct your thesis or project on Apache Wayang. Apache Wayang is the first cross-platform framework that lets users specify a task or query in a system-agnostic manner. Wayang then determines which system, or combination of systems, is best suited to execute the task, with the goal of optimizing performance. For a general overview, check this paper.
Potential intended learning outcomes, depending on the concrete topic:
- Ability to contribute to large open-source codebases
- Ability to use new technology
- Ability to use data management techniques to improve systems performance
- Ability to leverage machine learning techniques to improve systems performance
Prerequisites: good programming skills in Java; preferably, knowledge of big data systems (e.g., Apache Flink, Apache Spark).