Before data is fed into a data analysis or machine learning pipeline, it is commonly preprocessed: cleaned, filtered, restructured, mined for metadata, etc. Many tools and libraries aid this process, each with different strengths and functionality (DALI, RAPIDS, HoloClean, DAPHNE, DuckDB, etc.). In this project, we would like to analyze pros/cons of some of …
Supervisors:
Pınar Tözün
Semester: Fall 2022
Tags: data preprocessing libraries, heterogeneous hardware, machine learning