Workload Characterization for Machine Learning
A data science infrastructure orchestrates the execution of widely used machine learning frameworks (e.g., TensorFlow, PyTorch) on a heterogeneous set of processing units (e.g., CPU, GPU, TPU, FPGA) while powering an increasingly diverse and complex range of applications (e.g., fraud detection, healthcare, virtual assistance, autonomous driving). Understanding the resource consumption characteristics of different data science workloads is a necessary step before designing effective hardware resource management for any data science infrastructure. Several thorough workload characterization studies exist for traditional data-intensive workloads (online transaction processing, online analytical processing, etc.) running on CPUs. This is not the case for modern data science applications, however, since the data science ecosystem and its benchmarks are still maturing rapidly. Existing studies are not representative of machine learning workloads, as they neither cover a relevant range of applications nor target heterogeneous hardware environments.
Our aim with this broad project topic is to shed light on the behavior of various machine learning workloads and frameworks running on different types of processors. Based on your interests, we can decide together which machine learning framework, application domain, or hardware infrastructure to focus on in your project or thesis.