PROPOSAL
Resource Management for Data-Intensive Systems Running on Heterogeneous Hardware
The variety and complexity of data-intensive applications and systems have increased drastically over the past decade. Tasks from a SQL-based big data analytics request running on Apache Spark can be very different from tasks from deep learning training using the TensorFlow framework. Nevertheless, these data-intensive applications increasingly run on shared hardware resources in data centers or on high-performance computing (HPC) platforms. These hardware resources are also diverse today, ranging from general-purpose CPUs to GPUs to programmable FPGAs. There is a pressing need for a more resource-aware infrastructure that effectively orchestrates the different data-intensive tasks across these heterogeneous processing units.
In this project, our goal is to investigate novel techniques for collaborative scheduling of concurrent data-intensive tasks onto the resources of modern server hardware. Based on your interests and the availability of hardware resources, we can determine the specific subset of data-intensive systems and hardware to focus on in your project or thesis.