PROPOSAL

Analysis of Predicting GPU Utilization Metrics Using Deep Learning Models and Inferring the Number of Required SMs


Supervisors: Pınar Tözün, Ehsan Yousefzadeh-Asl-Miandoab
Semester: Spring 2024
Tags: machine learning systems, GPU utilization, resource management, resource interference

GPUs offer massive computational power and parallelism through their Streaming Multiprocessors (SMs). Efficient GPU utilization is critical for maximizing performance and making good use of compute resources. Utilization is measured with metrics such as SMACT (SM Activity), SMOCC (SM Occupancy), and DRAMA (DRAM Active). These metrics provide insight into how effectively the GPU's SMs and memory are used during deep learning tasks. Currently, monitoring tools report these metrics at runtime. This project explores predicting GPU utilization metrics with deep learning models trained on a large dataset of models executed on NVIDIA A100 40GB GPUs. Furthermore, we will investigate whether these predictions can be used to estimate the number of SMs in use and the overall GPU performance by considering the three metrics together.
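
As a rough illustration of the last step, the sketch below turns a predicted SMACT value into an estimated SM count. It is a minimal sketch under stated assumptions, not the method the project will necessarily adopt: it assumes SMACT is given as a fraction in [0, 1], that the target GPU is an A100 with 108 SMs, and that SMACT multiplied by the SM count is a usable proxy for the number of SMs needed. Because SMACT averages over both time and SMs, the same value can also mean that all SMs were active for part of the time, so the result is best read as a heuristic. The function name `estimate_required_sms` is hypothetical.

```python
import math

# Number of SMs on an NVIDIA A100 (assumed target GPU for this sketch).
A100_SM_COUNT = 108


def estimate_required_sms(smact: float, total_sms: int = A100_SM_COUNT) -> int:
    """Heuristic: map a predicted SMACT fraction to an SM count.

    Assumes SMACT * total_sms approximates the number of busy SMs.
    This is an illustrative assumption, not an established rule.
    """
    if not 0.0 <= smact <= 1.0:
        raise ValueError("smact must be a fraction in [0, 1]")
    # Round first to avoid floating-point noise pushing ceil() one SM too high.
    estimate = math.ceil(round(smact * total_sms, 6))
    return min(total_sms, estimate)


if __name__ == "__main__":
    for predicted in (0.25, 0.5, 1.0):
        print(predicted, estimate_required_sms(predicted))
```

A fuller estimate would combine SMACT with SMOCC and DRAMA, as the proposal suggests; this sketch uses SMACT alone to keep the idea visible.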

This project would be suitable as a BSc or MSc thesis as well as a standalone project at ITU during Spring 2025. If you are interested in machine learning systems and their efficiency in general, this project would be a great fit for you.