Tagged with: checkpointing


PROPOSAL

Deep learning changed the landscape of many applications like computer vision, natural language processing, etc. On the other hand, deep learning require gigantic computing power offered by modern hardware. As a result data scientists rely on powerful hardware resources offered by shared high-performance computing (HPC) clusters or the cloud. Due to the long-running times of deep learning …
Supervisors: Pınar Tözün, Ehsan Yousefzadeh-Asl-Miandoab
Semester: Fall 2024
Tags: machine learning systems, checkpointing, scheduling, resource management