PUMP: Profiling-free Unified Memory Prefetcher for Large DNN Model Support

Lin C.-H; Lin S.-F; Chen Y.-J; Jenp E.-Y; Yang C.-L.

doi:10.1109/ASP-DAC52403.2022.9712507

Journal

Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

Journal Volume

2022-January

Pages

122-127

Date Issued

2022

Author(s)

Lin C.-H

Lin S.-F

Chen Y.-J

Jenp E.-Y

Yang C.-L.

DOI

10.1109/ASP-DAC52403.2022.9712507

URI

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85126087424&doi=10.1109%2fASP-DAC52403.2022.9712507&partnerID=40&md5=49370ca6ef04b606e489861a73121eee

https://scholars.lib.ntu.edu.tw/handle/123456789/632187

Abstract

Modern DNNs are growing deeper and wider to achieve higher accuracy. However, existing deep learning frameworks require the whole DNN model to fit into GPU memory when training on GPUs, which limits the size of the models that can be trained. NVIDIA Unified Memory (UM) inherently supports training DNN models that exceed GPU memory capacity, but naively adopting UM incurs a significant performance penalty due to data-transfer delay. In this paper, we propose PUMP, a Profiling-free Unified Memory Prefetcher. PUMP exploits GPU asynchronous execution for prefetching: there is a delay between the time the CPU launches a kernel and the time the kernel executes on the GPU. When a kernel is launched, PUMP extracts the memory blocks the kernel will access and swaps these blocks into GPU memory. Experimental results show that PUMP achieves about 2x speedup on average compared to a baseline that naively enables UM. © 2022 IEEE.
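
The timing argument in the abstract can be illustrated with a small toy model. The sketch below is not the paper's implementation: the `Kernel` class, the `simulate` function, the single serial copy engine, the absence of eviction, and all timing values are assumptions made up for illustration. It only shows why starting a transfer at launch time, instead of at first touch, lets data movement overlap with the execution of earlier kernels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    compute_ms: float  # run time once all of the kernel's data is on the GPU
    blocks: tuple      # IDs of the memory blocks the kernel accesses


def simulate(kernels, launch_ms, transfer_ms, prefetch):
    """Return the time (ms) at which the last kernel finishes.

    Toy assumptions: the CPU launches kernels back to back without waiting
    for the GPU, one copy engine moves one block at a time, every block
    starts in host memory, and nothing is ever evicted.
    """
    resident_at = {}  # block ID -> time the block becomes resident on the GPU
    copy_free = 0.0   # time at which the copy engine is next idle
    gpu_free = 0.0    # time at which the previous kernel finishes
    for i, k in enumerate(kernels):
        launched = (i + 1) * launch_ms  # CPU-side launch time of kernel i
        start = max(gpu_free, launched)
        if prefetch:
            # Queue the kernel's blocks for transfer as soon as it is launched.
            for b in k.blocks:
                if b not in resident_at:
                    copy_free = max(copy_free, launched) + transfer_ms
                    resident_at[b] = copy_free
            # The kernel runs once the GPU is free and its blocks have arrived.
            start = max([start] + [resident_at[b] for b in k.blocks])
        else:
            # On-demand migration: the kernel stalls on each missing block.
            for b in k.blocks:
                if b not in resident_at:
                    start = max(start, copy_free) + transfer_ms
                    copy_free = start
                    resident_at[b] = start
        gpu_free = start + k.compute_ms
    return gpu_free


# Four hypothetical kernels, each touching its own block.
kernels = [Kernel(compute_ms=4, blocks=(i,)) for i in range(4)]
on_demand = simulate(kernels, launch_ms=1, transfer_ms=3, prefetch=False)
prefetched = simulate(kernels, launch_ms=1, transfer_ms=3, prefetch=True)
print(on_demand, prefetched)  # 29 20
```

With these made-up numbers, only the first transfer stays exposed when prefetching; every later transfer completes while an earlier kernel is still running. The resulting ratio is a property of the chosen parameters and says nothing about the speedup measured in the paper.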

SDGs

SDG7

Other Subjects

Data transfer; Deep learning; Graphics processing unit; Program processors; Pumps; Asynchronous executions; High-accuracy; Large models; Learning frameworks; Memory blocks; Memory capacity; Performance penalties; Prefetches; Memory architecture

Type

conference paper
