MultiFuse: Efficient Cross Layer Fusion for DNN Accelerators with Multi-level Memory Hierarchy
Journal
Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
ISBN
9798350342918
Date Issued
2023-01-01
Author(s)
Abstract
To support the deployment of diverse deep learning models while maintaining scalability, modern DNN accelerators frequently employ reconfigurable structures such as Networks-on-Chip (NoC) and multi-level on-chip memory hierarchies. To achieve high energy efficiency, it is imperative to store intermediate DNN-layer results within the on-chip memory hierarchy, thereby reducing off-chip data transfers to and from DRAM. Two well-established optimization techniques, node fusion and loop tiling, have proven effective at retaining temporary results in on-chip buffers and are commonly used to minimize off-chip DRAM accesses. In this paper, we introduce MultiFuse, an infrastructure designed to automatically explore multi-layer node fusion strategies, enabling optimal utilization of the on-chip multi-level memory hierarchy. Experimental results demonstrate the effectiveness of our retargetable infrastructure, which outperforms Ansor's algorithm: our exploration algorithm achieves a 70% reduction in Energy-Delay Product (EDP) and a 67x speedup in search time when executing the data-intensive MobileNet model on a single DNN accelerator.
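To illustrate the core idea the abstract describes, the following is a minimal, hypothetical sketch (not the MultiFuse implementation): two stand-in DNN layers are fused and executed tile by tile, so each intermediate tile stays in a small "on-chip" buffer instead of being written to and re-read from "DRAM". The layer functions, tile size, and traffic counter are all illustrative assumptions.

```python
# Hypothetical sketch of cross-layer fusion with loop tiling.
# layer1/layer2, TILE, and the traffic counter are illustrative assumptions,
# not part of the MultiFuse system described in the paper.

TILE = 4  # assumed on-chip buffer capacity, in elements

def layer1(x):  # stand-in for one DNN layer
    return [v * 2 for v in x]

def layer2(x):  # stand-in for the next DNN layer
    return [v + 1 for v in x]

def unfused(inputs):
    """Layer-by-layer execution: the full intermediate round-trips off-chip."""
    intermediate = layer1(inputs)          # entire intermediate materialized
    dram_traffic = 2 * len(intermediate)   # write + read of the intermediate
    return layer2(intermediate), dram_traffic

def fused(inputs):
    """Tiled cross-layer fusion: each tile flows through both layers on-chip."""
    out, dram_traffic = [], 0              # intermediate never leaves the buffer
    for i in range(0, len(inputs), TILE):
        tile = inputs[i:i + TILE]          # tile fits in the on-chip buffer
        out.extend(layer2(layer1(tile)))   # no off-chip round-trip per tile
    return out, dram_traffic

x = list(range(16))
y_unfused, traffic_unfused = unfused(x)
y_fused, traffic_fused = fused(x)
assert y_unfused == y_fused                # fusion preserves the result
print(traffic_unfused, traffic_fused)      # 32 vs. 0 intermediate transfers
```

The sketch only counts traffic for the intermediate tensor; a real accelerator model, like the one the paper targets, would also account for input/output transfers, the NoC, and each level of the memory hierarchy when searching for the best fusion and tiling choice.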
Subjects
DNN | DNN Compilers | Hardware Accelerators | Node Fusion
Type
conference paper
