Arithmetic Precision Reconfigurable Convolution Neural Network Accelerator

Shen E.H; Klopp J.P; SHAO-YI CHIEN

doi:10.1109/SiPS50750.2020.9195210

Journal

IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation

Journal Volume

2020-October

Date Issued

2020

Author(s)

Shen E.H

Klopp J.P

SHAO-YI CHIEN

DOI

10.1109/SiPS50750.2020.9195210

URI

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85096750939&doi=10.1109%2fSiPS50750.2020.9195210&partnerID=40&md5=04f6182d81a2926c5a9a858ba05b2c13

https://scholars.lib.ntu.edu.tw/handle/123456789/581096

Abstract

Deep neural networks have demonstrated unprecedented results on core AI and computer vision tasks. They are typically executed on general purpose GPUs with large form factors and high power consumption, unsuitable for mobile deployment. We present a VLSI architecture that is able to execute quantized, low precision convolution neural networks (CNNs). Compared to high precision, our approach significantly reduces power consumption from memory access and increases processing speed at limited area budget, making it particularly suitable for mobile applications. We propose a dataflow with high data reuse rate specially designed for quantized models. To fully utilize low precision data, we also design a microarchitecture for subword parallel computing of low bit-length data, an on-chip memory hierarchy and data realignment flow for power saving and avoiding buffer bank-conflicts, and finally a corresponding processing element (PE) array. The architecture is highly flexible to suit various CNNs and re-configurable for low bit-length quantized models. We have implemented the proposed VLSI architecture in the TSMC 90nm cell library. At a hardware cost of 180KB on-chip memory and 1,340k logic gate counts, the implementation result shows state-of-the-art hardware efficiency. © 2020 IEEE.
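
The abstract mentions subword parallel computing of low bit-length data but this record does not describe the circuit itself. As a rough software illustration of the general idea only (not the authors' microarchitecture), the sketch below treats one fixed-width word as several independent lanes whose bit-length is chosen at call time, and adds two such words while blocking carries at lane boundaries. The 16-bit word width and the names `pack`, `unpack`, and `subword_add` are assumptions made for this example.

```python
# Illustrative sketch: lane-wise addition inside one machine word.
# WORD_BITS and all function names are hypothetical, chosen for this example.
WORD_BITS = 16


def pack(values, bits):
    """Pack unsigned lane values into one word, lane 0 in the lowest bits."""
    assert WORD_BITS % bits == 0 and len(values) == WORD_BITS // bits
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << bits)
        word |= v << (i * bits)
    return word


def unpack(word, bits):
    """Split a word back into its unsigned lane values."""
    mask = (1 << bits) - 1
    return [(word >> (i * bits)) & mask for i in range(WORD_BITS // bits)]


def subword_add(a, b, bits):
    """Add two packed words lane by lane; each lane wraps modulo 2**bits."""
    lanes = WORD_BITS // bits
    # Mask selecting the most significant bit of every lane.
    high = 0
    for i in range(lanes):
        high |= 1 << (i * bits + bits - 1)
    low = ((1 << WORD_BITS) - 1) ^ high
    # Add the lower bits of each lane; the carry stops at the lane's top bit.
    partial = (a & low) + (b & low)
    # Fold in the top bits with XOR so no carry crosses into the next lane.
    return partial ^ ((a ^ b) & high)


# The same 16-bit adder reconfigured as four 4-bit lanes or two 8-bit lanes.
four_bit = unpack(subword_add(pack([15, 1, 7, 8], 4), pack([1, 2, 9, 8], 4), 4), 4)
eight_bit = unpack(subword_add(pack([200, 100], 8), pack([100, 100], 8), 8), 8)
```

Halving the lane width doubles the number of operations carried by the same word, which is the kind of gain a precision-reconfigurable datapath targets; a hardware design would apply the same principle to multiply-accumulate units rather than to a plain adder.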

Subjects

Budget control; Computation theory; Computer hardware; Convolution; Data flow analysis; Deep neural networks; Electric power utilization; Memory architecture; Network architecture; Program processors; Signal processing; Silicon compounds; VLSI circuits; Convolution neural network; Data realignment; General-purpose GPUs; High power consumption; Micro architectures; Mobile applications; Processing elements; VLSI architectures; Neural networks

SDGs

SDG7; SDG16

Type

conference paper
