A Multiplier-Less Convolutional Neural Network Inference Accelerator for Intelligent Edge Devices

Hsieh M.-H.; Liu Y.-T.; Chiueh T.-D.

DC Field	Value	Language
dc.contributor.author	Hsieh M.-H	en_US
dc.contributor.author	Liu Y.-T	en_US
dc.contributor.author	Chiueh T.-D.	en_US
dc.contributor.author	TZI-DAR CHIUEH	zz
dc.creator	Hsieh M.-H;Liu Y.-T;Chiueh T.-D.	-
dc.date.accessioned	2022-04-25T06:43:21Z	-
dc.date.available	2022-04-25T06:43:21Z	-
dc.date.issued	2021	-
dc.identifier.issn	21563357	-
dc.identifier.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85118637908&doi=10.1109%2fJETCAS.2021.3116044&partnerID=40&md5=e236c839c3dfdf42880fa626eca1c0ff	-
dc.identifier.uri	https://scholars.lib.ntu.edu.tw/handle/123456789/607332	-
dc.description.abstract	As the demand for neural network operations on edge devices increases, energy-efficient neural network inference solutions become necessary. To this end, this paper proposes a compact 4-bit number format (SD4) for neural network weights. In addition to significantly reducing the amount of neural network data transmission, SD4 also reduces the neural network convolution operation from multiplication and addition (MAC) to only addition. MNIST and CIFAR-10 CNNs with SD4 weights achieve results similar to their FP32-trained counterparts. The difference between the top-1 accuracy of 4-bit ResNet CNN for ImageNet and the baseline FP32 CNN is less than 0.5%. In the hardware design, we have implemented a multiplier-less convolution acceleration circuit. Compared with the 8-bit weight circuit, the power consumption and area of a 4-bit 3 × 3 convolution circuit are reduced by nearly 50%. This work also proposes a systematic CNN deployment solution consisting of software CNN training and hardware acceleration. The proposed FPGA-based accelerator for VGG7 image classification achieves a peak throughput of 345.6 GOPS when running at a 100-MHz clock rate. The proposed convolution accelerator's power consumption and energy efficiency are 1.19 W and 289.5 GOPS/W, respectively. Compared to the CPU implementation of VGG7-128 inference, the multiplier-less acceleration circuit is 4.8 times faster and achieves 384 times higher energy efficiency. © 2011 IEEE.	-
dc.relation.ispartof	IEEE Journal on Emerging and Selected Topics in Circuits and Systems	-
dc.subject	Convolutional neural networks (CNNs)	-
dc.subject	FPGA	-
dc.subject	inference acceleration	-
dc.subject	multiplier-less	-
dc.subject	Acceleration	-
dc.subject	Codes (symbols)	-
dc.subject	Computer graphics	-
dc.subject	Convolution	-
dc.subject	Electric power utilization	-
dc.subject	Energy efficiency	-
dc.subject	Network coding	-
dc.subject	Neural networks	-
dc.subject	Neurons	-
dc.subject	Program processors	-
dc.subject	Quantization (signal)	-
dc.subject	Convolutional neural network	-
dc.subject	Encodings	-
dc.subject	Graphic processing unit	-
dc.subject	Graphics processing	-
dc.subject	Inference acceleration	-
dc.subject	Multiplierless	-
dc.subject	Processing units	-
dc.subject	Field programmable gate arrays (FPGA)	-
dc.subject.classification	[SDGs]SDG7	-
dc.title	A Multiplier-Less Convolutional Neural Network Inference Accelerator for Intelligent Edge Devices	en_US
dc.type	journal article	en
dc.identifier.doi	10.1109/JETCAS.2021.3116044	-
dc.identifier.scopus	2-s2.0-85118637908	-
dc.relation.pages	739-750	-
dc.relation.journalvolume	11	-
dc.relation.journalissue	4	-
item.cerifentitytype	Publications	-
item.fulltext	no fulltext	-
item.openairecristype	http://purl.org/coar/resource_type/c_6501	-
item.openairetype	journal article	-
item.grantfulltext	none	-
crisitem.author.dept	Electrical Engineering	-
crisitem.author.dept	Electronics Engineering	-
crisitem.author.dept	MediaTek-NTU Research Center	-
crisitem.author.dept	Graduate School of Advanced Technology	-
crisitem.author.orcid	0000-0003-0851-6629	-
crisitem.author.parentorg	College of Electrical Engineering and Computer Science	-
crisitem.author.parentorg	College of Electrical Engineering and Computer Science	-
crisitem.author.parentorg	Others: University-Level Research Centers	-
crisitem.author.parentorg	National Taiwan University	-
Appears in Collections:	Department of Electrical Engineering
