Toward Fast Platform-Aware Neural Architecture Search for FPGA-Accelerated Edge AI Applications

Liang Y.-C;Liao Y.-C;Lin C.-C;Hung S.-H.

Title:	Toward Fast Platform-Aware Neural Architecture Search for FPGA-Accelerated Edge AI Applications
Authors:	Liang Y.-C.; Liao Y.-C.; Lin C.-C.; Hung S.-H.
Keywords:	AI; Deep Learning; Edge Computing; FPGA; OpenVINO; GPU; Neural Architecture Search; Performance Evaluation; Reinforcement learning
Issue Date:	2020
Start page/Pages:	219-225
Source:	ACM International Conference Proceeding Series
Abstract:	Neural Architecture Search (NAS) is a technique for finding suitable neural network architectures for a given application. Such search methods have usually been based on reinforcement learning, with a recurrent neural network generating the candidate models. However, most NAS methods aim to find a set of candidates with the best cost-performance ratios, e.g. high accuracy and low computing time, based on rough estimates derived generically from the workload. Because today's deep learning chips accelerate neural network operations with a variety of hardware techniques, such as vector units and low-precision data formats, metrics estimated from generic computing operations such as floating-point operation counts (FLOPs) can differ greatly from the actual latency, throughput, and power consumption, which are highly sensitive to the hardware design and even to the software optimization in edge AI applications. Thus, instead of spending a long time repeatedly picking and training so-called good candidates based on unreliable estimates, we propose a NAS framework that accelerates the search by including actual performance measurements in the search process. Including actual measurements lets the proposed framework select candidates based on correct information, which reduces the chance of selecting unsuitable candidates and wasting search time on them. To illustrate the effectiveness of our framework, we prototyped it to work with Intel OpenVINO and Field Programmable Gate Arrays (FPGA) to meet the accuracy and latency required by the user. The framework takes the dataset and the accuracy and latency requirements from the user and automatically searches for candidates that meet the requirements. Case studies and experimental results are presented in this paper to evaluate the effectiveness of our framework for edge AI applications in real-time image classification. © 2020 ACM.
URI:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85097365651&doi=10.1145%2f3400286.3418240&partnerID=40&md5=8edb3f0168591f9016bb35cc3c5009b3 https://scholars.lib.ntu.edu.tw/handle/123456789/581432
DOI:	10.1145/3400286.3418240
SDG/Keyword:	Application programs; Field programmable gate arrays (FPGA); Green computing; Network architecture; Reinforcement learning; Actual measurements; AI applications; Cost-Performance ratio; Network operations; Neural architectures; Neural network model; Performance measurements; Software optimization; Recurrent neural networks
Appears in Collections:	資訊工程學系
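
The measurement-in-the-loop idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a reinforcement-learning search with a recurrent controller, whereas this sketch swaps in plain random sampling to stay short. The names `Candidate`, `search`, `measure_latency_ms`, and `estimate_accuracy` are hypothetical, and so is the two-parameter search space. In a real setup, `measure_latency_ms` would presumably wrap an on-device benchmark of the compiled model, and `estimate_accuracy` would wrap training and validation.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical architecture description (not from the paper)."""
    depth: int  # number of layers
    width: int  # channels per layer


def search(
    measure_latency_ms: Callable[[Candidate], float],
    estimate_accuracy: Callable[[Candidate], float],
    min_accuracy: float,
    max_latency_ms: float,
    trials: int = 50,
    seed: int = 0,
) -> List[Tuple[Candidate, float, float]]:
    """Return (candidate, accuracy, latency) tuples that meet both requirements.

    Latency is measured first, so candidates that miss the latency budget
    are discarded before any (expensive) accuracy evaluation is run.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(trials):
        # Random sampling stands in for the RL controller (assumption).
        cand = Candidate(
            depth=rng.choice([4, 8, 12, 16]),
            width=rng.choice([16, 32, 64, 128]),
        )
        latency = measure_latency_ms(cand)
        if latency > max_latency_ms:
            continue  # too slow on the target: skip training entirely
        accuracy = estimate_accuracy(cand)
        if accuracy >= min_accuracy:
            accepted.append((cand, accuracy, latency))
    return accepted
```

The design choice shown here is the ordering: the measured latency acts as a gate before accuracy evaluation, which is one plausible way to avoid spending search time on candidates that cannot meet the user's latency requirement.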
