A 16nm Fully Integrated SoC for Hardware-Aware Neural Architecture Search

Authors: Lin, Yu-Cheng; Huang, Ming-Shan; Wang, Jeng-Bang; Chen, Wen-Ching; Chang, Nian-Shyang; Lin, Chun-Pin; Chen, Chi-Shi; Chiueh, Tzi-Dar; Yang, Chia-Hsiang

Abstract: Neural architecture search (NAS) is a technique that can automatically design and optimize neural network architectures. It aims to find a better balance between AI performance and hardware efficiency, at the cost of excessively high computational complexity. This work presents the first fully integrated system-on-chip (SoC) specialized for accelerating hardware-aware NAS. The SoC enables efficient exploration of diverse network architectures in the accuracy-latency space. It supports commonly used networks, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers. Fabricated in 16 nm FinFET, the chip dissipates 255 mW at a clock frequency of 500 MHz from a 0.8 V supply. Compared to an NVIDIA A40 GPU, this work achieves a 27× speedup at a 2.6× lower clock frequency, with 1176× less power and 166× smaller silicon area.

Type: conference paper
Dates: 2026-01-27, 2026-01-27, 2025-09-08
ISBN: 9798331525392
ISSN: 1930-8833
DOI: 10.1109/esserc66193.2025.11213948
Scopus ID: 2-s2.0-105024546249
Scopus record: https://www.scopus.com/record/display.uri?eid=2-s2.0-105024546249&origin=resultslist
Repository handle: https://scholars.lib.ntu.edu.tw/handle/123456789/735613