T-EAP: Trainable Energy-Aware Pruning for NVM-based Computing-in-Memory Architecture
Journal
Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
ISBN
9781665409964
Date Issued
2022-01-01
Author(s)
Abstract
Convolutional neural networks (CNNs) deliver outstanding performance in many applications, but their energy consumption for inference is enormous. Computing-in-memory architectures based on embedded nonvolatile memory (NVM-CIM) have emerged to improve the energy efficiency of CNNs. NVM crossbar-aware pruning has recently been studied extensively; however, directly incorporating energy estimation into sparse learning has not been well explored. In this paper, we propose, for the first time, T-EAP, a trainable energy-aware pruning method that closes the gap between pruning policy and energy optimization for NVM-CIM. Specifically, T-EAP improves the energy-accuracy trade-off by removing redundant weight groups that consume significant energy. Moreover, its trainable thresholds enable end-to-end sparse learning without a laborious train-prune-retrain process. Experimental results based on NeuroSim, a circuit-level simulator for CIM systems, show that, compared with prior work, T-EAP maintains accuracy while reducing energy consumption by up to 26.5% and 22.7% for VGG-8 and ResNet-20, respectively. We also provide a layer-wise analysis of the energy savings to validate the effectiveness of T-EAP.
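The idea of removing redundant weight groups that consume significant energy can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the scoring rule, the group layout, or how the thresholds are trained. The sketch assumes a hypothetical score (group L2 norm divided by an estimated per-group energy), made-up energy values, and a fixed threshold in place of a trained one; all function names are illustrative.

```python
import math

def group_norms(weights, group_size):
    """Split a flat weight list into consecutive groups; return (groups, L2 norms)."""
    groups = [weights[i:i + group_size] for i in range(0, len(weights), group_size)]
    norms = [math.sqrt(sum(w * w for w in g)) for g in groups]
    return groups, norms

def energy_aware_mask(norms, energies, threshold):
    """Keep a group (1) only if its norm per unit of estimated energy exceeds the threshold.

    Assumption: score = norm / energy. The source does not specify this rule.
    """
    return [1 if n / e > threshold else 0 for n, e in zip(norms, energies)]

def apply_mask(groups, mask):
    """Zero out every weight in a pruned group."""
    return [[w * m for w in g] for g, m in zip(groups, mask)]

def energy_saved_fraction(energies, mask):
    """Fraction of the total estimated energy attributed to pruned groups."""
    pruned = sum(e for e, m in zip(energies, mask) if m == 0)
    return pruned / sum(energies)

# Hypothetical data: 8 weights in groups of 2, with made-up per-group energy estimates.
weights = [0.9, -0.8, 0.05, 0.02, 0.4, 0.3, -0.01, 0.03]
energies = [1.0, 2.0, 4.0, 1.0]

groups, norms = group_norms(weights, group_size=2)
mask = energy_aware_mask(norms, energies, threshold=0.1)  # fixed here, trainable in T-EAP
pruned = apply_mask(groups, mask)
saved = energy_saved_fraction(energies, mask)
```

In this toy setting, a low-magnitude group with a high energy estimate is pruned before an equally small but cheaper group would be. In the method described above, the threshold would be learned jointly with the weights rather than set by hand, and the energy figures would come from a simulator such as NeuroSim.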
Subjects
Computing-in-memory | deep learning accelerator | embedded nonvolatile memory | energy consumption | pruning
Type
conference paper