Wu, C.-E.; Lee, J.-H.; Wan, T.S.T.; Chan, Y.-M.; Chen, C.-S. (2020). Merging Well-Trained Deep CNN Models for Efficient Inference. Conference paper.

Record added: 2021-09-02
Scopus: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85100946038&partnerID=40&md5=2e05b66345f9934578958b303633eac8
Repository: https://scholars.lib.ntu.edu.tw/handle/123456789/581332
Scopus EID: 2-s2.0-85100946038

Abstract: In signal processing applications, more than one task often has to be integrated into a system, so deep learning models (such as convolutional neural networks) serving multiple purposes must be executed simultaneously. When multiple well-trained models are deployed to an application system, running them side by side is inefficient due to their collective computational load. Hence, merging the models into a more compact one is often required so that they can be executed efficiently on resource-limited devices. We introduce an approach that, at the inference stage, fuses two or more well-trained deep neural-network models into a single condensed model. The proposed approach consists of three phases: Filter Alignment, Shared-weight Initialization, and Model Calibration. It can merge well-trained feed-forward neural networks of the same architecture into a single network, reducing online storage and inference time. Experimental results show that our approach improves the run-time memory compression ratio and increases execution speed. © 2020 APSIPA.

Keywords: Convolutional neural networks; Deep learning; Deep neural networks; Merging; Network architecture; Signal processing; Application systems; Computational speed; Memory compression; Model calibration; Neural network model; Resource-limited devices; Signal processing applications; Weight initialization; Feedforward neural networks
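The first two phases named in the abstract can be illustrated with a minimal sketch. The helper names (`align_filters`, `shared_weight_init`) and the greedy cosine-similarity matching are assumptions for illustration, not the paper's actual algorithm; the Model Calibration phase (retraining the merged weights on data) is omitted here.

```python
import math

def cosine(u, v):
    # Cosine similarity between two flattened filters.
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def align_filters(filters_a, filters_b):
    # Filter Alignment (illustrative): greedily permute model B's filters
    # so each is paired with its most similar counterpart in model A.
    remaining = list(range(len(filters_b)))
    order = []
    for fa in filters_a:
        best = max(remaining, key=lambda j: cosine(fa, filters_b[j]))
        order.append(best)
        remaining.remove(best)
    return [filters_b[j] for j in order]

def shared_weight_init(filters_a, filters_b_aligned):
    # Shared-weight Initialization (illustrative): start the merged layer
    # from the element-wise mean of each aligned filter pair.
    return [[(a + b) / 2 for a, b in zip(fa, fb)]
            for fa, fb in zip(filters_a, filters_b_aligned)]

# Two toy models with the same architecture: one conv layer,
# two flattened filters each (model B's filters are permuted).
A = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
B = [[0.0, 0.9, 0.1, 0.0], [0.9, 0.0, 0.0, 0.1]]

B_aligned = align_filters(A, B)
merged = shared_weight_init(A, B_aligned)
# merged holds one set of shared filters replacing both models' layers;
# a calibration (fine-tuning) pass would then recover accuracy.
```

In this toy case the alignment recovers the permutation (B's second filter is matched to A's first), and the merged layer stores two filters instead of four, which is the source of the storage and inference savings the abstract describes.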