Title: Pruning Depthwise Separable Convolutions for MobileNet Compression
Authors: Tu, C.-H.; Lee, J.-H.; Chan, Y.-M.; Chen, Chu-Song
Type: conference paper
Date issued: 2020
Date available: 2021-09-02
DOI: 10.1109/IJCNN48605.2020.9207259
Scopus ID: 2-s2.0-85093833376
Scopus URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85093833376&doi=10.1109%2fIJCNN48605.2020.9207259&partnerID=40&md5=e270ed0795c234ceff3504717b4c4416
Handle: https://scholars.lib.ntu.edu.tw/handle/123456789/581337
Keywords: Convolution; Deep neural networks; Multi stage; Network weights; Speed up; Structural constraints; Two directions; Convolutional neural networks

Abstract: Deep convolutional neural networks achieve high accuracy but are inefficient at inference. To improve inference speed, two directions have been explored: lightweight model design and network weight pruning. Lightweight models improve speed while maintaining acceptable accuracy; it is not trivial, however, whether such 'compact' models can be sped up further by weight pruning. In this paper, we present a technique that gradually prunes depthwise separable convolution networks, such as MobileNet, to improve the speed of this kind of 'dense' network. When pruning depthwise separable convolutions, additional structural constraints must be considered to ensure an actual inference speedup. Instead of pruning the model to the desired ratio in a single stage, the proposed multi-stage gradual pruning approach stably prunes the filters with a finer pruning ratio per stage. Our method achieves a satisfactory speedup with little accuracy drop for MobileNets. Code is available at https://github.com/ivclab/Multistage-Pruning. © 2020 IEEE.
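Note: The abstract describes the approach only at a high level. As a rough illustration of the two ideas it names (the structural constraint coupling a depthwise filter to the matching pointwise input channel, and pruning in several small stages rather than one), the sketch below is a minimal PyTorch reconstruction. It is not the authors' released code (see the GitHub link above); the names `prune_stage` and `l1_channel_scores`, the L1 ranking criterion, and the per-stage schedule are all assumptions made for illustration.

    # Minimal sketch of multi-stage channel pruning for a depthwise
    # separable block, assuming PyTorch. Illustrative only; not the
    # paper's method verbatim.
    import torch
    import torch.nn as nn

    class DepthwiseSeparable(nn.Module):
        """MobileNet-style block: depthwise 3x3 conv, then pointwise 1x1 conv."""
        def __init__(self, channels: int, out_channels: int):
            super().__init__()
            self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                       groups=channels, bias=False)
            self.pointwise = nn.Conv2d(channels, out_channels, 1, bias=False)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    def l1_channel_scores(conv: nn.Conv2d) -> torch.Tensor:
        # Rank the pointwise conv's input channels by the L1 norm of the
        # weights that consume them (assumed criterion, not from the paper).
        return conv.weight.abs().sum(dim=(0, 2, 3))

    def prune_stage(block: DepthwiseSeparable, keep_ratio: float) -> DepthwiseSeparable:
        # Structural constraint: dropping channel c must remove the c-th
        # depthwise filter AND the c-th input slice of the pointwise conv
        # together, so the pruned block remains a dense, regular network.
        scores = l1_channel_scores(block.pointwise)
        n_keep = max(1, int(scores.numel() * keep_ratio))
        keep = torch.topk(scores, n_keep).indices.sort().values

        pruned = DepthwiseSeparable(n_keep, block.pointwise.out_channels)
        with torch.no_grad():
            pruned.depthwise.weight.copy_(block.depthwise.weight[keep])
            pruned.pointwise.weight.copy_(block.pointwise.weight[:, keep])
        return pruned

    # Multi-stage gradual pruning: several small cuts (with fine-tuning
    # between stages, omitted here) instead of one cut to the target ratio.
    block = DepthwiseSeparable(64, 128)
    for stage_keep in (0.9, 0.8, 0.7):   # hypothetical per-stage schedule
        block = prune_stage(block, stage_keep)
        # ... fine-tune the full model here before the next stage ...
    print(block.depthwise.weight.shape)  # fewer depthwise filters remain

Because each stage applies its keep ratio to the current channel count, the schedule compounds, which is one plausible reading of pruning "with a finer pruning ratio" per stage.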