Pruning Depthwise Separable Convolutions for MobileNet Compression
Journal
Proceedings of the International Joint Conference on Neural Networks
Date Issued
2020
Author(s)
Abstract
Deep convolutional neural networks achieve high accuracy but often at the cost of efficiency. To improve inference speed, two directions have been explored: lightweight model design and network weight pruning. Lightweight models improve speed while maintaining acceptable accuracy. It is, however, not obvious whether these 'compact' models can be sped up further by weight pruning. In this paper, we present a technique to gradually prune depthwise separable convolution networks, such as MobileNet, to improve the speed of this kind of 'dense' network. When pruning depthwise separable convolutions, additional structural constraints must be considered to ensure an actual inference speedup. Instead of pruning the model to the desired ratio in a single stage, the proposed multi-stage gradual pruning approach stably prunes filters at a finer pruning ratio. Our method achieves a satisfactory speedup with little accuracy drop for MobileNets. Code is available at https://github.com/ivclab/Multistage-Pruning. © 2020 IEEE.
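The sketch below illustrates the two ideas the abstract names: the structural constraint that depthwise convolutions impose on channel pruning, and a multi-stage schedule that prunes a small fraction per stage rather than all at once. It is a minimal PyTorch sketch under assumptions, not the paper's released implementation: it assumes a MobileNetV2-style block (1x1 expand, 3x3 depthwise, 1x1 project) and L1-norm filter importance, and the names build_block and prune_stage are illustrative.

```python
import torch
import torch.nn as nn

def build_block(in_ch, mid_ch, out_ch):
    # MobileNetV2-style separable block: 1x1 expand, 3x3 depthwise, 1x1 project.
    return nn.Sequential(
        nn.Conv2d(in_ch, mid_ch, 1, bias=False),               # pointwise expand
        nn.Conv2d(mid_ch, mid_ch, 3, padding=1,
                  groups=mid_ch, bias=False),                   # depthwise
        nn.Conv2d(mid_ch, out_ch, 1, bias=False),               # pointwise project
    )

def prune_stage(block, keep_ratio):
    # Score middle channels by the L1 norm of the expand filters, keep the
    # top fraction, and rebuild a smaller block. Structural constraint: the
    # depthwise layer has groups == channels, so removing an expand filter
    # forces removing the depthwise filter with the same index.
    pw1, dw, pw2 = block
    importance = pw1.weight.abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(importance.numel() * keep_ratio))
    keep = importance.topk(n_keep).indices.sort().values
    new = build_block(pw1.in_channels, n_keep, pw2.out_channels)
    with torch.no_grad():
        new[0].weight.copy_(pw1.weight[keep])        # drop pruned expand filters
        new[1].weight.copy_(dw.weight[keep])         # matching depthwise filters
        new[2].weight.copy_(pw2.weight[:, keep])     # drop pruned input channels
    return new

block = build_block(32, 64, 128)
for stage, ratio in enumerate([0.75, 0.75, 0.75]):   # three gradual stages
    block = prune_stage(block, ratio)
    # ... fine-tune `block` on the training set before the next stage ...
    print(f"stage {stage}: middle channels = {block[1].in_channels}")
```

Because each stage rebuilds physically smaller dense convolutions rather than zeroing weights, the pruned block yields a real inference speedup; the per-stage loop with fine-tuning in between mirrors the gradual, finer-ratio schedule described in the abstract.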
Subjects
Convolution; Deep neural networks; Multi-stage; Network weights; Speedup; Structural constraints; Two directions; Convolutional neural networks
Type
conference paper