https://scholars.lib.ntu.edu.tw/handle/123456789/581419
Title: | Exploiting Data Entropy for Neural Network Compression | Authors: | Chen T.-W.; Pangfeng Liu; Wu J.-J. |
Keywords: | CNN; Entropy; Filter Pruning; Machine Learning; Model Compression | Publication Date: | 2020 | Pages: | 5007-5016 | Source Publication: | Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020 | Abstract: | Convolutional neural networks (CNNs) achieve tremendous success in computer vision. However, due to the increasing number of parameters and the limitations of hardware and software resources, model compression has become an important issue: we must reduce the size of CNNs and improve their training and inference speed. This paper focuses on channel pruning, a model compression technique that evaluates the importance of channels within a convolution layer and prunes away the less important ones. In this paper, we propose a mutual information metric to prune the network. By measuring the entropy of feature maps, we can estimate how much information passes through each channel during label recognition and prune away the channels that carry the least. We compute the mutual information between feature maps and labels, which is the only information relevant to label classification. We also propose a weighted mutual information metric that further improves accuracy. We observe in our experiments that the weighted mutual information metric achieves better accuracy than the classic L1-norm metric [1] and the original entropy metric [2]. We also discover that the classic L1-norm pruning metric can be improved by computing the L1-norm of output filter weights (denoted as output L1) instead of input filter weights (denoted as input L1). We test our channel pruning algorithms on the SVHN, CIFAR-10, and CIFAR-100 datasets using Simplenet [3]. When we prune away 70% of the parameters in all convolution layers, our weighted mutual information method achieves 1.52%, 13.24%, and 7.90% higher accuracy than the output L1 metric on these three datasets.
In the global pruning experiment, our weighted mutual information metric is about 2% more accurate than the output L1 metric when we remove 55% of the parameters on the SVHN dataset. On the CIFAR-100 dataset, our metric is 1.5% more accurate than the output L1 metric when only 53% of the parameters remain. The only exception is the CIFAR-10 dataset, where our metric is 5% less accurate than the output L1 metric when 40% of the parameters remain. © 2020 IEEE. |
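The core idea in the abstract — score each channel by the mutual information between its feature-map activations and the class labels, then prune the lowest-scoring channels — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the spatial-mean summary, the histogram-based MI estimator, the bin count, and the function names are all assumptions for illustration, and the weighted variant from the paper is omitted.

```python
import numpy as np

def channel_mutual_information(features, labels, n_bins=16):
    """Estimate I(A_c; Y) for each channel c, where A_c is the channel's
    (quantized) mean activation and Y is the label. Low-MI channels carry
    little label-relevant information and are pruning candidates.

    features: (N, C, H, W) feature maps; labels: (N,) integer labels.
    All estimator details here (spatial mean, 16 bins) are illustrative.
    """
    n, c = features.shape[0], features.shape[1]
    # Summarize each channel's feature map by its spatial mean activation.
    acts = features.reshape(n, c, -1).mean(axis=2)  # shape (N, C)
    n_labels = int(labels.max()) + 1
    scores = np.empty(c)
    for ch in range(c):
        # Quantize activations into n_bins so we can estimate discrete MI.
        edges = np.histogram_bin_edges(acts[:, ch], bins=n_bins)
        digitized = np.digitize(acts[:, ch], edges[1:-1])  # values 0..n_bins-1
        # Empirical joint distribution over (activation bin, label).
        joint = np.zeros((n_bins, n_labels))
        for b, y in zip(digitized, labels):
            joint[b, y] += 1
        joint /= n
        p_a = joint.sum(axis=1, keepdims=True)  # marginal over bins
        p_y = joint.sum(axis=0, keepdims=True)  # marginal over labels
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(joint > 0, joint / (p_a * p_y), 1.0)
        # Terms with joint == 0 contribute 0 because log(1) = 0.
        scores[ch] = np.sum(joint * np.log(ratio))
    return scores

def prune_least_informative(features, labels, keep_ratio=0.3):
    """Return sorted indices of the highest-MI channels to keep."""
    scores = channel_mutual_information(features, labels)
    k = max(1, int(round(keep_ratio * len(scores))))
    return np.sort(np.argsort(scores)[::-1][:k])
```

In a real pruning pipeline these scores would be computed from feature maps collected on a held-out batch, and the surviving channel indices would be used to slice the convolution weights of that layer and the next.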
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85103844415&doi=10.1109%2fBigData50022.2020.9378489&partnerID=40&md5=15ea2425586d1cbd09bde07ed15b4393 https://scholars.lib.ntu.edu.tw/handle/123456789/581419 |
DOI: | 10.1109/BigData50022.2020.9378489 | SDG/Keywords: | Big data; Convolution; Convolutional neural networks; Entropy; Hardware/software; Label recognition; Model compression; Mutual information method; Mutual informations; Network compression; Output filters; Pruning algorithms; Classification (of information) |
Appears in Collections: | Department of Computer Science and Information Engineering |
Items in this IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.