https://scholars.lib.ntu.edu.tw/handle/123456789/642189
Title: | Exploiting Fine-Grained Structured Pruning for Efficient Inference on CNN Model | Authors: | Cheng-Hung Wu | Ding-Yong Hong | Pangfeng Liu | Jan-Jan Wu |
Keywords: | deep neural network | weight pruning | Issue Date: | 1-Jan-2023 | Source Publication: | Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS | Abstract: | Weight pruning removes redundant or unimportant weights from a network. It can reduce the size and computational cost of neural networks while preserving their accuracy. In this paper, we aim to design efficient CNN models with N:M pruning on the CPU. We propose a dynamic programming algorithm that finds a good sparsity ratio for every layer under a total time budget, based on the execution times and L1 norms of the layers. After deciding the sparsity ratio of each layer, we leverage the auto-tuner of the TVM compiler to search for an optimized schedule of the pruned convolution, accelerating the fine-grained pruned models. Experimental results show that our scheme achieves a 0.35% accuracy improvement and a 1.55× speedup over the dense model on VGG-16 with ImageNet. |
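The N:M pruning named in the abstract keeps at most N nonzero weights in every group of M consecutive weights (e.g. 2:4 sparsity). A minimal magnitude-based sketch of this idea, assuming grouping along the flattened weight vector; the function name and NumPy formulation are illustrative, not the paper's implementation:

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    """Zero out all but the n largest-magnitude weights in each
    contiguous group of m weights (N:M structured sparsity).
    Assumes the total number of weights is divisible by m."""
    w = np.asarray(weights, dtype=float)
    groups = w.reshape(-1, m)
    # indices of the (m - n) smallest-magnitude entries per group
    drop = np.argsort(np.abs(groups), axis=1)[:, : m - n]
    mask = np.ones_like(groups)
    np.put_along_axis(mask, drop, 0.0, axis=1)
    return (groups * mask).reshape(w.shape)
```

For example, `nm_prune([1, -3, 0.5, 2, 4, -0.1, 0.2, 5])` keeps the two largest-magnitude entries of each group of four, yielding `[0, -3, 0, 2, 4, 0, 0, 5]`. Hardware or compiler support (such as the TVM schedules tuned in the paper) is what turns this regular sparsity pattern into actual speedup.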
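The per-layer sparsity selection described in the abstract can be viewed as a knapsack-style problem: pick one sparsity option per layer so that total execution time stays within the budget while the retained importance (here, L1 norm) is maximized. A hedged dynamic-programming sketch under that reading, assuming pre-measured integer time costs per option; this illustrates the general shape of such a DP, not the authors' exact algorithm:

```python
def allocate_sparsity(layers, budget):
    """Choose one (time_cost, retained_l1, ratio) option per layer,
    maximizing total retained L1 norm with total time <= budget.
    layers: list of option lists; returns (best_score, ratios) or
    (None, None) if no feasible assignment exists."""
    NEG = float("-inf")
    dp = [NEG] * (budget + 1)   # dp[t]: best score at exact total time t
    dp[0] = 0.0
    choice = [[None] * (budget + 1) for _ in layers]  # backtracking info
    for i, opts in enumerate(layers):
        ndp = [NEG] * (budget + 1)
        for t in range(budget + 1):
            if dp[t] == NEG:
                continue
            for cost, score, ratio in opts:
                nt = t + cost
                if nt <= budget and dp[t] + score > ndp[nt]:
                    ndp[nt] = dp[t] + score
                    choice[i][nt] = (t, ratio)  # remember predecessor time
        dp = ndp
    best_t = max(range(budget + 1), key=lambda t: dp[t])
    if dp[best_t] == NEG:
        return None, None
    # Walk the choice table backwards to recover per-layer ratios.
    ratios, t = [], best_t
    for i in reversed(range(len(layers))):
        prev_t, ratio = choice[i][t]
        ratios.append(ratio)
        t = prev_t
    return dp[best_t], ratios[::-1]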
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/642189 | ISBN: | 9798350330717 | ISSN: | 15219097 | DOI: | 10.1109/ICPADS60453.2023.00398 |
Appears in Collections: | Department of Computer Science and Information Engineering |
Items in this IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.