[1] Song Chen, Wei Zizhong, Jiang Kai, et al. Pruning Research on New Lightweight Neural Network Structures Paradigm[J]. Journal of Nanjing Normal University (Engineering and Technology), 2023, 23(04): 29-36. [doi:10.3969/j.issn.1672-1292.2023.04.004]





Pruning Research on New Lightweight Neural Network Structures Paradigm
Song Chen, Wei Zizhong, Jiang Kai, Li Rui, Duan Qiang
(Inspur Academy of Science and Technology, Jinan 250014, China)
Keywords: MobileOne; SSD; depthwise separable convolution; pruning; TinyML
With the spread of deep learning, object detection in image processing has advanced rapidly, and the growing popularity of large models keeps pushing model accuracy higher. These large models, however, are difficult to deploy on edge devices. For object detection at the edge, a network combining MobileOne-S0 and SSD is proposed; after reparameterization, it takes a VGG-style form that is used for inference. The network is then pruned under three criteria: unstructured weight pruning, structured BN pruning, and structured Taylor pruning. The results show that weight pruning performs worst. The two structured methods reduce FLOPs and parameter count at almost the same rate as sparsity increases, but accuracy falls more slowly under BN pruning than under Taylor pruning, while Taylor pruning reduces peak memory the most. At an accuracy drop of about 10%, BN pruning compresses the parameter count by 22.3 times, FLOPs by 9.4 times, and peak memory usage by 2.5 times. The final model is only 123.88 kB, which makes it easier to deploy on low-power, MCU-class end devices suited to TinyML.
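The three pruning criteria named in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used forms of each criterion: absolute weight magnitude for unstructured pruning, the absolute BN scaling factor for BN pruning, and the absolute mean of activation times gradient for first-order Taylor pruning. The abstract does not give the exact formulations used in the paper, so the function names, the toy values, and the ranking rule below are illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence


def magnitude_mask(weights: Sequence[float], sparsity: float) -> List[int]:
    """Unstructured weight pruning: mask out the smallest-|w| individual weights."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def bn_channel_scores(gammas: Sequence[float]) -> List[float]:
    """BN pruning (assumed form): score each channel by |gamma| of its BN layer."""
    return [abs(g) for g in gammas]


def taylor_channel_scores(
    activations: Sequence[Sequence[float]],
    gradients: Sequence[Sequence[float]],
) -> List[float]:
    """Taylor pruning (assumed first-order form): |mean(activation * gradient)| per channel."""
    scores = []
    for act, grad in zip(activations, gradients):
        mean_product = sum(a * g for a, g in zip(act, grad)) / len(act)
        scores.append(abs(mean_product))
    return scores


def keep_channels(scores: Sequence[float], sparsity: float) -> List[int]:
    """Structured pruning: drop the lowest-scoring channels, return surviving indices."""
    n_prune = int(len(scores) * sparsity)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[n_prune:])


# Toy values (hypothetical, not taken from the paper).
toy_weights = [0.5, -0.05, 0.3, 0.01]
toy_gammas = [0.9, 0.02, 0.4, 0.1]
toy_acts = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0], [0.1, 0.2]]
toy_grads = [[0.1, -0.1], [0.4, 0.4], [0.0, 0.05], [1.0, 1.0]]

weight_mask = magnitude_mask(toy_weights, 0.5)
bn_kept = keep_channels(bn_channel_scores(toy_gammas), 0.5)
taylor_kept = keep_channels(taylor_channel_scores(toy_acts, toy_grads), 0.5)
```

On these toy values the BN and Taylor criteria keep different channels at the same sparsity. The sketch also shows the structural difference between the approaches: the unstructured mask only zeroes individual weights and leaves tensor shapes unchanged, whereas the structured criteria remove whole channels, which shrinks the layer itself.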




Corresponding author: Li Rui, Ph.D., professor-level senior engineer; research interests: data mining and machine learning. E-mail: lirui01@inspur.com
Last Update: 2023-12-15