Lee, Eunho and Hwang, Youngbae (2021) Layer-Wise Network Compression Using Gaussian Mixture Model. Electronics, 10 (1). p. 72. ISSN 2079-9292
Abstract
Due to the large number of parameters and heavy computation, real-time operation of deep learning on low-performance embedded boards remains difficult. Network pruning is an effective way to reduce the number of parameters without modifying the network structure. However, the conventional method prunes redundant parameters at the same rate in every layer. This can cause a bottleneck that degrades performance, because the minimum number of parameters a layer needs differs from layer to layer. We propose a layer-adaptive pruning method based on modeling the weight distribution. By applying a Gaussian mixture model (GMM), we can accurately measure the proportion of weights close to zero. Layer selection and pruning are performed iteratively until the target compression rate is reached. The layer selection in each iteration considers the timing to reach the target compression rate and the degree of weight pruning. We apply the proposed network compression method to image classification and semantic segmentation to demonstrate its effectiveness. In the experiments, the proposed method achieves a higher compression rate than previous methods while maintaining accuracy.
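The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-component mixture, the use of the narrower component's mixing weight as the "near zero" estimate, the greedy layer choice, and the names `fit_gmm_1d`, `near_zero_fraction`, `prune_layerwise`, and `step` are all assumptions made for illustration. In particular, the paper's selection rule also accounts for the timing to reach the target rate, which this sketch omits.

```python
import math


def fit_gmm_1d(xs, iters=30):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (mixing weights, means, standard deviations).
    """
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n) or 1e-6
    # Assumed initialisation: one narrow component at zero, one wide component.
    pis, mus, sds = [0.5, 0.5], [0.0, mean], [0.1 * std, std]
    for _ in range(iters):
        # E-step: responsibilities (the 1/sqrt(2*pi) factor cancels out).
        resp = []
        for x in xs:
            p = [pi * math.exp(-0.5 * ((x - mu) / sd) ** 2) / sd
                 for pi, mu, sd in zip(pis, mus, sds)]
            total = sum(p) or 1e-300
            resp.append([v / total for v in p])
        # M-step: re-estimate mixing weight, mean, and spread per component.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            pis[k] = nk / n
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return pis, mus, sds


def near_zero_fraction(xs):
    """Estimate the share of weights near zero as the mixing weight of the
    narrower component (an assumption of this sketch)."""
    pis, _, sds = fit_gmm_1d(xs)
    return pis[0] if sds[0] <= sds[1] else pis[1]


def prune_layerwise(layers, target_rate, step=0.05, min_size=10):
    """Iteratively prune until the overall compression rate reaches target_rate.

    layers: dict mapping a layer name to a flat list of weights.
    Returns a dict of surviving weights per layer, sorted by magnitude
    (original positions are not tracked in this sketch).
    """
    total = sum(len(w) for w in layers.values())
    kept = {name: list(w) for name, w in layers.items()}
    while 1 - sum(len(w) for w in kept.values()) / total < target_rate:
        candidates = [n for n in kept if len(kept[n]) > min_size]
        if not candidates:
            break  # nothing left that is safe to prune
        # Greedy stand-in for the paper's selection rule: pick the layer
        # whose remaining weights look most concentrated near zero.
        name = max(candidates, key=lambda n: near_zero_fraction(kept[n]))
        ordered = sorted(kept[name], key=abs)
        n_drop = max(1, int(step * len(ordered)))
        kept[name] = ordered[n_drop:]  # drop the smallest-magnitude weights
    return kept
```

Calling `prune_layerwise({"conv1": [...], "fc": [...]}, target_rate=0.3)` returns the surviving weights per layer. Because the layer is re-selected after every pruning step, layers with many near-zero weights tend to be pruned more heavily than layers whose weights are spread out, which is the layer-adaptive behaviour the abstract argues for.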
Item Type: | Article |
---|---|
Uncontrolled Keywords: | network pruning; network compression; Gaussian mixture model |
Subjects: | STM Repository > Engineering |
Depositing User: | Managing Editor |
Date Deposited: | 16 Jul 2024 06:56 |
Last Modified: | 16 Jul 2024 06:56 |
URI: | http://classical.goforpromo.com/id/eprint/711 |