Layer-Wise Network Compression Using Gaussian Mixture Model

Lee, Eunho and Hwang, Youngbae (2021) Layer-Wise Network Compression Using Gaussian Mixture Model. Electronics, 10 (1). p. 72. ISSN 2079-9292

electronics-10-00072-v3.pdf - Published Version
Abstract

Due to the large number of parameters and heavy computation, real-time operation of deep learning on low-performance embedded boards remains difficult. Network pruning is an effective method for reducing the number of parameters without additional modification of the network structure. However, conventional methods prune redundant parameters at the same rate across all layers. This can cause a bottleneck problem that degrades performance, because the minimum number of parameters required differs from layer to layer. We propose a layer-adaptive pruning method based on modeling of the weight distribution. By applying a Gaussian Mixture Model (GMM), we can accurately measure the amount of weights close to zero. Layer selection and pruning are performed iteratively until the target compression rate is reached. The layer selection in each iteration considers both the timing to reach the target compression rate and the degree of weight pruning. We apply the proposed network compression method to image classification and semantic segmentation to show its effectiveness. In the experiments, the proposed method achieves a higher compression rate than previous methods while maintaining accuracy.
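The exact layer-selection criterion and GMM configuration are described in the full paper; the following is a minimal sketch of the general idea only, assuming a PyTorch model and scikit-learn's GaussianMixture. The function names (near_zero_fraction, prune_layer, layerwise_prune), the number of mixture components, the per-step pruning increment, and the stopping rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture


def near_zero_fraction(layer: nn.Module, n_components: int = 3) -> float:
    """Fit a GMM to a layer's weights and return the mixing weight of the
    component whose mean is closest to zero (a proxy for prunable weights)."""
    w = layer.weight.detach().cpu().numpy().reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(w)
    zero_comp = int(np.argmin(np.abs(gmm.means_.ravel())))
    return float(gmm.weights_[zero_comp])


def prune_layer(layer: nn.Module, rate: float) -> None:
    """Zero out the smallest-magnitude weights of a layer at the given rate."""
    w = layer.weight.data
    k = int(rate * w.numel())
    if k == 0:
        return
    threshold = w.abs().flatten().kthvalue(k).values
    layer.weight.data = torch.where(w.abs() <= threshold, torch.zeros_like(w), w)


def layerwise_prune(model: nn.Module, target_rate: float,
                    step: float = 0.05, max_iters: int = 1000) -> None:
    """Iteratively pick the layer with the largest near-zero GMM mass and
    increase its pruning rate until the overall sparsity reaches the target."""
    layers = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    total = sum(l.weight.numel() for l in layers)
    rates = {id(l): 0.0 for l in layers}
    for _ in range(max_iters):
        zeros = sum(int((l.weight == 0).sum().item()) for l in layers)
        if zeros / total >= target_rate:
            break
        scores = [near_zero_fraction(l) for l in layers]
        best = layers[int(np.argmax(scores))]
        rates[id(best)] = min(rates[id(best)] + step, 1.0)
        prune_layer(best, rates[id(best)])
```

In this sketch the near-zero GMM component serves as the per-layer score, so layers whose weight distributions concentrate around zero are pruned earlier and more aggressively than layers whose weights are spread out, which is the layer-adaptive behavior the abstract describes.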

Item Type: Article
Uncontrolled Keywords: network pruning; network compression; Gaussian mixture model
Subjects: STM Repository > Engineering
Depositing User: Managing Editor
Date Deposited: 16 Jul 2024 06:56
Last Modified: 16 Jul 2024 06:56
URI: http://classical.goforpromo.com/id/eprint/711
