Login Paper Search My Schedule Paper Index Help

My ICIP 2021 Schedule

Note: Your custom schedule will not be saved unless you create a new account or login to an existing account.
  1. Create a login based on your email (takes less than one minute)
  2. Perform 'Paper Search'
  3. Select papers that you desire to save in your personalized schedule
  4. Click on 'My Schedule' to see the current list of selected papers
  5. Click on 'Printable Version' to create a separate window suitable for printing (the header and menu will appear, but will not actually print)

Paper Detail

Paper IDSS-NNC.11
Paper Title Comprehensive Online Network Pruning via Learnable Scaling Factors
Authors Muhammad Umair Haider, Murtaza Taj, Lahore University of Management Sciences, Pakistan
SessionSS-NNC: Special Session: Neural Network Compression and Compact Deep Features
LocationArea B
Session Time:Tuesday, 21 September, 08:00 - 09:30
Presentation Time:Tuesday, 21 September, 08:00 - 09:30
Presentation Poster
Topic Special Sessions: Neural Network Compression and Compact Deep Features: From Methods to Standards
IEEE Xplore Open Preview  Click here to view in IEEE Xplore
Abstract One of the major challenges in deploying deep neural network architectures is their size which has an adverse effect on their inference time and memory requirements. Deep CNNs can either be pruned width-wise by removing filters or depth-wise by removing layers and blocks. Width wise pruning (filter pruning) is commonly performed via learnable gates or switches and sparsity regularizers whereas pruning of layers has so far been performed arbitrarily by manually designing a smaller network usually referred to as a student network. We propose a comprehensive pruning strategy that can perform both width-wise as well as depth-wise pruning. This is achieved by introducing gates at different granularities (neuron, filter, layer, block) which are then controlled via an objective function that simultaneously performs pruning at different granularity during each forward pass. Our approach is applicable to wide-variety of architectures without any constraints on spatial dimensions or connection type (sequential, residual, parallel or inception). Our method has resulted in a compression ratio of 70% to 90% without noticeable loss in accuracy when evaluated on benchmark datasets.