
My ICIP 2021 Schedule


Paper Detail

Paper ID: MLR-APPL-IVSMR-1.10
Paper Title: DIFFERENTIABLE DYNAMIC CHANNEL ASSOCIATION FOR KNOWLEDGE DISTILLATION
Authors: Qiankun Tang, Zhejiang Lab, China; Xiaogang Xu, Zhejiang Gongshang University, China; Jun Wang, Zhejiang Lab, China
Session: MLR-APPL-IVSMR-1: Machine learning for image and video sensing, modeling and representation 1
Location: Area C
Session Time: Tuesday, 21 September, 13:30 - 15:00
Presentation Time: Tuesday, 21 September, 13:30 - 15:00
Presentation: Poster
Topic: Applications of Machine Learning: Machine learning for image & video sensing, modeling, and representation
IEEE Xplore Open Preview: available in IEEE Xplore
Abstract: Knowledge distillation is an effective model compression technique that encourages a small student model to mimic the features or probabilistic outputs of a large teacher model. Existing feature-based distillation methods mainly focus on formulating enriched representations, while naively addressing the channel-dimension gap and adopting handcrafted channel association strategies between teacher and student. This not only introduces extra parameters and computational cost, but may also transfer irrelevant information to the student. In this paper, we present a differentiable and efficient Dynamic Channel Association (DCA) mechanism, which automatically associates appropriate teacher channels with each student channel. DCA also enables each student channel to distill knowledge from multiple teacher channels in a weighted manner. Extensive experiments on classification tasks, with various combinations of teacher and student network architectures, demonstrate the effectiveness of the proposed approach.
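The abstract describes distilling each student channel from a weighted combination of teacher channels through a learned association. The paper's exact formulation is not given here, so the following NumPy sketch is only an illustration of the general idea, assuming flattened feature maps, a hypothetical learnable logit matrix `logits`, a row-wise softmax to produce association weights, and a mean-squared distillation loss; all names and shapes are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dca_loss(f_s, f_t, logits):
    """Toy weighted channel-association distillation loss.

    f_s:    student features, shape (C_s, H*W)
    f_t:    teacher features, shape (C_t, H*W)
    logits: hypothetical learnable association scores, shape (C_s, C_t)
    """
    w = softmax(logits, axis=1)   # each student channel gets a distribution over teacher channels
    f_t_assoc = w @ f_t           # weighted mix of teacher channels, shape (C_s, H*W)
    return np.mean((f_s - f_t_assoc) ** 2)

# Illustrative shapes: 16 student channels, 32 teacher channels, 8x8 spatial map.
rng = np.random.default_rng(0)
f_s = rng.standard_normal((16, 64))
f_t = rng.standard_normal((32, 64))
logits = rng.standard_normal((16, 32))
loss = dca_loss(f_s, f_t, logits)
```

Because the softmax is differentiable, such an association matrix could be optimized jointly with the student network by gradient descent, which is the sense in which the mechanism is "differentiable".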