Impact Statement:
Advanced deep neural networks often rely on complex architectures that severely hinder their deployment on resource-limited devices. Knowledge distillation, one of the most popular model compression techniques, aims to transfer the knowledge of one or more cumbersome teacher models to a compact student model. However, existing KD methods either laboriously select a particular teacher or simply assign equal or fixed weights to multiple teachers, resulting in a tedious teacher selection procedure and poor distillation efficiency. In this article, we therefore propose a novel reinforcement learning-based KD approach to overcome these limitations, and our experimental results demonstrate that it consistently outperforms other state-of-the-art methods on two real-world tasks. The proposed method achieves average improvements of 5.7% and 15.7% in terms of RMSE and score, respectively, on the machine RUL prediction task, and 8.1% in terms of mean localization error on the indoor localization task.
Abstract:
As one of the most popular and effective methods in model compression, knowledge distillation (KD) attempts to transfer knowledge from one or more large-scale networks (i.e., teachers) to a compact network (i.e., the student). In the multiteacher scenario, existing methods typically assign equal or fixed weights to the different teacher models during distillation, which can be inefficient because teachers may perform differently, or even contradictorily, on different training samples. To address this issue, we propose a novel reinforced knowledge distillation method with negatively correlated teachers, which are generated via negative correlation learning. Negative correlation learning encourages the teachers to learn different aspects of the data, so that their ensemble is more comprehensive and better suited to multiteacher KD. Subsequently, a reinforced KD algorithm is proposed to dynamically employ the proper teachers for different training instances via a dueling double deep Q-network (DDQN). ...
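To make the distillation objective concrete, the sketch below shows a per-sample weighted multi-teacher KD loss in PyTorch. It is only a minimal illustration under assumed names (weighted_multi_teacher_kd_loss, temperature T, weight matrix w): the per-sample teacher weights are stand-ins for the selections a policy such as the dueling DDQN would produce, and the softmax/KL form assumes a classification-style distillation loss rather than the authors' exact formulation.

```python
# Minimal sketch: per-sample weighted multi-teacher distillation loss.
# Function and variable names are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def weighted_multi_teacher_kd_loss(student_logits, teacher_logits_list, weights, T=4.0):
    """
    student_logits: (B, C) raw outputs of the student network.
    teacher_logits_list: list of K tensors, each (B, C), one per teacher.
    weights: (B, K) per-sample teacher weights, e.g. produced by a selection
             policy (a dueling double DQN in the paper); rows should sum to 1.
    T: softmax temperature used for distillation.
    """
    log_p_student = F.log_softmax(student_logits / T, dim=1)       # (B, C)
    kd = student_logits.new_zeros(student_logits.size(0))          # (B,)
    for k, t_logits in enumerate(teacher_logits_list):
        p_teacher = F.softmax(t_logits / T, dim=1)                 # (B, C)
        # Per-sample KL divergence between teacher k and the student.
        kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=1)
        kd = kd + weights[:, k] * kl
    return (T * T) * kd.mean()

# Usage example: 3 teachers, batch of 8, 10 classes; uniform weights stand in
# for the per-sample teacher selection made by the reinforcement-learning agent.
B, C, K = 8, 10, 3
student_out = torch.randn(B, C)
teacher_outs = [torch.randn(B, C) for _ in range(K)]
w = torch.full((B, K), 1.0 / K)
loss = weighted_multi_teacher_kd_loss(student_out, teacher_outs, w)
```

In this form, replacing the uniform weights with instance-dependent weights is what lets different teachers dominate the distillation signal on different training samples.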
Published in: IEEE Transactions on Artificial Intelligence ( Volume: 5, Issue: 6, June 2024)