research-article
DOI: 10.1145/3507548.3507580

Regression Algorithm Based on Self-Distillation and Ensemble Learning


ABSTRACT

Low-dimensional feature regression is a common problem in many disciplines, such as chemistry, kinetics, and medicine. Most common solutions are based on classical machine learning, but as deep learning evolves there is room for performance improvement. A few researchers have proposed deep learning-based solutions such as ResidualNet, GrowNet, and EnsembleNet. The latter two are boosting methods, which are better suited to shallow networks; their performance is largely determined by the first model, and subsequent boosting steps have limited effect. We propose a method based on self-distillation and bagging: it selects a well-performing base model, distills several student models from it with an appropriate regression distillation algorithm, and averages the outputs of these student models as the final result. This ensemble method can be applied to any form of network. On the CASP dataset it achieves good results, improving R² (coefficient of determination) from 0.65 to 0.70 compared with the best base model, ResidualNet.
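The pipeline summarised in the abstract can be made concrete with a short sketch. The listing below (PyTorch) is a minimal illustration under stated assumptions, not the authors' implementation: it trains a base teacher regressor, distills a few students of the same architecture against a mix of the ground truth and the teacher's outputs, and averages the students' predictions. The network shape, loss weighting alpha, number of students, and synthetic data are illustrative placeholders.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic low-dimensional regression data (stand-in for a set such as CASP; 9 features here).
X = torch.randn(2048, 9)
y = X[:, :3].sum(dim=1, keepdim=True) + 0.1 * torch.randn(2048, 1)

def make_mlp(in_dim=9, hidden=64):
    # Small MLP regressor; a placeholder for whichever base model performs best (e.g. ResidualNet).
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, 1),
    )

def train(model, X, y, soft_targets=None, alpha=0.5, epochs=200, lr=1e-3):
    # Plain MSE training; when a teacher's outputs are supplied, add an MSE term
    # towards them (a simple form of regression distillation).
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        pred = model(X)
        loss = mse(pred, y)
        if soft_targets is not None:
            loss = alpha * loss + (1.0 - alpha) * mse(pred, soft_targets)
        loss.backward()
        opt.step()
    return model

# 1) Train the well-performing base (teacher) model.
teacher = train(make_mlp(), X, y)
with torch.no_grad():
    soft = teacher(X)

# 2) Self-distill several students: same architecture, different random initialisations.
students = [train(make_mlp(), X, y, soft_targets=soft) for _ in range(3)]

# 3) Average the students' outputs as the final bagging-style prediction.
with torch.no_grad():
    ensemble_pred = torch.stack([s(X) for s in students]).mean(dim=0)
    print("ensemble MSE:", nn.MSELoss()(ensemble_pred, y).item())

Averaging independently distilled students is what makes this a bagging-style ensemble rather than a boosting one, which is the contrast the abstract draws with GrowNet and EnsembleNet.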

References

  1. Smola, Alex J. and Schölkopf, Bernhard. 2004. A tutorial on support vector regression. Statistics and Computing, 14, 3 (August 2004), 199-222. https://doi.org/10.1023/b:stco.0000035301.49549.88
  2. Criminisi, Antonio, Shotton, Jamie and Konukoglu, Ender. 2011. Decision forests for classification, regression, density estimation, manifold learning and semi-supervised learning. Microsoft Research Cambridge, Tech. Rep. MSR-TR-2011-114, 5, 6 (2011), 12.
  3. Freund, Yoav and Schapire, Robert E. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 1 (August 1997), 119-139. https://doi.org/10.1006/jcss.1997.1504
  4. Friedman, Jerome H. 2001. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29, 5 (October 2001), 1189-1232. https://doi.org/10.1214/aos/1013203451
  5. Chen, Dongwei, Hu, Fei, Nian, Guokui and Yang, Tiantian. 2020. Deep residual learning for nonlinear regression. Entropy, 22, 2 (February 2020), 193. https://doi.org/10.3390/e22020193
  6. Badirli, Sarkhan, Liu, Xuanqing, Xing, Zhengming, Bhowmik, Avradeep, Doan, Khoa and Keerthi, Sathiya S. 2020. Gradient boosting neural networks: GrowNet. arXiv preprint arXiv:2002.07971 (2020).
  7. Park, Minyoung, Lee, Seungyeon, Hwang, Sangheum and Kim, Dohyun. 2020. Additive Ensemble Neural Networks. IEEE Access, 8 (2020), 113192-113199. https://doi.org/10.1109/access.2020.3003748
  8. Martínez-Muñoz, Gonzalo. 2019. Sequential training of neural networks with gradient boosting. arXiv preprint arXiv:1909.12098 (2019).
  9. Furlanello, Tommaso, Lipton, Zachary, Tschannen, Michael, Itti, Laurent and Anandkumar, Anima. 2018. Born again neural networks. In Proceedings of the International Conference on Machine Learning, 1607-1616.
  10. Yuen, Kevin Kam Fung. 2017. Towards multiple regression analyses for relationships of air quality and weather. Journal of Advances in Information Technology, 8, 2 (May 2017), 135-140. https://doi.org/10.12720/jait.8.2.135-140
  11. Lo, Wai Lun, Zhu, Meimei and Fu, Hong. 2020. Meteorology visibility estimation by using multi-support vector regression method. Journal of Advances in Information Technology, 11, 2 (May 2020), 40-47. https://doi.org/10.12720/jait.11.2.40-47
  12. Daghistani, Tahani and Alshammari, Riyad. 2020. Comparison of statistical logistic regression and random forest machine learning techniques in predicting diabetes. Journal of Advances in Information Technology, 11, 2 (May 2020), 78-83. https://doi.org/10.12720/jait.11.2.78-83
  13. He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing and Sun, Jian. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778.
  14. Sahoo, Doyen, Pham, Quang, Lu, Jing and Hoi, Steven C. H. 2017. Online deep learning: Learning deep neural networks on the fly. arXiv preprint arXiv:1711.03705 (July 2017). https://doi.org/10.24963/ijcai.2018/369
  15. Zhang, Si-si, Liu, Jian-wei, Zuo, Xin, Lu, Run-kun and Lian, Si-ming. 2021. Online deep learning based on auto-encoder. Applied Intelligence (2021), 1-20.
  16. Hansen, Lars Kai and Salamon, Peter. 1990. Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 10 (1990), 993-1001.
  17. Ganaie, M. A. and Hu, Minghui. 2021. Ensemble deep learning: A review. arXiv preprint arXiv:2104.02395 (2021).
  18. Brownlee, Jason. 2018. Ensemble Learning Methods for Deep Learning Neural Networks. December 19, 2018. https://machinelearningmastery.com/ensemble-methods-for-deep-learning-neural-networks/
  19. Hinton, Geoffrey, Vinyals, Oriol and Dean, Jeff. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).
  20. Xie, Qizhe, Luong, Minh-Thang, Hovy, Eduard and Le, Quoc V. 2020. Self-training with noisy student improves ImageNet classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10687-10698. https://doi.org/10.1109/cvpr42600.2020.01070
  21. Cho, Jang Hyun and Hariharan, Bharath. 2019. On the efficacy of knowledge distillation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 4794-4802. https://doi.org/10.1109/iccv.2019.00489
  22. Yang, Chenglin, Xie, Lingxi, Qiao, Siyuan and Yuille, Alan L. 2019. Training deep neural networks in generations: A more tolerant teacher educates better students. In Proceedings of the AAAI Conference on Artificial Intelligence, 5628-5635. https://doi.org/10.1609/aaai.v33i01.33015628
  23. Chen, Guobin, Choi, Wongun, Yu, Xiang, Han, Tony and Chandraker, Manmohan. 2017. Learning efficient object detection models with knowledge distillation. Advances in Neural Information Processing Systems, 30 (2017).
  24. Saputra, Muhamad Risqi U., De Gusmao, Pedro P. B., Almalioglu, Yasin, Markham, Andrew and Trigoni, Niki. 2019. Distilling knowledge from a deep pose regressor network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 263-272. https://doi.org/10.1109/iccv.2019.00035
  25. Rana, Prashant Singh. 2013. Physicochemical Properties of Protein Tertiary Structure Data Set. March 31, 2013. https://archive.ics.uci.edu/ml/datasets/Physicochemical+Properties+of+Protein+Tertiary+Structure
  26. harlfoxem. 2016. House Sales in King County, USA. https://www.kaggle.com/harlfoxem/housesalesprediction
  27. Arzamasov, Vadim. 2018. Electrical Grid Stability Simulated Data Data Set. November 16, 2018. https://archive.ics.uci.edu/ml/datasets/Electrical+Grid+Stability+Simulated+Data+
  28. Kamath, R. S. and Kamat, R. K. 2018. Modelling Physicochemical Properties for Protein Tertiary Structure Prediction: Performance Analysis of Regression Models (December 2018).

Published in

CSAI '21: Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence
December 2021, 437 pages
ISBN: 9781450384155
DOI: 10.1145/3507548
Copyright © 2021 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 9 March 2022
