ABSTRACT
Deep learning has received extensive attention in recent years. During the training of a deep learning model, the loss function is a critical objective that measures the discrepancy between the predicted values and the distribution of the real data; it is also an important indicator for evaluating model performance. The most popular loss functions in deep learning include the mean squared error (MSE) and the cross-entropy error. The choice of loss function therefore has a non-negligible influence on the optimizer. The most common optimizers include stochastic gradient descent (SGD), mini-batch stochastic gradient descent (MBGD), and adaptive moment estimation (Adam). Among them, MBGD is widely used because it balances accuracy and speed. However, setting the batch size is a challenge: if the batch size is too large, computation and memory costs increase accordingly, while with a small batch size the gradient descent process oscillates more. This paper therefore proposes an improved loss function, named truncated cross-entropy, to stabilize the convergence of the optimizer. Experiments show that the proposed method speeds up training convergence, reduces oscillation, and achieves performance similar to large-batch training with a relatively small batch size.
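To make the terms above concrete, the sketch below contrasts the standard multi-class cross-entropy with one plausible reading of a "truncated" variant that simply caps the per-sample loss. The abstract does not specify how the truncation is defined, so the cap, the threshold `tau`, and the function name `truncated_cross_entropy` are illustrative assumptions rather than the paper's actual formulation.

```python
# Minimal NumPy sketch: standard cross-entropy vs. a hypothetical truncated
# variant that caps the per-sample loss at a threshold tau.
# NOTE: the truncation rule and `tau` are assumptions for illustration only;
# the paper's definition of truncated cross-entropy is not given in the abstract.
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Standard multi-class cross-entropy: -log p(correct class) per sample.
    probs = softmax(logits)
    n = logits.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12)

def truncated_cross_entropy(logits, labels, tau=4.0):
    # Hypothetical truncation: clip each per-sample loss at tau so that a few
    # hard (or mislabeled) samples in a small mini-batch cannot dominate the
    # mini-batch gradient and make the descent direction oscillate.
    return np.minimum(cross_entropy(logits, labels), tau)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(8, 10))       # mini-batch of 8 samples, 10 classes
    labels = rng.integers(0, 10, size=8)
    print("plain CE mean loss    :", cross_entropy(logits, labels).mean())
    print("truncated CE mean loss:", truncated_cross_entropy(logits, labels).mean())
```

Under this reading, capping the loss limits the gradient contribution of outlier samples, which is one way a small mini-batch could be made less prone to oscillation; it is offered only as a sketch of the general idea, not as the method described in the paper.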