Abstract
In recent years, deep learning models have achieved remarkable generalization on computer vision tasks, obtaining excellent results in fine-grained classification problems. Sophisticated approaches based on patch-level discriminative feature learning have been proposed in the literature, boosting model performance and achieving state-of-the-art results on well-known datasets. The Cross-Entropy (CE) loss function is commonly used to enhance the discriminative power of deep learned features, encouraging separability between classes. However, observing the activation maps generated by these models in the hidden layers, we find that many image regions with low discriminative content have a high activation response, which could lead to misclassifications. To address this problem, we propose a loss function called Gaussian Mixture Centers (GMC) loss, building on the idea that data follow multiple unimodal distributions. We aim to reduce variances by considering many centers per class, using information from the hidden layers of a deep model, and decreasing the high response from the unnecessary image areas detected in the baselines. Using CE and GMC loss jointly, we improve the generalization of the learned model, outperforming the baselines in several use cases. We show the effectiveness of our approach through experiments on the CUB-200-2011, FGVC-Aircraft, and Stanford-Dogs benchmarks, using recent Convolutional Neural Network (CNN) architectures.
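The abstract does not give the GMC formula, so the following is only a minimal sketch of the general idea of combining CE with a term that uses several centers per class. It assumes, purely for illustration, that the extra term is the squared distance from a hidden-layer feature to the nearest center of its own class; the function names, the weight `lam`, and the toy centers are all hypothetical and are not taken from the chapter.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def nearest_center_penalty(feature, centers):
    """Squared Euclidean distance from a feature to its closest center.

    Assumption: this stands in for the multi-center term; the actual
    GMC loss may be defined differently.
    """
    return min(
        sum((f - c) ** 2 for f, c in zip(feature, center))
        for center in centers
    )


def joint_loss(logits, feature, label, class_centers, lam=0.1):
    """CE plus a weighted multi-center term (lam is a hypothetical weight)."""
    ce = cross_entropy(logits, label)
    penalty = nearest_center_penalty(feature, class_centers[label])
    return ce + lam * penalty


# Toy data: 2 classes, 2 centers per class, 2-D hidden features.
class_centers = {
    0: [[0.0, 0.0], [1.0, 1.0]],
    1: [[5.0, 5.0], [6.0, 4.0]],
}

# Same logits and label; only the hidden feature moves away from the
# class-0 centers, so only the center term changes.
loss_near = joint_loss([2.0, 0.5], [0.9, 1.1], 0, class_centers)
loss_far = joint_loss([2.0, 0.5], [3.0, 3.0], 0, class_centers)
```

In this toy setting, a feature that drifts away from every center of its class raises the joint loss even when the logits are unchanged, which is the kind of within-class variance reduction the abstract describes.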
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
La Grassa, R., Gallo, I., Vetro, C., Landro, N. (2021). Learning to Navigate in the Gaussian Mixture Surface. In: Tsapatsoulis, N., Panayides, A., Theocharides, T., Lanitis, A., Pattichis, C., Vento, M. (eds) Computer Analysis of Images and Patterns. CAIP 2021. Lecture Notes in Computer Science, vol 13052. Springer, Cham. https://doi.org/10.1007/978-3-030-89128-2_40
DOI: https://doi.org/10.1007/978-3-030-89128-2_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89127-5
Online ISBN: 978-3-030-89128-2
eBook Packages: Computer Science, Computer Science (R0)