SSCRL: fine-grained object retrieval with switched shifted centralized ranking loss

Abstract

Image retrieval is a central task in computer vision that aims to browse, search, and return images from a large database of digital images in response to a retrieval query. Many works have focused on fine-grained object retrieval (FGOR) because it is extremely challenging and of great practical value. Fine-grained data exhibit large intra-class diversity and small inter-class diversity, so a powerful feature extractor such as a convolutional neural network (CNN) is needed to obtain fine-grained features that distinguish the subtle variations between classes. As an indispensable part of a CNN model, the loss function is of critical importance for feature extraction. In this work, building on the global structure loss function, we propose a variant of the softmax loss, named the switched shifted softmax loss, to reduce overfitting of the model. Comparative experiments with different backbone architectures verify that the developed loss function, despite involving only a simple transformation, enhances the fine-grained retrieval performance of deep learning methods. Furthermore, additional experiments on fine-grained object classification and person re-identification (re-ID) show that our method applies to a wide spectrum of other tasks.
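
As a concrete illustration of the general idea, the PyTorch sketch below shows one way a shifted softmax cross-entropy term could be implemented. The margin beta, the per-sample switch on the ground-truth probability, and the helper name shifted_softmax_loss are illustrative assumptions, not the exact SSCRL formulation, which is defined in the paper and its released code.

```python
import torch
import torch.nn.functional as F

def shifted_softmax_loss(logits, targets, beta=0.5):
    """Illustrative sketch (assumed form, not the exact SSCRL definition):
    shift the ground-truth logit of confidently classified samples by beta
    before applying softmax cross-entropy."""
    with torch.no_grad():
        probs = F.softmax(logits, dim=1)
        target_prob = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        switch = (target_prob > beta).float()  # switch the shift on per sample
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    shifted = logits - (beta * switch).unsqueeze(1) * one_hot  # shift only the target logit
    return F.cross_entropy(shifted, targets)

# Usage sketch: 200 classes, as in CUB-200-2011
logits = torch.randn(8, 200, requires_grad=True)
targets = torch.randint(0, 200, (8,))
loss = shifted_softmax_loss(logits, targets, beta=0.5)
loss.backward()
```

Under this assumed form, subtracting beta from an already confident ground-truth logit keeps the loss from vanishing for easy training samples, which is one way such a shift could counter overfitting.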

Notes

  1. http://www.vision.caltech.edu/visipedia/CUB-200-2011.html

  2. https://ai.stanford.edu/~jkrause/cars/car_dataset.html

  3. https://pytorch.org

  4. The values of α and λ are the same as in previous works [46, 52], and a switched value β ∈ [0.4, 0.6] obtains similar performance.

  5. https://github.com/KaiyangZhou/deep-person-reid

References

  1. Bell S, Bala K (2015) Learning visual similarity for product design with convolutional neural networks. ACM Trans Graph (TOG) 34(4):98

  2. Deng C, Liu X, Mu Y, Li J (2015) Large-scale multi-task image labeling with adaptive relevance discovery and feature hashing. Signal Process 112:137–145

  3. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on computer vision and pattern recognition. IEEE, pp 248–255

  4. Dubey A, Gupta O, Guo P, Raskar R, Farrell R, Naik N (2018) Pairwise confusion for fine-grained visual classification. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 70–86

  5. Dubey A, Gupta O, Raskar R, Naik N (2018) Maximum-entropy fine grained classification. In: Advances in neural information processing systems, pp 637–647

  6. Golik P, Doetsch P, Ney H (2013) Cross-entropy vs. squared error training: a theoretical and experimental comparison. In: Interspeech, vol 13, pp 1756–1760

  7. Gudivada VN, Raghavan VV (1995) Content based image retrieval systems. Computer 28(9):18–22

  8. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2961–2969

  9. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  10. Hoi SC, Liu W, Chang SF (2010) Semi-supervised distance metric learning for collaborative image retrieval and clustering. ACM Trans Multimed Comput Commun Appl (TOMM) 6(3):1–26

  11. Huang C, Loy CC, Tang X (2016) Local similarity-aware deep feature embedding. In: Advances in neural information processing systems, pp 1262–1270

  12. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708

  13. Jain AK, Vailaya A (1996) Image retrieval using color and shape. Pattern Recogn 29(8):1233–1244

  14. Khosla A, Jayadevaprakash N, Yao B, Li FF (2011) Novel dataset for fine-grained image categorization: Stanford dogs. In: Proc. CVPR workshop on fine-grained visual categorization (FGVC), vol 2

  15. Krause J, Stark M, Deng J, Fei-Fei L (2013) 3d object representations for fine-grained categorization. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp 554–561

  16. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105

  17. Li C, Deng C, Wang L, Xie D, Liu X (2019) Coupled cyclegan: Unsupervised hashing network for cross-modal retrieval. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 33, pp 176–183

  18. Liu W, Wang J, Ji R, Jiang YG, Chang SF (2012) Supervised hashing with kernels. In: 2012 IEEE Conference on computer vision and pattern recognition. IEEE, pp 2074–2081

  19. Liu Z, Li H, Zhou W, Zhao R, Tian Q (2014) Contextual hashing for large-scale image search. IEEE Trans Image Process 23(4):1606–1614

  20. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  21. Maji S, Kannala J, Rahtu E, Blaschko M, Vedaldi A (2013) Fine-grained visual classification of aircraft. Technical report

  22. Nilsback M, Zisserman A (2006) A visual vocabulary for flower classification. In: 2006 IEEE Computer society conference on computer vision and pattern recognition (CVPR’06), vol 2, pp 1447–1454. https://doi.org/10.1109/CVPR.2006.42

  23. Oh Song H, Jegelka S, Rathod V, Murphy K (2017) Deep metric learning via facility location. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 5382–5390

  24. Oh Song H, Xiang Y, Jegelka S, Savarese S (2016) Deep metric learning via lifted structured feature embedding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 4004–4012

  25. Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987

  26. Radenović F, Tolias G, Chum O (2018) Fine-tuning cnn image retrieval with no human annotation. IEEE Trans Pattern Anal Mach Intell 41(7):1655–1668

  27. Schroff F, Kalenichenko D, Philbin J (2015) Facenet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 815–823

  28. Shi W, Gong Y, Tao X, Cheng D, Zheng N (2018) Fine-grained image classification using modified DCNNs trained by cascaded softmax and generalized large-margin losses. IEEE Trans Neural Netw Learn Syst 30(3):683–694

  29. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International Conference on Learning Representations

  30. Sohn K (2016) Improved deep metric learning with multi-class n-pair loss objective. In: Advances in neural information processing systems, pp 1857–1865

  31. Su X, Liu Z, Zhang Y, Chen CP (2019) Event-triggered adaptive fuzzy tracking control for uncertain nonlinear systems preceded by unknown Prandtl-Ishlinskii hysteresis. IEEE Trans Cybern 51(6):2979–2992

  32. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826

  33. Ustinova E, Lempitsky V (2016) Learning deep embeddings with histogram loss. In: Advances in neural information processing systems, pp 4170–4178

  34. Wah C, Branson S, Welinder P, Perona P, Belongie S (2011) The caltech-ucsd birds-200-2011 dataset

  35. Wang H, Wang Y, Zhou Z, Ji X, Gong D, Zhou J, Li Z, Liu W (2018) Cosface: Large margin cosine loss for deep face recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5265–5274

  36. Wang J, Song Y, Leung T, Rosenberg C, Wang J, Philbin J, Chen B, Wu Y (2014) Learning fine-grained image similarity with deep ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1386–1393

  37. Wei L, Zhang S, Gao W, Tian Q (2018) Person transfer gan to bridge domain gap for person re-identification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 79–88

  38. Wei XS, Luo J, Wu J, Zhou ZH (2017) Selective convolutional descriptor aggregation for fine-grained image retrieval. IEEE Trans Image Process 26(6):2868–2881

  39. Wei XS, Wu J, Cui Q (2019) Deep learning for fine-grained image analysis: A survey. arXiv:1907.03069

  40. Xie L, Wang J, Zhang B, Tian Q (2015) Fine-grained image search. IEEE Trans Multimed 17(5):636–647

  41. Xu B, Bu J, Chen C, Cai D, He X, Liu W, Luo J (2011) Efficient manifold ranking for image retrieval. In: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pp 525–534

  42. Yi D, Lei Z, Li S (2014) Deep metric learning for practical person re-identification. ArXiv e-prints

  43. Yuan L, Wang T, Zhang X, Tay FE, Jie Z, Liu W, Feng J (2020) Central similarity quantization for efficient image and video retrieval. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 3083–3092

  44. Yuan X, Yu J, Qin Z, Wan T (2011) A sift-lbp image retrieval model based on bag of features. In: IEEE International conference on image processing, pp 1061–1064

  45. Zeng X, Wang X, Chen K, Zhang Y, Li D (2019) Dividing the neighbors is not enough: adding confusion makes local descriptor stronger. IEEE Access 7:136106–136115

  46. Zeng X, Zhang Y, Wang X, Chen K, Li D, Yang W (2020) Fine-grained image retrieval via piecewise cross entropy loss. Image Vis Comput 93:103820

  47. Zhang S, Yang M, Wang X, Lin Y, Tian Q (2015) Semantic-aware co-indexing for image retrieval. IEEE Trans Pattern Anal Mach Intell 37(12):2573–2587

  48. Zhang X, Zhou F, Lin Y, Zhang S (2016) Embedding label structures for fine-grained feature representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1114–1123

  49. Zheng L, Wang S, Tian Q (2014) Coupled binary embedding for large-scale image retrieval. IEEE Trans Image Process 23(8):3368–3380

  50. Zheng L, Shen L, Tian L, Wang S, Wang J, Tian Q (2015) Scalable person re-identification: a benchmark. In: Proceedings of the IEEE international conference on computer vision, pp 1116–1124

  51. Zheng X, Ji R, Sun X, Wu Y, Huang F, Yang Y (2018) Centralized ranking loss with weakly supervised localization for fine-grained object retrieval. In: IJCAI, pp 1226–1233

  52. Zheng X, Ji R, Sun X, Zhang B, Wu Y, Huang F (2019) Towards optimal fine grained retrieval via decorrelated centralized loss with normalize-scale layer

  53. Zheng Z, Zheng L, Yang Y (2017) Unlabeled samples generated by gan improve the person re-identification baseline in vitro. In: Proceedings of the IEEE International Conference on Computer Vision, pp 3754–3762

  54. Zhou K, Xiang T (2019) Torchreid: A library for deep learning person re-identification in pytorch. arXiv:1910.10093

  55. Zhou K, Yang Y, Cavallaro A, Xiang T (2019) Omni-scale feature learning for person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 3702–3712

Acknowledgments

This work was supported by the Ph.D. Start-up Fund of Guangdong Polytechnic Normal University (991641258 and 991641231), the Guangzhou Science and Technology Program (105130372030), the National Natural Science Foundation of China (61803090), and the Natural Science Foundation of Guangdong Province (2019A1515012109). We thank Prof. Rongjun Chen for his professional advice on this work, and we thank the associate editor and the reviewers for their time and evaluation, which greatly helped us improve the quality and presentation of the paper.

Corresponding author

Correspondence to Xiaodong Wang.

The code is available at https://github.com/Zengxianxian727/FGOR

Cite this article

Zeng, X., Liu, S., Wang, X. et al. SSCRL: fine-grained object retrieval with switched shifted centralized ranking loss. Appl Intell 53, 336–350 (2023). https://doi.org/10.1007/s10489-022-03287-9
