Ranking-based triplet loss function with intra-class mean and variance for fine-grained classification tasks

  • Methodologies and Application
Soft Computing

Abstract

This paper proposes a deep ranking model for triplet selection that efficiently learns a similarity metric from top-ranked images. A modified distance criterion leverages intra-category variance in the metric learning of a triplet network by learning the local sample structure. A multicolumn fusion architecture captures different levels of variance; incorporating this variance into the loss function strengthens the loss and better optimizes the objective of the triplet network, which enables a fine-grained classification strategy. State-of-the-art techniques address this issue with group-sensitive triplet sampling, but at the cost of additional group-sampling computation. Experiments are conducted on a variety of benchmark datasets, including ModelNet40, PatternNet, and In-Shop Clothing. The main purpose of these experiments is to verify whether the triplet learning technique can be applied to different kinds of data. Results demonstrate that the proposed approach achieves superior results in most cases. These results can be further improved with dataset-specific parameter tuning and ensembling techniques where applicable.
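The abstract does not state the loss function itself, so the following is only a minimal sketch of the general idea: a standard triplet margin loss combined with an intra-class variance term computed around each class mean. The function names, the `margin` and `weight` values, and the choice to add the variance as a simple weighted penalty are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on squared Euclidean distances."""
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    # A triplet contributes only when the negative is not at least
    # `margin` farther from the anchor than the positive.
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())


def intra_class_variance(embeddings, labels):
    """Mean squared distance of each embedding to its own class mean."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = embeddings[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return float(total / len(embeddings))


def combined_loss(anchor, positive, negative, embeddings, labels,
                  margin=0.2, weight=0.1):
    """Hypothetical combination: triplet loss plus a weighted variance term."""
    return (triplet_loss(anchor, positive, negative, margin)
            + weight * intra_class_variance(embeddings, labels))
```

Here `embeddings` and `labels` would typically be the outputs and class labels of one mini-batch, while `anchor`, `positive`, and `negative` are the rows selected as triplets from that batch. How the paper selects triplets from ranked images and how it weights the variance across the columns of its fusion architecture is not recoverable from this preview.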

Author information

Corresponding author

Correspondence to J. Bhattacharya.

Ethics declarations

Human participants or animals

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bhattacharya, J., Sharma, R.K. Ranking-based triplet loss function with intra-class mean and variance for fine-grained classification tasks. Soft Comput 24, 15519–15528 (2020). https://doi.org/10.1007/s00500-020-04880-1

  • DOI: https://doi.org/10.1007/s00500-020-04880-1
