Abstract
This paper proposes a deep ranking model for triplet selection that efficiently learns a similarity metric from top-ranked images. A modified distance criterion leverages intra-category variance in the metric learning of a triplet network by learning a local sample structure. A multicolumn fusion architecture captures different levels of variance, which, when incorporated into the loss function, strengthen it and improve the optimization of the triplet network's objective. This enables a fine-grained classification strategy. State-of-the-art techniques handle intra-category variance with group-sensitive triplet sampling, but at the cost of additional group sampling computations. Experiments are conducted on a variety of benchmark datasets, including ModelNet40, PatternNet, and In-Shop Clothing. The main purpose of these experiments is to verify whether the triplet learning technique can be applied to different kinds of data. Results demonstrate that the proposed method achieves superior results in most cases. These results can be improved further with dataset-specific parameter tuning and ensembling techniques wherever applicable.
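For readers unfamiliar with the objective that the proposed criterion modifies, the following is a minimal sketch of the standard triplet margin loss. It is not the ranking-based, variance-aware loss of this paper. The function names, the use of plain (unsquared) Euclidean distance, and the default margin of 1.0 are assumptions made for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (illustrative, not the paper's variant).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin`; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)  # anchor to same-class sample
    d_neg = euclidean(anchor, negative)  # anchor to other-class sample
    return max(0.0, d_pos - d_neg + margin)


# A well-separated triplet incurs no loss; swapping the roles of the
# positive and negative samples produces a violation.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Because this baseline treats every same-class pair identically, it carries no notion of how spread out a class is, which is the gap that an intra-class mean and variance term is meant to address.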
Ethics declarations
Human participants or animals
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by V. Loia.
Cite this article
Bhattacharya, J., Sharma, R.K. Ranking-based triplet loss function with intra-class mean and variance for fine-grained classification tasks. Soft Comput 24, 15519–15528 (2020). https://doi.org/10.1007/s00500-020-04880-1
DOI: https://doi.org/10.1007/s00500-020-04880-1