Abstract
Open set recognition (OSR) aims to simultaneously identify known classes and reject unknown classes. However, existing research on open set recognition is usually based on single-modal data, and single-modal perception is susceptible to external interference, which may cause incorrect recognition. Multi-modal perception can improve OSR performance thanks to the complementarity between different modalities. We therefore propose a new multi-modal open set recognition (MMOSR) method. The MMOSR network is constructed with joint metric learning in the logit space, which avoids the feature-representation gap between modalities and allows the decision boundaries to be estimated effectively. Moreover, an entropy-based adaptive weight fusion method is developed to combine the multi-modal perception information: the weight of each modality is determined automatically from the entropy of its output in the logit space, so a larger entropy leads to a smaller weight for the corresponding modality. This effectively suppresses the influence of disturbance. Scaling the fused logits by the single-modal relative reachability further enhances the ability to detect unknowns. Experiments show that our method achieves more robust open set recognition performance with multi-modal input than competing methods.
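The entropy-based adaptive weighting described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the inverse-entropy weighting rule and all function names are assumptions chosen to show the idea that a flatter (higher-entropy) per-modality prediction receives a smaller fusion weight.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a logit vector
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy_weighted_fusion(logits_per_modality):
    """Fuse per-modality logit vectors with weights that shrink as the
    modality's predictive entropy grows: a disturbed modality tends to
    produce a flatter distribution, hence higher entropy, hence a
    smaller weight in the fused logits."""
    entropies = []
    for z in logits_per_modality:
        p = softmax(z)
        entropies.append(-np.sum(p * np.log(p + 1e-12)))
    entropies = np.array(entropies)
    # Inverse-entropy weighting, normalized to sum to 1 (one plausible
    # realization; the paper's exact weight mapping may differ).
    inv = 1.0 / (entropies + 1e-12)
    weights = inv / inv.sum()
    fused = sum(w * z for w, z in zip(weights, logits_per_modality))
    return fused, weights

# Example: a confident visual modality vs. a noisy, near-uniform
# haptic modality (hypothetical logits for a 3-class problem).
visual = np.array([4.0, 0.5, 0.2])
haptic = np.array([1.1, 1.0, 0.9])
fused, w = entropy_weighted_fusion([visual, haptic])
```

In this example the near-uniform haptic logits yield high entropy and therefore a small weight, so the fused prediction is dominated by the confident visual modality.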
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Fu, Y., Liu, Z., Yang, Y., Xu, L., Lan, H. (2022). Adaptive Open Set Recognition with Multi-modal Joint Metric Learning. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_49
Print ISBN: 978-3-031-18906-7
Online ISBN: 978-3-031-18907-4