Generalized attention-based deep multi-instance learning

Zhao, Lu; Yuan, Liming; Hao, Kun; Wen, Xianbin

doi:10.1007/s00530-022-00992-w

Regular Paper
Published: 07 September 2022

Multimedia Systems, Volume 29, pages 275–287 (2023)

Abstract

Attention-based deep multi-instance learning (MIL) is an effective and interpretable model. Its interpretability stems from the learnability of its attention-based MIL pooling. Its main limitation is that it learns a single instance-level target concept for weighting instances. A further, implicit limitation is its assumption that the bag and instance concepts lie in the same semantic space. In this paper, we relax these constraints as follows: (i) there exist multiple instance concepts; (ii) the bag and instance concepts live in different semantic spaces. Based on these two relaxed constraints, we propose a two-level attention-based MIL pooling that first learns several instance concepts in a low-level semantic space and then captures the bag concept in a high-level semantic space. To effectively capture different types of instance concepts, we also present a new similarity-based loss. The experimental results show that our method achieves performance higher than or comparable to that of state-of-the-art methods on benchmark data sets and surpasses them in both performance and interpretability on a synthetic data set.
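The two-level pooling idea can be illustrated with a minimal NumPy sketch. This is not the implementation from the paper, whose equations are not reproduced on this page. The function names, the parameter shapes, the linear scoring of instances, the tanh projection into the high-level space, and the cosine-similarity penalty standing in for the similarity-based loss are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def two_level_attention_pool(H, W_inst, P, w_bag):
    """Hypothetical two-level attention pooling over one bag.

    H      : (n, d) instance embeddings of the bag
    W_inst : (d, c) one scoring vector per assumed instance concept
    P      : (d, m) assumed projection into the high-level space
    w_bag  : (m,)   scoring vector over concept embeddings
    """
    # Level 1: one attention distribution over the n instances per concept.
    A = softmax(H @ W_inst, axis=0)      # (n, c), each column sums to 1
    C = A.T @ H                          # (c, d) concept-wise embeddings
    # Level 2: project to the high-level space and attend over concepts.
    U = np.tanh(C @ P)                   # (c, m)
    b = softmax(U @ w_bag, axis=0)       # (c,), sums to 1
    z = b @ U                            # (m,) bag representation
    return z, A, b

def similarity_penalty(A):
    """Mean pairwise cosine similarity between the concepts' attention
    vectors (columns of A); assumed stand-in for a similarity-based loss.
    Requires at least two concepts."""
    A_n = A / np.linalg.norm(A, axis=0, keepdims=True)
    S = A_n.T @ A_n                      # (c, c) cosine similarities
    c = S.shape[0]
    return (S.sum() - np.trace(S)) / (c * (c - 1))

# Usage on a random bag of 6 instances with 3 assumed concepts.
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))
z, A, b = two_level_attention_pool(
    H, rng.normal(size=(4, 3)), rng.normal(size=(4, 5)), rng.normal(size=5)
)
print(z.shape, A.shape, b.shape, similarity_penalty(A))
```

In this sketch, the columns of `A` show which instances each assumed concept attends to, and `b` shows how much each concept contributes to the bag representation. Adding the penalty to the classification loss would discourage different concepts from attending to the same instances.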

Notes

Compared with (3), Algorithm 1 (in Lines 2 or 5) computes the weights in a simpler way: the former uses two fully connected layers, whereas the latter uses only one.
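The difference between the two weight computations can be sketched as follows. Neither (3) nor Algorithm 1 is reproduced on this page, so the sketch assumes that the two-layer variant has the common form softmax(wᵀ tanh(V hₖ)) and that the one-layer variant scores each instance with a single linear map. The function and parameter names are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def weights_two_layer(H, V, w):
    """Assumed two-layer form: a hidden layer V with tanh, then a scoring layer w.

    H : (n, d) instance embeddings, V : (d, l), w : (l,)
    """
    return softmax(np.tanh(H @ V) @ w)

def weights_one_layer(H, w):
    """Assumed one-layer form: a single linear scoring layer w of shape (d,)."""
    return softmax(H @ w)

# Both variants return one non-negative weight per instance, summing to 1.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))
a2 = weights_two_layer(H, rng.normal(size=(4, 3)), rng.normal(size=3))
a1 = weights_one_layer(H, rng.normal(size=4))
print(a2.sum(), a1.sum())
```

Under these assumptions, the one-layer variant has fewer parameters (d instead of d·l + l) and omits the hidden nonlinearity.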

Acknowledgements

This work was supported by the Special Foundation for Technology Innovation of Tianjin (21YDTPJC00250), the National Natural Science Foundation of China (61902273), the Open Foundation of Key Laboratory of Computer Vision and Systems of Ministry of Education (TJUT-CVS20170001), the Policy-Making Consulting Project of Tianjin Association for Science and Technology (TJSKXJCZXD202230), the Graduate Scientific Research Innovation Project of Tianjin (2021YJSS088), and the Undergraduate Innovation Projects of Tianjin University of Technology (202110060108, 202210060109).

Author information

Authors and Affiliations

School of Computer and Information Engineering, Tianjin Chengjian University, No. 26 Jinjing Road, Tianjin, 300384, China
Lu Zhao & Kun Hao
School of Computer Science and Engineering, Tianjin University of Technology, No. 391 Bin Shui Xi Dao Road, Tianjin, 300384, China
Liming Yuan & Xianbin Wen

Corresponding authors

Correspondence to Liming Yuan or Kun Hao.

Additional information

Communicated by B. Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, L., Yuan, L., Hao, K. et al. Generalized attention-based deep multi-instance learning. Multimedia Systems 29, 275–287 (2023). https://doi.org/10.1007/s00530-022-00992-w

Received: 02 January 2022
Accepted: 19 August 2022
Published: 07 September 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00530-022-00992-w
