Abstract
Attention-based deep multi-instance learning (MIL) is an effective and interpretable model. Its interpretability is attributed to its learnable attention-based MIL pooling. Its main limitation is that it learns only a single instance-level target concept for weighting instances. Another implicit issue is the assumption that the bag and instance concepts lie in the same semantic space. In this paper, we relax these two constraints: (i) there exist multiple instance concepts; (ii) the bag and instance concepts live in different semantic spaces. Building on the two relaxed constraints, we propose a two-level attention-based MIL pooling that first learns several instance concepts in a low-level semantic space and subsequently captures the bag concept in a high-level semantic space. To effectively capture different types of instance concepts, we also present a new similarity-based loss. The experimental results show that our method achieves performance higher than or comparable to that of state-of-the-art methods on benchmark data sets and surpasses them in both performance and interpretability on a synthetic data set.
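The two-level pooling idea can be illustrated with a minimal sketch. It is not the formulation used in the paper: the dot-product scoring, the tanh projection into the high-level space, and every name below (`two_level_pooling`, `concept_queries`, `projection`, `bag_query`) are assumptions made for illustration, and the parameters are fixed by hand rather than learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Weighted combination of equal-length vectors."""
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(len(vectors[0]))]

def two_level_pooling(instances, concept_queries, projection, bag_query):
    # Level 1 (low-level space): one attention distribution over the
    # instances per hypothetical instance concept.
    instance_weights = [softmax([dot(q, h) for h in instances])
                        for q in concept_queries]
    concept_embeddings = [weighted_sum(a, instances) for a in instance_weights]
    # Assumed mapping of each concept embedding into a separate high-level space.
    high_level = [[math.tanh(dot(row, c)) for row in projection]
                  for c in concept_embeddings]
    # Level 2 (high-level space): attention over the concept embeddings.
    concept_weights = softmax([dot(bag_query, z) for z in high_level])
    bag_embedding = weighted_sum(concept_weights, high_level)
    return bag_embedding, instance_weights, concept_weights

# Toy bag with three 2-dimensional instances and two instance concepts.
instances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
concept_queries = [[2.0, 0.0], [0.0, 2.0]]
projection = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 2-d -> 3-d
bag_query = [0.5, 0.5, 0.5]

bag_embedding, instance_weights, concept_weights = two_level_pooling(
    instances, concept_queries, projection, bag_query)
```

In this sketch, `instance_weights` holds one weight vector per concept, so each concept can be inspected separately, while `concept_weights` shows how much each concept contributes to the bag embedding.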
Notes
Compared to (3), Algorithm 1 (Lines 2 or 5) computes the weights in a simpler way. Specifically, (3) uses two fully connected layers, whereas Algorithm 1 depends on only one.
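The contrast in this note can be sketched as follows. Neither (3) nor Algorithm 1 is reproduced on this page, so both functions are assumptions: the two-layer scorer follows the tanh attention form commonly used in attention-based MIL, and the one-layer scorer is a plain linear score. The names `two_layer_weights`, `one_layer_weights`, `V`, and `w` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def two_layer_weights(instances, V, w):
    # Assumed two-layer form: hidden layer V with tanh, then output layer w.
    scores = [dot(w, [math.tanh(dot(row, h)) for row in V]) for h in instances]
    return softmax(scores)

def one_layer_weights(instances, w):
    # Assumed one-layer form: a single linear map scores each instance.
    return softmax([dot(w, h) for h in instances])

# Toy bag with two instances; parameters are fixed by hand, not learned.
instances = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, -1.0]

a_two = two_layer_weights(instances, V, w)
a_one = one_layer_weights(instances, w)
```

Both variants return a distribution over the instances in a bag; the one-layer variant drops the hidden projection and its nonlinearity, so it has fewer parameters.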
Acknowledgements
This work was supported by the Special Foundation for Technology Innovation of Tianjin (21YDTPJC00250), the National Natural Science Foundation of China (61902273), the Open Foundation of Key Laboratory of Computer Vision and Systems of Ministry of Education (TJUT-CVS20170001), the Policy-Making Consulting Project of Tianjin Association for Science and Technology (TJSKXJCZXD202230), the Graduate Scientific Research Innovation Project of Tianjin (2021YJSS088), and the Undergraduate Innovation Projects of Tianjin University of Technology (202110060108, 202210060109).
Additional information
Communicated by B. Bao.
About this article
Cite this article
Zhao, L., Yuan, L., Hao, K. et al. Generalized attention-based deep multi-instance learning. Multimedia Systems 29, 275–287 (2023). https://doi.org/10.1007/s00530-022-00992-w
DOI: https://doi.org/10.1007/s00530-022-00992-w