Abstract
In a multiple instance learning (MIL) scenario, the outcome annotation is usually reported only at the bag level. Owing to its simplicity and convergence properties, the lazy learning approach, i.e., k-nearest neighbors (kNN), plays a crucial role in predicting bag labels in the MIL domain. Notably, two variations of the kNN algorithm tailored to the MIL framework have been introduced, namely Bayesian-kNN (BkNN) and Citation-kNN (CkNN). These adaptations combine the Hausdorff metric with Bayesian or citation approaches. However, neither BkNN nor CkNN explicitly integrates a feature selection methodology, and when irrelevant and redundant features are present, the model's generalization ability decreases. In the single-instance learning scenario, this limitation of kNN is often overcome by a feature weighting algorithm named Neighborhood Component Feature Selection (NCFS), which finds the optimal degree of influence of each feature. To address this gap in the literature, we introduce the NCFS method into the MIL framework. The proposed methodologies, i.e., NCFS-BkNN, NCFS-CkNN, and NCFS-Bayesian Citation-kNN (NCFS-BCkNN), learn the optimal feature weighting vector by minimizing the regularized leave-one-out error on the training bags. The prediction of unseen bags is then computed by combining the Bayesian and citation approaches based on the minimum optimally weighted Hausdorff distance. Through experiments on various benchmark MIL datasets in the biomedical informatics and affective computing fields, we provide statistical evidence that the proposed methods outperform state-of-the-art MIL algorithms that do not employ any a priori feature weighting strategy.
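For intuition, the two core ingredients named in the abstract — the minimum weighted Hausdorff distance between bags and the citation-based voting of CkNN — can be sketched as follows. This is a simplified illustration, not the authors' implementation: the fixed weight vector `w` merely stands in for the weights NCFS would learn, the helper names (`weighted_min_hausdorff`, `cknn_predict`) and the reference/citer ranks `r`, `c` are our own choices, and the Bayesian component of BCkNN is omitted.

```python
import numpy as np

def weighted_min_hausdorff(bag_a, bag_b, w):
    """Minimal Hausdorff distance between two bags of instances under
    per-feature weights w (bags: (n_instances, n_features) arrays)."""
    diff = bag_a[:, None, :] - bag_b[None, :, :]      # all instance pairs
    d = np.sqrt(((w * diff) ** 2).sum(axis=-1))       # weighted Euclidean
    return d.min()

def cknn_predict(train_bags, train_labels, test_bag, w, r=2, c=2):
    """Citation-kNN style vote: labels of the r nearest 'references' of the
    test bag, plus labels of its 'citers', i.e. training bags that rank the
    test bag among their own c nearest neighbours."""
    n = len(train_bags)
    d_test = np.array([weighted_min_hausdorff(test_bag, b, w) for b in train_bags])
    refs = np.argsort(d_test)[:r]
    citers = []
    for i in range(n):
        # distances from training bag i to every other training bag
        d_i = [weighted_min_hausdorff(train_bags[i], train_bags[j], w)
               for j in range(n) if j != i]
        # bag i cites the test bag if the test bag is within its c nearest
        if d_test[i] <= np.sort(d_i)[min(c, len(d_i)) - 1]:
            citers.append(i)
    votes = list(train_labels[refs]) + [train_labels[i] for i in citers]
    return 1 if sum(votes) > len(votes) / 2 else 0

# Tiny synthetic example: feature 0 is informative, feature 1 is noise,
# and the (hand-set) weight vector w suppresses the noisy feature.
train_bags = [np.array([[1.0, 5.0], [0.1, -3.0]]),   # positive bag
              np.array([[0.9, -2.0]]),               # positive bag
              np.array([[0.0, 4.0]]),                # negative bag
              np.array([[0.05, -1.0]])]              # negative bag
train_labels = np.array([1, 1, 0, 0])
test_bag = np.array([[1.05, 9.0]])
w = np.array([1.0, 0.0])                             # stands in for NCFS-learned weights
pred = cknn_predict(train_bags, train_labels, test_bag, w)
```

With the noisy feature down-weighted, both references and both citers of the test bag are positive, so the vote returns the positive label; this is the effect the paper seeks to obtain automatically by learning `w` through the regularized leave-one-out objective.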
Acknowledgments
Funded by the European Union - NextGenerationEU and by the Ministry of University and Research (MUR), National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 1.5, project “RAISE - Robotics and AI for Socio-economic Empowerment” (ECS00000035). G.T. is part of RAISE Innovation Ecosystem.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Turri, G., Romeo, L. (2024). Neighborhood Component Feature Selection for Multiple Instance Learning Paradigm. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14941. Springer, Cham. https://doi.org/10.1007/978-3-031-70341-6_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70340-9
Online ISBN: 978-3-031-70341-6