Solving multi-instance problems with classifier ensemble based on constructive clustering

Zhou, Zhi-Hua; Zhang, Min-Ling

doi:10.1007/s10115-006-0029-3

Regular Paper
Published: 10 August 2006

Knowledge and Information Systems
Volume 11, pages 155–170 (2007)

Zhi-Hua Zhou¹ &
Min-Ling Zhang¹

Abstract

In multi-instance learning, the training set is composed of labeled bags, each of which consists of many unlabeled instances; that is, an object is represented by a set of feature vectors instead of only one feature vector. Most current multi-instance learning algorithms work by adapting single-instance learning algorithms to the multi-instance representation, while this paper proposes a new solution that goes the opposite way, that is, adapting the multi-instance representation to single-instance learning algorithms. In detail, the instances of all the bags are first collected together and clustered into d groups. Each bag is then re-represented by d binary features, where the value of the ith feature is set to one if the concerned bag has instances falling into the ith group and zero otherwise. Thus, each bag is represented by one feature vector, so that single-instance classifiers can be used to distinguish different classes of bags. By repeating the above process with different values of d, many classifiers can be generated and then combined into an ensemble for prediction. Experiments show that the proposed method works well on standard as well as generalized multi-instance problems.
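
The procedure described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the clustering algorithm, the single-instance classifier, or the combination rule, so plain k-means, a 1-nearest-neighbour classifier under Hamming distance, and majority voting are assumptions made here, and all function names are hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the group whose centroid is closest to instance p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))

def kmeans(points, d, rng, iters=20):
    """Plain k-means into d groups (assumed clustering method); returns centroids."""
    centroids = rng.sample(points, d)
    for _ in range(iters):
        groups = [[] for _ in range(d)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def rerepresent(bag, centroids):
    """d binary features: feature i is 1 iff some instance of the bag falls into group i."""
    hit = {nearest(inst, centroids) for inst in bag}
    return [1 if i in hit else 0 for i in range(len(centroids))]

def train_member(bags, labels, d, rng):
    """Cluster the instances of all bags into d groups and re-represent every bag."""
    all_instances = [inst for bag in bags for inst in bag]
    centroids = kmeans(all_instances, d, rng)
    feats = [rerepresent(bag, centroids) for bag in bags]
    return centroids, feats, labels

def predict_member(member, bag):
    """Assumed single-instance classifier: 1-NN with Hamming distance."""
    centroids, feats, labels = member
    x = rerepresent(bag, centroids)
    best = min(range(len(feats)),
               key=lambda i: sum(a != b for a, b in zip(feats[i], x)))
    return labels[best]

def train_ensemble(bags, labels, ds, seed=0):
    """One ensemble member per value of d."""
    rng = random.Random(seed)
    return [train_member(bags, labels, d, rng) for d in ds]

def predict_ensemble(ensemble, bag):
    """Combine member predictions by majority vote (assumed combination rule)."""
    votes = Counter(predict_member(m, bag) for m in ensemble)
    return votes.most_common(1)[0][0]
```

Each bag is a list of instances and each instance a list of numbers, so a call such as `train_ensemble(bags, labels, ds=[2, 3, 4])` followed by `predict_ensemble(ensemble, new_bag)` covers the whole pipeline. Every value of d must not exceed the total number of instances, and an odd number of members avoids tied votes in the two-class case.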

Author information

Authors and Affiliations

National Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210093, China
Zhi-Hua Zhou & Min-Ling Zhang

Corresponding author

Correspondence to Zhi-Hua Zhou.

Additional information

Zhi-Hua Zhou is currently Professor in the Department of Computer Science & Technology and head of the LAMDA group at Nanjing University. His main research interests include machine learning, data mining, information retrieval, and pattern recognition. He is associate editor of Knowledge and Information Systems and on the editorial boards of Artificial Intelligence in Medicine, International Journal of Data Warehousing and Mining, Journal of Computer Science & Technology, and Journal of Software. He has also been involved in various conferences.

Min-Ling Zhang received his B.Sc. and M.Sc. degrees in computer science from Nanjing University, China, in 2001 and 2004, respectively. Currently he is a Ph.D. candidate in the Department of Computer Science & Technology at Nanjing University and a member of the LAMDA group. His main research interests include machine learning and data mining, especially in multi-instance learning and multi-label learning.

About this article

Cite this article

Zhou, ZH., Zhang, ML. Solving multi-instance problems with classifier ensemble based on constructive clustering. Knowl Inf Syst 11, 155–170 (2007). https://doi.org/10.1007/s10115-006-0029-3

Received: 25 April 2005
Revised: 17 September 2005
Accepted: 14 January 2006
Published: 10 August 2006
Issue Date: February 2007
DOI: https://doi.org/10.1007/s10115-006-0029-3
