Markov-Miml: A Markov chain-based multi-instance multi-label learning algorithm

  • Regular Paper
  • Published in Knowledge and Information Systems

Abstract

This paper proposes an efficient and novel Markov chain-based multi-instance multi-label (Markov-Miml) learning algorithm to evaluate the importance of the set of labels associated with objects composed of multiple instances. The algorithm computes a ranking of labels that indicates the importance of each label to an object. Our approach exploits the relationships between the instances and the labels of objects. The rank of a class label for an object depends on (i) the affinity metric between the bag of instances of this object and the bags of instances of the other objects, and (ii) the rank of that class label for similar objects. An object whose bag of instances is highly similar to the bags of instances of other objects with a high rank for a particular class label receives a high rank for this class label. Experimental results on benchmark data show that the proposed algorithm is computationally efficient and effective in label ranking for MIML data. In the comparison, we find that the classification performance of the Markov-Miml algorithm is competitive with that of three popular MIML algorithms based on boosting, support vector machines, and regularization, while the computational time required by the proposed algorithm is less than that of the other three algorithms.
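As a concrete illustration of this ranking scheme, the sketch below mirrors the fixed-point iteration \(\mathbf{P} = (1-\alpha ) \mathbf{Q} \mathbf{P} + \alpha \mathbf{D}\) analyzed in the Appendix. It is only a minimal sketch, not the authors' implementation: the Gaussian (RBF) bag affinity, the default \(\alpha = 0.15\), the column normalizations, and the names bag_affinity and markov_label_ranking are assumptions made here for illustration.

```python
import numpy as np

def bag_affinity(bag_a, bag_b, gamma=1.0):
    """Affinity between two bags of instances (rows are instances).

    Illustrative choice: mean pairwise Gaussian (RBF) similarity; the paper's
    actual affinity metric may differ.
    """
    d2 = ((bag_a[:, None, :] - bag_b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2).mean()

def markov_label_ranking(bags, Y, alpha=0.15, tol=1e-8, max_iter=1000):
    """Iterate P <- (1 - alpha) Q P + alpha D, as in the Appendix.

    bags : list of (n_i, d) arrays, one bag of instances per object
    Y    : (N, L) binary matrix of known label assignments
    Returns an (N, L) matrix of label-ranking scores.
    """
    N = len(bags)
    A = np.array([[bag_affinity(bags[i], bags[j]) for j in range(N)]
                  for i in range(N)])
    np.fill_diagonal(A, 0.0)                 # no self-transitions
    Q = A / A.sum(axis=0, keepdims=True)     # column-stochastic transition matrix
    D = Y.astype(float)
    col = D.sum(axis=0, keepdims=True)
    D = D / np.where(col > 0, col, 1.0)      # columns become probability vectors
    P = D.copy()
    for _ in range(max_iter):
        P_new = (1 - alpha) * Q @ P + alpha * D
        if np.abs(P_new - P).max() < tol:    # stop when the fixed point is reached
            return P_new
        P = P_new
    return P
```

Sorting the row P[i] then yields the label ranking for object i: labeled objects contribute their known labels through D, while objects with similar bags reinforce each other's label ranks through Q.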

Notes

  1. Available at http://lamda.nju.edu.cn/datacode/miml-image-data.htm.

  2. Available at http://lamda.nju.edu.cn/datacode/miml-text-data.htm.

References

  1. Abbasnejad M, Ramachandram D, Mandava R (2011) A survey of the state of the art in learning the kernels. Knowl Inf Syst 31(2):193–221

  2. Andrews S, Tsochantaridis I, Hofmann T (2003) Support vector machines for multiple-instance learning. Adv Neural Inf Process Syst 15:577–584

  3. Boutell M, Luo J, Shen X, Brown C (2004) Learning multi-label scene classification. Pattern Recognit 37(9):1757–1771

  4. Cheung P, Kwok J (2006) A regularization framework for multiple-instance learning. In: Proceedings of the 23rd international conference on machine learning (ICML), pp 193–200

  5. Cui J, Liu H, He J, Li P, Du X, Wang P (2011) Tagclus: a random walk-based method for tag clustering. Knowl Inf Syst 27(2):193–225

  6. Dietterich T, Lathrop R, Lozano-Pérez T (1997) Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 89(1–2):31–71

  7. Elisseeff A, Weston J (2002) Kernel methods for multi-labelled classification and categorical regression problems. Adv Neural Inf Process Syst 14:681–687

  8. Foulds J, Frank E (2010) A review of multi-instance learning assumptions. Knowl Eng Rev 25(1):1–25

  9. Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res Logist Q 3(1–2):95–110

  10. Gärtner T, Flach P, Kowalczyk A, Smola A (2002) Multi-instance kernels. In: Proceedings of the 19th international conference on machine learning (ICML), pp 179–186

  11. Haveliwala T (2003) Topic-sensitive pagerank: a context-sensitive ranking algorithm for web search. IEEE Trans Knowl Data Eng 15(4):784–796

  12. Jin R, Wang S, Zhou Z (2009) Learning a distance metric from multi-instance multi-label data. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 896–902

  13. Joachims T (1998) Text categorization with support vector machines: learning with many relevant features. In: Proceedings of the European conference on machine learning (ECML), pp 137–142

  14. Kong X, Yu P (2011) gMLC: a multi-label feature selection framework for graph classification. Knowl Inf Syst 31(2):281–305

  15. Kwok J, Cheung P (2007) Marginalized multi-instance kernels. In: Proceedings of the international joint conference on artificial intelligence (IJCAI), pp 901–906

  16. Li Y, Ji S, Kumar S, Ye J, Zhou Z (2009) Drosophila gene expression pattern annotation through multi-instance multi-label learning. IEEE/ACM Trans Comput Biol Bioinform 99:1–1

  17. Broomhead D, Lowe D (1988) Multivariable functional interpolation and adaptive networks. Complex Syst 2:321–355

  18. Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. Adv Neural Inf Process Syst 10:570–576

  19. Maron O, Ratan A (1998) Multiple-instance learning for natural scene classification. In: Proceedings of the 15th international conference on machine learning (ICML), pp 341–349

  20. McCallum A (1999) Multi-label text classification with a mixture model trained by EM. In: Working notes of the AAAI workshop on text learning, pp 1–7

  21. Ray S, Craven M (2005) Supervised versus multiple instance learning: an empirical comparison. In: Proceedings of the 22nd international conference on machine learning (ICML), pp 697–704

  22. Ross SM (2003) Introduction to probability models, 8th edn. Academic Press, New York

  23. Salton G (1991) Developments in automatic text retrieval. Science 253:974–980

  24. Schapire R, Singer Y (2000) BoosTexter: a boosting-based system for text categorization. Mach Learn 39(2):135–168

  25. Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge

  26. Thabtah F, Cowling P, Peng Y (2004) MMAC: a new multi-class, multi-label associative classification approach. In: Proceedings of the 4th IEEE international conference on data mining (ICDM), pp 217–224

  27. Thabtah F, Cowling P, Peng Y (2006) Multiple labels associative classification. Knowl Inf Syst 9(1):109–129

  28. Tong H, Faloutsos C, Pan J (2008) Random walk with restart: fast solutions and applications. Knowl Inf Syst 14(3):327–346

  29. Tsoumakas G, Katakis I (2007) Multi-label classification: an overview. Int J Data Warehous Min 3(3):1–13

  30. Wang D, Li J, Zhang B (2006) Multiple-instance learning via random walk. In: Proceedings of the European conference on machine learning (ECML), pp 473–484

  31. Xu X, Frank E (2004) Logistic regression and boosting for labeled bags of instances. In: Proceedings of the Pacific-Asia conference on knowledge discovery and data mining (PAKDD), pp 272–281

  32. Zha Z, Hua X, Mei T, Wang J, Qi G, Wang Z (2008) Joint multi-label multi-instance learning for image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 1–8

  33. Zhang M (2010) A k-nearest neighbor based multi-instance multi-label learning algorithm. In: Proceedings of the 22nd IEEE international conference on tools with artificial intelligence (ICTAI), vol 2, pp 207–212

  34. Zhang M, Zhou Z (2007) ML-kNN: a lazy learning approach to multi-label learning. Pattern Recognit 40(7):2038–2048

  35. Zhang M, Zhou Z (2008) M3MIML: a maximum margin method for multi-instance multi-label learning. In: Proceedings of the 8th IEEE international conference on data mining (ICDM), pp 688–697

  36. Zhou Z (2004) Multi-instance learning: a survey. Technical report, AI Lab, Department of Computer Science and Technology, Nanjing University

  37. Zhou Z, Zhang M (2007) Multi-instance multi-label learning with application to scene classification. Adv Neural Inf Process Syst 19:1609–1616

  38. Zhou Z, Zhang M (2007) Solving multi-instance problems with classifier ensemble based on constructive clustering. Knowl Inf Syst 11(2):155–170

  39. Zhou Z, Zhang M, Huang S, Li Y (2012) Multi-instance multi-label learning. Artif Intell 176(1):2291–2320

Acknowledgments

Michael K. Ng would like to thank Professor Zhi-Hua Zhou for introducing and discussing the MIML problem with him during his visit to Nanjing University, and for providing suggestions and comments on the first version of this manuscript. M. Ng’s research is supported in part by the Centre for Mathematical Imaging and Vision, HKRGC grant no. 201812 and HKBU FRG grant no. FRG2/11-12/127. Y. Ye’s research is supported in part by NSFC under Grant no. 61272538 and the Shenzhen Science and Technology Program under Grant no. CXB201005250024A.

Author information

Corresponding author

Correspondence to Yunming Ye.

Appendix

Proof of Theorem 1:

By the Perron–Frobenius theorem [22], we know that 1 is the spectral radius of \(\mathbf{Q},\) that is, 1 is the maximal eigenvalue of \(\mathbf{Q}\) in magnitude. The spectral radius of \((1-\alpha )\mathbf{Q}\) is therefore equal to \(1-\alpha < 1,\) which implies that the matrix \(I - (1-\alpha ) \mathbf{Q}\) is invertible. Thus, \(\mathbf{P} = (I- (1-\alpha ) \mathbf{Q})^{-1} \alpha \mathbf{D}\) is well defined, and it is the unique matrix satisfying \(\mathbf{P} = (1-\alpha ) \mathbf{Q} \mathbf{P} + \alpha \mathbf{D}.\) By using the series expansion of \((I- (1-\alpha ) \mathbf{Q})^{-1},\) namely

$$\begin{aligned} \sum _{k=0}^{\infty } (1-\alpha )^k \mathbf{Q}^k \end{aligned}$$

and the fact that the entries of \(\mathbf{Q}\) are nonnegative, we see that the entries of \(\mathbf{P}\) must be nonnegative. We note that when \(\mathbf{Q}\) is irreducible, there exists a positive integer \(k^{\prime }\) such that the entries of \(\mathbf{Q}^{k^{\prime }} \mathbf{D}\) are strictly positive. Because \(\sum _{k=0}^{\infty } (1-\alpha )^k \alpha = 1\) and each of the \(N\) columns of \(\mathbf{Q}^k \mathbf{D}\) is a probability distribution vector, the result follows.

Note that \(\mathbf{P}_t - \mathbf{P} = ((1-\alpha ) \mathbf{Q}) ( \mathbf{P}_{t-1} - \mathbf{P} ),\) and the spectral radius of \((1-\alpha )\mathbf{Q}\) is less than 1. Hence, the sequence \(\{\mathbf{P}_t\}\) converges to \(\mathbf{P}\) for any initial matrix \(\mathbf{P}_0\) satisfying the required property in the theorem. \(\square \)
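The proof can also be checked numerically: the snippet below verifies, for a small random example, that the iteration \(\mathbf{P}_t = (1-\alpha ) \mathbf{Q} \mathbf{P}_{t-1} + \alpha \mathbf{D}\) converges to the closed-form fixed point and that the columns of \(\mathbf{P}\) remain probability distribution vectors. The sizes \(N=6\) and \(L=3,\) the value \(\alpha =0.2,\) and the randomly generated \(\mathbf{Q}\) and \(\mathbf{D}\) are assumptions for this sanity check; the code is not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, alpha = 6, 3, 0.2

# Random column-stochastic Q (nonnegative, columns sum to 1) and a D whose
# columns are probability distribution vectors, as required by the theorem.
Q = rng.random((N, N))
Q /= Q.sum(axis=0, keepdims=True)
D = rng.random((N, L))
D /= D.sum(axis=0, keepdims=True)

# Closed-form fixed point from the proof: P = (I - (1 - alpha) Q)^{-1} alpha D
P_closed = np.linalg.solve(np.eye(N) - (1 - alpha) * Q, alpha * D)

# Iteration P_t = (1 - alpha) Q P_{t-1} + alpha D, started from P_0 = D
P = D.copy()
for _ in range(500):
    P = (1 - alpha) * Q @ P + alpha * D

print(np.allclose(P, P_closed))          # True: the iteration reaches the fixed point
print(np.allclose(P.sum(axis=0), 1.0))   # True: columns stay probability vectors
```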

About this article

Cite this article

Wu, Q., Ng, M.K. & Ye, Y. Markov-Miml: A Markov chain-based multi-instance multi-label learning algorithm. Knowl Inf Syst 37, 83–104 (2013). https://doi.org/10.1007/s10115-012-0567-9
