Abstract
The main aim of this paper is to propose a novel and efficient Markov chain-based multi-instance multi-label (Markov-Miml) learning algorithm to evaluate the importance of a set of labels associated with objects composed of multiple instances. The algorithm computes a ranking of labels that indicates the importance of each label to an object. Our approach exploits the relationships between the instances and labels of objects. The rank of a class label for an object depends on (i) the affinity between the bag of instances of this object and the bags of instances of the other objects, and (ii) the ranks of that class label for similar objects. An object whose bag of instances is highly similar to the bags of instances of other objects with a high rank for a particular class label itself receives a high rank for that label. Experimental results on benchmark data show that the proposed algorithm is computationally efficient and effective in label ranking for MIML data. In our comparison, the classification performance of the Markov-Miml algorithm is competitive with that of three popular MIML algorithms based on boosting, support vector machines, and regularization, while the computational time it requires is less than that required by the other three algorithms.
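The label-propagation idea summarized above corresponds to the fixed-point equation \(\mathbf{P} = (1-\alpha ) \mathbf{Q} \mathbf{P} + \alpha \mathbf{D}\) analyzed in the appendix, where \(\mathbf{Q}\) encodes bag-to-bag affinities and \(\mathbf{D}\) the initial label assignments. The following minimal NumPy sketch is illustrative only: the function name, the toy affinity values, and the choice of restart parameter `alpha` are our assumptions, not the authors' implementation.

```python
import numpy as np

def markov_miml_rank(Q, D, alpha=0.2, tol=1e-10, max_iter=1000):
    """Illustrative sketch (not the authors' code) of the Markov-Miml
    fixed-point iteration P_t = (1 - alpha) * Q @ P_{t-1} + alpha * D.

    Q : (N, N) column-stochastic bag-to-bag affinity matrix
    D : (N, L) matrix whose columns are probability distribution vectors
        encoding the initial label assignments
    """
    P = D.copy()
    for _ in range(max_iter):
        P_next = (1 - alpha) * Q @ P + alpha * D
        if np.max(np.abs(P_next - P)) < tol:
            break
        P = P_next
    return P_next

# Toy example: 3 objects, 2 labels (numbers made up for illustration).
A = np.array([[0.0, 0.7, 0.3],
              [0.7, 0.0, 0.9],
              [0.3, 0.9, 0.0]])   # symmetric raw bag-to-bag affinities
Q = A / A.sum(axis=0)            # column-normalize -> column-stochastic
D = np.array([[0.6, 0.1],
              [0.2, 0.1],
              [0.2, 0.8]])       # each column sums to 1
P = markov_miml_rank(Q, D)       # column l of P ranks the objects for label l
```

Because the update is a contraction with factor \(1-\alpha\), the iteration converges geometrically regardless of the starting point; larger `alpha` keeps the ranking closer to the initial assignments \(\mathbf{D}\).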
Notes
Available at http://lamda.nju.edu.cn/datacode/miml-image-data.htm.
Available at http://lamda.nju.edu.cn/datacode/miml-text-data.htm.
References
Abbasnejad M, Ramachandram D, Mandava R (2011) A survey of the state of the art in learning the kernels. Knowl Inf Syst 31(2):193–221
Andrews S, Tsochantaridis I, Hofmann T (2003) Support vector machines for multiple-instance learning. Adv Neural Inf Process Syst 15:577–584
Boutell M, Luo J, Shen X, Brown C (2004) Learning multi-label scene classification. Pattern Recognit 37(9):1757–1771
Cheung P, Kwok J (2006) A regularization framework for multiple-instance learning. In: Proceedings of the 23rd ICML international conference on machine learning, pp 193–200
Cui J, Liu H, He J, Li P, Du X, Wang P (2011) Tagclus: a random walk-based method for tag clustering. Knowl Inf Syst 27(2):193–225
Dietterich T, Lathrop R, Lozano-Pérez T (1997) Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 89(1–2):31–71
Elisseeff A, Weston J (2002) Kernel methods for multi-labelled classification and categorical regression problems. Adv Neural Inf Process Syst 14:681–687
Foulds J, Frank E (2010) A review of multi-instance learning assumptions. Knowl Eng Rev 25(1):1–25
Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res Logist Q 3(1–2):95–110
Gärtner T, Flach P, Kowalczyk A, Smola A (2002) Multi-instance kernels. In: Proceedings of the 19th ICML international conference on machine learning, pp 179–186
Haveliwala T (2003) Topic-sensitive pagerank: a context-sensitive ranking algorithm for web search. IEEE Trans Knowl Data Eng 15(4):784–796
Jin R, Wang S, Zhou Z (2009) Learning a distance metric from multi-instance multi-label data. In: Proceedings of the IEEE CVPR conference on computer vision and pattern recognition, pp 896–902
Joachims T (1998) Text categorization with support vector machines: learning with many relevant features. In: Proceedings of the ECML European conference on machine learning, pp 137–142
Kong X, Yu P (2011) gmlc: a multi-label feature selection framework for graph classification. Knowl Inf Syst 31(2):281–305
Kwok J, Cheung P (2007) Marginalized multi-instance kernels. In: Proceedings of the international joint conference on artificial intelligence, pp 901–906
Li Y, Ji S, Kumar S, Ye J, Zhou Z (2009) Drosophila gene expression pattern annotation through multi-instance multi-label learning. IEEE/ACM Trans Comput Biol Bioinform 99:1–1
Lowe D, Broomhead D (1988) Multivariable functional interpolation and adaptive networks. Complex Syst 2:321–355
Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. Adv Neural Inf Process Syst 10:570–576
Maron O, Ratan A (1998) Multiple-instance learning for natural scene classification. In: Proceedings of the 15th ICML international conference on machine learning, pp 341–349
McCallum A (1999) Multi-label text classification with a mixture model trained by EM. In: Working notes of the AAAI workshop on text learning, pp 1–7
Ray S, Craven M (2005) Supervised versus multiple instance learning: an empirical comparison. In: Proceedings of the 22nd ICML international conference on machine learning, pp 697–704
Ross SM (2003) Introduction to probability models, 8th edn. Academic Press, New York
Salton G (1991) Developments in automatic text retrieval. Science 253:974–980
Schapire R, Singer Y (2000) Boostexter: a boosting-based system for text categorization. Mach Learn 39(2):135–168
Shawe-Taylor J, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge
Thabtah F, Cowling P, Peng Y (2004) Mmac: a new multi-class, multi-label associative classification approach. In: Proceedings of the 4th IEEE ICDM international conference on data mining, pp 217–224
Thabtah F, Cowling P, Peng Y (2006) Multiple labels associative classification. Knowl Inf Syst 9(1):109–129
Tong H, Faloutsos C, Pan J (2008) Random walk with restart: fast solutions and applications. Knowl Inf Syst 14(3):327–346
Tsoumakas G, Katakis I (2007) Multi-label classification: an overview. Int J Data Warehous Min 3(3):1–13
Wang D, Li J, Zhang B (2006) Multiple-instance learning via random walk. In: Proceedings of the ECML European conference on machine learning, pp 473–484
Xu X, Frank E (2004) Logistic regression and boosting for labeled bags of instances. In: Proceedings of the Pacific Asia conference on knowledge discovery and data mining, pp 272–281
Zha Z, Hua X, Mei T, Wang J, Qi G, Wang Z (2008) Joint multi-label multi-instance learning for image classification. In: Proceedings of the IEEE CVPR conference on computer vision and pattern recognition, pp 1–8
Zhang M (2010) A k-nearest neighbor based multi-instance multi-label learning algorithm. In: 22nd IEEE international conference on tools with artificial intelligence (ICTAI, 2010), vol 2, pp 207–212
Zhang M, Zhou Z (2007) Ml-knn: a lazy learning approach to multi-label learning. Pattern Recognit 40(7):2038–2048
Zhang M, Zhou Z (2008) M3miml: a maximum margin method for multi-instance multi-label learning. In: Proceedings of the 8th IEEE ICDM international conference on data mining, pp 688–697
Zhou Z (2004) Multi-instance learning: a survey. Technical report, AI Lab, Department of Computer Science and Technology, Nanjing University
Zhou Z, Zhang M (2007) Multi-instance multi-label learning with application to scene classification. Adv Neural Inf Process Syst 19:1609–1616
Zhou Z, Zhang M (2007) Solving multi-instance problems with classifier ensemble based on constructive clustering. Knowl Inf Syst 11(2):155–170
Zhou Z, Zhang M, Huang S, Li Y (2012) Multi-instance multi-label learning. Artif Intell 176(1):2291–2320
Acknowledgments
Michael K. Ng would like to thank Professor Zhi-Hua Zhou for introducing and discussing the MIML problem with him during his visit to Nanjing University, and for providing suggestions and comments on the first version of this manuscript. M. Ng’s research is supported in part by the Centre for Mathematical Imaging and Vision, HKRGC grant no. 201812, and HKBU FRG grant no. FRG2/11-12/127. Y. Ye’s research is supported in part by NSFC under grant no. 61272538 and the Shenzhen Science and Technology Program under grant no. CXB201005250024A.
Appendix
Proof of Theorem 1:
By the Perron–Frobenius theorem [22], the spectral radius of \(\mathbf{Q}\) is 1; that is, 1 is the maximal eigenvalue of \(\mathbf{Q}\) in magnitude. The spectral radius of \((1-\alpha )\mathbf{Q}\) is therefore \(1-\alpha < 1\), which implies that the matrix \(I - (1-\alpha ) \mathbf{Q}\) is invertible. Thus, \(\mathbf{P} = (I- (1-\alpha ) \mathbf{Q})^{-1} \alpha \mathbf{D}\) is well defined, and it is the unique matrix satisfying \(\mathbf{P} = (1-\alpha ) \mathbf{Q} \mathbf{P} + \alpha \mathbf{D}.\) Using the series expansion
\[
\mathbf{P} = (I- (1-\alpha ) \mathbf{Q})^{-1} \alpha \mathbf{D} = \sum _{k=0}^{\infty } (1-\alpha )^k \alpha \, \mathbf{Q}^k \mathbf{D},
\]
and the fact that the entries of \(\mathbf{Q}\) are nonnegative, the entries of \(\mathbf{P}\) must be nonnegative. We note that when \(\mathbf{Q}\) is irreducible, there exists a positive integer \(k^{\prime }\) such that the entries of \(\mathbf{Q}^{k^{\prime }} \mathbf{D}\) are positive. Because \(\sum _{k=0}^{\infty } (1-\alpha )^k \alpha = 1\) and each column of \(\mathbf{Q}^k \mathbf{D}\) is a probability distribution vector of length \(N\), the result follows.
Note that \(\mathbf{P}_t - \mathbf{P} = ((1-\alpha ) \mathbf{Q}) ( \mathbf{P}_{t-1} - \mathbf{P} ),\) and the spectral radius of \((1-\alpha )\mathbf{Q}\) is less than 1. Hence the sequence \(\{\mathbf{P}_t\}\) converges to \(\mathbf{P}\) for any initial matrix \(\mathbf{P}_0\) satisfying the property required in the theorem. \(\square \)
Cite this article
Wu, Q., Ng, M.K. & Ye, Y. Markov-Miml: A Markov chain-based multi-instance multi-label learning algorithm. Knowl Inf Syst 37, 83–104 (2013). https://doi.org/10.1007/s10115-012-0567-9