Abstract
We describe a novel multi-label classification algorithm for discrete data. A matrix gives the membership value of each discrete value of each attribute for every class. For a test pattern, by looking at the values taken by each attribute, we find the subset of classes to which the pattern belongs. If the number of classes or the number of features is large, the space and time complexity of this algorithm grow accordingly. To mitigate this problem, we carry out feature selection before classification and compare two feature reduction techniques. Using feature reduction, both the classification accuracy and the running time of the algorithm improve. The performance of the algorithm is evaluated on benchmark datasets and compared with the multi-label KNN (ML-KNN) algorithm, and it is found to give good results.
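The membership-matrix idea in the abstract can be sketched as follows. This is a minimal interpretation, not the authors' exact method: the normalisation (conditional frequency of each label given an attribute value), the averaging of scores across attributes, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict

def train(X, Y, n_labels):
    """Build membership[attr][value][label]: fraction of training patterns
    having this value for this attribute that carry this label."""
    counts = defaultdict(lambda: defaultdict(lambda: [0.0] * n_labels))
    totals = defaultdict(lambda: defaultdict(int))
    for x, labels in zip(X, Y):
        for a, v in enumerate(x):
            totals[a][v] += 1
            for l in labels:
                counts[a][v][l] += 1.0
    return {a: {v: [c / totals[a][v] for c in counts[a][v]]
                for v in counts[a]}
            for a in counts}

def predict(membership, x, n_labels, threshold=0.5):
    """Average each label's membership over the attribute values of x,
    then keep the labels whose averaged score clears the threshold."""
    score = [0.0] * n_labels
    for a, v in enumerate(x):
        for l, m in enumerate(membership.get(a, {}).get(v, [0.0] * n_labels)):
            score[l] += m
    return {l for l in range(n_labels) if score[l] / len(x) >= threshold}
```

For example, trained on patterns `[(0, 1), (0, 1), (1, 0)]` with label sets `[{0}, {0, 1}, {1}]`, the sketch predicts `{1}` for the test pattern `(1, 0)`. Feature reduction, as proposed in the paper, would shrink the attribute dimension of this matrix before training.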
References
Akhand, B., Devi, V.S.: Multi label classification of discrete data. In: IEEE-FUZZ, pp. 1–5 (2013)
Balasubramanian, K., Lebanon, G.: The landmark selection method for multiple output prediction. In: Proceedings of the 29th International Conference on Machine Learning, pp. 983–990 (2012)
Bi, W., Kwok, J.T.: Multi-label classification on tree- and DAG-structured hierarchies. In: 28th International Conference on Machine Learning, pp. 17–24 (2011)
Bi, W., Kwok, J.T.: Efficient multi-label classification with many labels. In: Proceedings of the 30th International Conference on Machine Learning, pp. 405–413 (2013)
Boutell, M.R., Luo, J., Shen, X., Brown, C.M.: Learning multi-label scene classification. Pattern Recognition 37(9), 1757–1771 (2004)
Chen, Y.N., Lin, H.T.: Feature-aware label space dimension reduction for multi-label classification. Advances in Neural Information Processing Systems 25, 1538–1546 (2012)
Dembczynski, K., Cheng, W., Hullermeier, E.: Bayes optimal multilabel classification via probabilistic classifier chains. In: 27th International Conference on Machine Learning, pp. 279–286 (2010)
Filippone, M., Masulli, F., Rovetta, S.: Unsupervised gene selection and clustering using simulated annealing. In: Bloch, I., Petrosino, A., Tettamanzi, A.G.B. (eds.) WILF 2005. LNCS (LNAI), vol. 3849, pp. 229–235. Springer, Heidelberg (2006)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of Eugenics 7(2), 179–188 (1936)
Godbole, S., Sarawagi, S.: Discriminative methods for multi-labeled classification. In: Dai, H., Srikant, R., Zhang, C. (eds.) PAKDD 2004. LNCS (LNAI), vol. 3056, pp. 22–30. Springer, Heidelberg (2004)
Hardoon, D.R., Szedmak, S., Shawe-Taylor, J.: Canonical correlation analysis: An overview with application to learning methods. Neural Computation 16(12), 2639–2664 (2004)
Hariharan, B., Zelnik-Manor, L., Vishwanathan, S.V.N., Varma, M.: Large scale max-margin multi-label classification with priors. In: 27th International Conference on Machine Learning, pp. 423–430 (2010)
Hsu, D., Kakade, S.M., Langford, J., Zhang, T.: Multi-label prediction via compressed sensing. Advances in Neural Information Processing Systems 22, 772–780 (2009)
Joiţa, D.: Unsupervised static discretization methods in data mining. Revista MegaByte, vol. 9 (2010)
Jolliffe, I.T. (ed.): Principal Component Analysis. Springer, New York (1986)
Schapire, R.E., Singer, Y.: BoosTexter: a boosting-based system for text categorization. Machine Learning 39(2/3), 135–168 (2000)
Tai, F., Lin, H.T.: Multilabel classification with principal label space transformation. Neural Computation 24(9), 2508–2542 (2012)
Tsoumakas, G., Katakis, I., Vlahavas, I.: Mining multi-label data. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 667–685 (2008)
Tsoumakas, G., Katakis, I., Vlahavas, I.: Random k-labelsets for multilabel classification. IEEE Transactions on Knowledge and Data Engineering 23(7), 1079–1089 (2011)
Xia, X., Yang, X., Li, S., Wu, C., Zhou, L.: Rw.knn: a proposed random walk knn algorithm for multi-label classification. In: Proceedings of the 4th Workshop for Ph.D. Students in Information and Knowledge Management, PIKM 2011, New York, USA, pp. 87–90 (2011)
Zhang, M.L., Zhou, Z.H.: A k-nearest neighbor based algorithm for multi-label classification. In: 2005 IEEE International Conference on Granular Computing, vol. 2, pp. 718–721 (2005)
Zhang, M.L., Zhou, Z.H.: Ml-knn: A lazy learning approach to multi-label learning. Pattern Recognition 40(7), 2038–2048 (2007). http://www.sciencedirect.com/science/article/pii/S0031320307000027
Zhang, Y., Schneider, J.: Multi-label output codes using canonical correlation analysis. In: 14th International Conference on Artificial Intelligence and Statistics, pp. 873–882 (2011)
Zhang, Y., Zhou, Z.H.: Multilabel dimensionality reduction via dependence maximization. ACM Trans. Knowl. Discov. Data 4(3), 14:1–14:21 (2010). http://doi.acm.org/10.1145/1839490.1839495
Zhou, Z.H., Zhang, M.L., Huang, S.J., Li, Y.F.: Multi-instance multi-label learning. Artificial Intelligence 176(1), 2291–2320 (2012)
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Susheela Devi, V., Akhand, B. (2016). Feature Reduction for Multi Label Classification of Discrete Data. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science(), vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_49
DOI: https://doi.org/10.1007/978-3-319-41920-6_49
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6