Co-training Based Attribute Reduction for Partially Labeled Data

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8818)


Abstract

Rough set theory is an effective supervised learning model for labeled data. In practice, however, problems often involve both labeled and unlabeled data. This paper studies attribute reduction for partially labeled data and proposes a novel semi-supervised attribute reduction algorithm based on co-training, which exploits the unlabeled data to improve the quality of attribute reducts computed from the few labeled data. The algorithm first computes two diverse reducts of the labeled data and uses them to train two base classifiers, which are then co-trained iteratively. In each round, the base classifiers learn from each other on the unlabeled data and enlarge the labeled set, so that reducts of better quality can be computed from the enlarged labeled data and used to build base classifiers of higher performance. Experimental results on UCI data sets show that the proposed algorithm improves the quality of reducts.
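The co-training loop described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: two fixed attribute subsets stand in for the two diverse reducts, and a simple nearest-centroid rule stands in for the base classifiers. All function names and data are hypothetical.

```python
def centroid_classifier(labeled, view):
    """Train a nearest-centroid classifier on one attribute view.

    labeled: list of (feature_tuple, label); view: indices of the attributes
    this classifier may see (standing in for one reduct).
    """
    sums, counts = {}, {}
    for x, y in labeled:
        s = sums.setdefault(y, [0.0] * len(view))
        for j, i in enumerate(view):
            s[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        v = [x[i] for i in view]
        dists = {y: sum((a - b) ** 2 for a, b in zip(v, c))
                 for y, c in centroids.items()}
        label = min(dists, key=dists.get)
        # Higher confidence corresponds to a smaller squared distance.
        return label, -dists[label]
    return predict


def co_train(labeled, unlabeled, view_a, view_b, rounds=3, per_round=1):
    """Iteratively let each view's classifier label its most confident
    unlabeled examples and add them to the shared labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        for view in (view_a, view_b):
            clf = centroid_classifier(labeled, view)
            # Rank unlabeled examples by this classifier's confidence.
            scored = sorted(((clf(x), x) for x in unlabeled),
                            key=lambda t: t[0][1], reverse=True)
            for (y, _), x in scored[:per_round]:
                labeled.append((x, y))
                unlabeled.remove(x)
            if not unlabeled:
                break
    return labeled
```

For example, with two labeled points and two unlabeled points over two views (attributes 0–1 and 2–3), `co_train` returns the enlarged labeled set of four examples. In the paper's setting, the reducts would then be recomputed from this enlarged set each round rather than kept fixed as here.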



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, W., Miao, D., Gao, C., Yue, X. (2014). Co-training Based Attribute Reduction for Partially Labeled Data. In: Miao, D., Pedrycz, W., Ślȩzak, D., Peters, G., Hu, Q., Wang, R. (eds) Rough Sets and Knowledge Technology. RSKT 2014. Lecture Notes in Computer Science (LNAI), vol 8818. Springer, Cham. https://doi.org/10.1007/978-3-319-11740-9_8


  • DOI: https://doi.org/10.1007/978-3-319-11740-9_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11739-3

  • Online ISBN: 978-3-319-11740-9

  • eBook Packages: Computer Science, Computer Science (R0)
