Co-training Based Attribute Reduction for Partially Labeled Data

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8818)


Abstract

Rough set theory is an effective supervised learning model for labeled data. In practice, however, problems often involve both labeled and unlabeled data. This paper studies attribute reduction for partially labeled data and proposes a novel semi-supervised attribute reduction algorithm based on co-training, which exploits the unlabeled data to improve the quality of attribute reducts computed from the few labeled data. The algorithm first computes two diverse reducts of the labeled data and uses them to train two base classifiers, which are then co-trained iteratively. In each round, the base classifiers learn from each other on the unlabeled data and enlarge the labeled set, so that reducts of better quality can be computed from the enlarged labeled data and used to build base classifiers of higher performance. Experimental results on UCI data sets show that the proposed algorithm improves the quality of reducts.
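The co-training loop described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: two fixed attribute subsets stand in for the two diverse reducts, and a simple nearest-centroid rule stands in for the base classifiers. All function names and data are hypothetical.

```python
def centroid_classifier(labeled, view):
    """Train a nearest-centroid classifier on one attribute view.

    labeled: list of (feature_tuple, label); view: indices of the attributes
    this classifier may see (standing in for one reduct).
    """
    sums, counts = {}, {}
    for x, y in labeled:
        s = sums.setdefault(y, [0.0] * len(view))
        for j, i in enumerate(view):
            s[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        v = [x[i] for i in view]
        dists = {y: sum((a - b) ** 2 for a, b in zip(v, c))
                 for y, c in centroids.items()}
        label = min(dists, key=dists.get)
        # Higher confidence corresponds to a smaller squared distance.
        return label, -dists[label]
    return predict


def co_train(labeled, unlabeled, view_a, view_b, rounds=3, per_round=1):
    """Iteratively let each view's classifier label its most confident
    unlabeled examples and add them to the shared labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        for view in (view_a, view_b):
            clf = centroid_classifier(labeled, view)
            # Rank unlabeled examples by this classifier's confidence.
            scored = sorted(((clf(x), x) for x in unlabeled),
                            key=lambda t: t[0][1], reverse=True)
            for (y, _), x in scored[:per_round]:
                labeled.append((x, y))
                unlabeled.remove(x)
            if not unlabeled:
                break
    return labeled
```

For example, with two labeled points and two unlabeled points over two views (attributes 0–1 and 2–3), `co_train` returns the enlarged labeled set of four examples. In the paper's setting, the reducts would then be recomputed from this enlarged set each round rather than kept fixed as here.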



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, W., Miao, D., Gao, C., Yue, X. (2014). Co-training Based Attribute Reduction for Partially Labeled Data. In: Miao, D., Pedrycz, W., Ślȩzak, D., Peters, G., Hu, Q., Wang, R. (eds) Rough Sets and Knowledge Technology. RSKT 2014. Lecture Notes in Computer Science (LNAI), vol 8818. Springer, Cham. https://doi.org/10.1007/978-3-319-11740-9_8


  • DOI: https://doi.org/10.1007/978-3-319-11740-9_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11739-3

  • Online ISBN: 978-3-319-11740-9

  • eBook Packages: Computer Science, Computer Science (R0)
