A Selective Classifier for Incomplete Data

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

Classifiers based on feature selection (selective classifiers) can effectively improve the accuracy and efficiency of classification by deleting irrelevant or redundant attributes from a data set. Because processing incomplete data is complex, however, most of them handle only complete data. Yet real data are often incomplete and contain many redundant or irrelevant attributes, so constructing selective classifiers for incomplete data is an important problem. After analyzing the main methods of processing incomplete data for classification, we present RBSR (ReliefF algorithm-Based Selective Robust Bayes Classifier), a selective classifier for incomplete data built on the Robust Bayes Classifier (RBC) and the ReliefF algorithm. The proposed algorithm does not need the assumptions about data sets that previous methods of processing incomplete data in classification require, and it can deal with incomplete data sets with many attributes and instances. Experiments were performed on twelve benchmark incomplete data sets, comparing RBSR with the very effective RBC and several other classifiers for incomplete data. The experimental results show that RBSR not only sharply reduces the number of redundant or irrelevant attributes but also greatly improves the accuracy and stability of classification.
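To make the attribute-selection idea concrete, the following is a minimal sketch of Relief-style attribute weighting on nominal data with missing values. It is not the RBSR algorithm from the paper and does not reproduce the RBC stage. The function names, the toy data, and the zero threshold are all hypothetical. In particular, scoring a comparison that involves a missing value as 1 - 1/(number of observed values) is an assumed simplification; ReliefF itself estimates this term from class-conditional value probabilities.

```python
from typing import List, Optional, Sequence

Value = Optional[str]  # None marks a missing attribute value


def diff(a: Value, b: Value, n_values: int) -> float:
    """Difference between two nominal values of one attribute, in [0, 1]."""
    if a is None or b is None:
        # Assumed simplification: expected mismatch under a uniform prior.
        return 1.0 - 1.0 / n_values
    return 0.0 if a == b else 1.0


def relief_weights(X: Sequence[Sequence[Value]], y: Sequence[str]) -> List[float]:
    """Relief-style weights using one nearest hit and one nearest miss per instance."""
    n, m = len(X), len(X[0])
    # Count the distinct observed (non-missing) values of each attribute.
    n_values = [
        max(1, len({row[j] for row in X if row[j] is not None})) for j in range(m)
    ]

    def distance(i: int, k: int) -> float:
        return sum(diff(X[i][j], X[k][j], n_values[j]) for j in range(m))

    weights = [0.0] * m
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour to compare with
        hit = min(hits, key=lambda k: distance(i, k))
        miss = min(misses, key=lambda k: distance(i, k))
        for j in range(m):
            # Reward attributes that separate classes, penalise those that
            # differ within a class.
            gain = diff(X[i][j], X[miss][j], n_values[j])
            loss = diff(X[i][j], X[hit][j], n_values[j])
            weights[j] += (gain - loss) / n
    return weights


def select_attributes(weights: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Keep the indices of attributes whose weight exceeds the threshold."""
    return [j for j, w in enumerate(weights) if w > threshold]


# Toy data: attribute 0 tracks the class, attribute 1 is unrelated to it.
X = [["a", "x"], ["a", "y"], ["a", None], ["b", "x"], ["b", "y"], [None, "x"]]
y = ["pos", "pos", "pos", "neg", "neg", "neg"]

weights = relief_weights(X, y)
selected = select_attributes(weights)
print(weights, selected)
```

In a selective classifier of the kind the abstract describes, only the selected attributes would then be passed to the downstream classifier. The sketch visits every instance instead of sampling, which keeps the result deterministic but scales quadratically with the number of instances.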

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, J., Huang, H., Tian, F., Tian, S. (2008). A Selective Classifier for Incomplete Data. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science (LNAI), vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_86

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_86

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
