Incomplete Data Classification Based on Multiple Views

  • Conference paper
Web Technologies and Applications (APWeb 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9932)

Abstract

Missing values harm big data analysis. However, in the absence of extra knowledge, exact imputation is rarely possible for many data sets. We therefore have to tolerate missing values and perform data mining directly on incomplete data sets. To achieve high-quality data mining on incomplete data, we propose a classification approach based on multiple views. We use various complete views of the data set to generate base classifiers and then combine their results. Since the number of base classifiers affects both the effectiveness and the efficiency of the classification, we aim to find proper view sets. We prove that the view set selection problem is NP-hard and develop an approximation algorithm with approximation ratio \(\ln|S|+1\), where \(S\) is the feature set of the original data set. Extensive experimental results demonstrate the efficiency and effectiveness of the proposed approaches.
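
The abstract does not spell out the view set selection algorithm, so the following is only an illustrative sketch, not the authors' method. It assumes that a view is a subset of features and that a proper view set must cover every feature in \(S\). Under that assumption the problem resembles set cover, whose classic greedy heuristic is known to achieve a ratio of at most \(\ln n + 1\) for \(n\) elements. The function name `greedy_view_cover` and the toy feature names are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set


def greedy_view_cover(
    features: Set[str], views: Iterable[FrozenSet[str]]
) -> List[FrozenSet[str]]:
    """Classic greedy set cover: pick views until every feature is covered.

    Assumption (not from the paper): a view is a set of feature names and
    the goal is to cover all features with as few views as possible.
    """
    uncovered = set(features)
    candidates = list(views)
    chosen: List[FrozenSet[str]] = []
    while uncovered:
        # Take the view that covers the most still-uncovered features.
        best = max(
            candidates,
            key=lambda v: len(v & uncovered),
            default=frozenset(),
        )
        if not best & uncovered:
            raise ValueError("the candidate views cannot cover every feature")
        chosen.append(best)
        uncovered -= best
    return chosen


# Toy input: five features and four candidate views (all hypothetical).
features = {"a", "b", "c", "d", "e"}
views = [
    frozenset({"a", "b"}),
    frozenset({"c", "d", "e"}),
    frozenset({"a", "c"}),
    frozenset({"b", "d"}),
]
selected = greedy_view_cover(features, views)
```

In a multi-view classifier of the kind the abstract describes, each selected view would then train one base classifier on the records that are complete for that view. How the paper weights or combines those classifiers is not stated here.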

Notes

  1. http://archive.ics.uci.edu/ml/.

Acknowledgement

This paper was partially supported by National Sci-Tech Support Plan 2015BAH10F01, NSFC grants U1509216, 61472099, and 61133002, and the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province LC2016026.

Author information

Corresponding author

Correspondence to Hongzhi Wang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Sun, M., Wang, H., Meng, F., Li, J., Gao, H. (2016). Incomplete Data Classification Based on Multiple Views. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9932. Springer, Cham. https://doi.org/10.1007/978-3-319-45817-5_19

  • DOI: https://doi.org/10.1007/978-3-319-45817-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45816-8

  • Online ISBN: 978-3-319-45817-5

  • eBook Packages: Computer Science, Computer Science (R0)
