
Feature-Correlation Based Multi-view Detection

  • Conference paper
Computational Science and Its Applications – ICCSA 2005 (ICCSA 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3483)


Abstract

A view validation algorithm has been shown to predict whether the views are sufficiently compatible for solving a particular learning task. However, it works only when a natural split of the features exists; when no such split exists, it cannot manufacture one to build the best views. In this paper, we present a general algorithm, CCFP (Correlation and Compatibility based Feature Partitioner), to automate multi-view detection. CCFP first labels the large pool of unlabeled examples using a single-view algorithm. Using the examples that the single-view algorithm labeled with high-confidence predictions, it then calculates the conditional SU (Symmetric Uncertainty) between every pair of features and the IG (Information Gain) of each feature. Based on the estimated SU and IG values, the features are partitioned into two views that are weakly correlated, compatible, and sufficient. Experimental results show that a multi-view learner using views generated by CCFP clearly outperforms learners using views generated by other means.
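The two quantities named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions for discrete variables, IG(X; Y) = H(X) + H(Y) - H(X, Y) and SU(X, Y) = 2 IG(X; Y) / (H(X) + H(Y)), and is not the authors' implementation. The abstract does not spell out how the conditional SU is computed, so `conditional_su` below is a hypothetical reading (SU inside each class, averaged by class frequency), and all function names are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(x, y):
    """IG(X; Y) = H(X) + H(Y) - H(X, Y), i.e. the mutual information."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    denom = entropy(x) + entropy(y)
    # Two constant variables carry no information; treat SU as 0.
    return 0.0 if denom == 0 else 2.0 * information_gain(x, y) / denom


def conditional_su(x, y, labels):
    """Hypothetical class-conditional SU (an assumption, not from the paper):
    SU computed within each class, weighted by the class frequency."""
    n = len(labels)
    total = 0.0
    for cls, count in Counter(labels).items():
        xs = [a for a, lab in zip(x, labels) if lab == cls]
        ys = [b for b, lab in zip(y, labels) if lab == cls]
        total += (count / n) * symmetric_uncertainty(xs, ys)
    return total


# Toy data: two binary features and the labels assigned by a single-view learner.
f1 = [0, 1, 0, 1]
f2 = [0, 1, 1, 0]
predicted = [0, 0, 1, 1]

su_plain = symmetric_uncertainty(f1, f2)        # features look unrelated overall
su_given = conditional_su(f1, f2, predicted)    # but are fully dependent within each class
ig_f1 = information_gain(f1, predicted)         # relevance of f1 to the labels
```

On this toy data the unconditional SU of the two features is zero while the class-conditional value is one, which is the kind of dependence a partitioner would want to see before placing two features in different views.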




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, K., Tang, J., Li, J., Wang, K. (2005). Feature-Correlation Based Multi-view Detection. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2005. ICCSA 2005. Lecture Notes in Computer Science, vol 3483. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424925_127


  • DOI: https://doi.org/10.1007/11424925_127

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25863-6

  • Online ISBN: 978-3-540-32309-9

  • eBook Packages: Computer Science, Computer Science (R0)
