An ICA-Based Multivariate Discretization Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4092)

Abstract

Discretization is an important preprocessing technique in data mining tasks. Univariate Discretization is the most commonly used method. It discretizes one attribute of a dataset at a time, without considering that attribute's interactions with the other attributes. Since the target class attribute is determined by multiple attributes jointly rather than by any single attribute, the result of Univariate Discretization is not optimal. In this paper, a new Multivariate Discretization algorithm is proposed. It uses ICA (Independent Component Analysis) to transform the original attributes into a space of independent attributes, and then applies Univariate Discretization to each attribute in the new space. Data mining tasks can then be conducted on the new discretized dataset with independent attributes. The numerical experiment results show that our method improves discretization performance, especially on non-Gaussian datasets, and that it is competitive with a PCA-based multivariate method.

Supported by an SRG Grant (7001805) from the City University of Hong Kong.
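
The two-stage idea in the abstract (an ICA transform followed by per-attribute discretization) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which ICA estimator or which univariate discretizer the paper uses, so the sketch assumes a basic FastICA iteration with a tanh nonlinearity and plain equal-width binning. The names `whiten`, `sym_decorrelate`, `fast_ica`, `equal_width_bins`, and `ica_discretize` are hypothetical, and the data is assumed to have a full-rank covariance matrix.

```python
import numpy as np

def whiten(X):
    """Center X (n_samples, n_features) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    # Assumption: all eigenvalues are strictly positive (full-rank data).
    return Xc @ (eigvecs / np.sqrt(eigvals))

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W, which makes W orthogonal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W

def fast_ica(X, n_iter=200, tol=1e-6, seed=0):
    """Estimate independent components with a basic symmetric FastICA loop."""
    Z = whiten(X)
    n, d = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.normal(size=(d, d)))
    for _ in range(n_iter):
        Y = Z @ W.T                      # current component estimates
        G = np.tanh(Y)                   # contrast nonlinearity g
        G_prime = 1.0 - G ** 2           # its derivative g'
        # Fixed-point update per row w_i: E[z g(w_i^T z)] - E[g'(w_i^T z)] w_i
        W_new = (G.T @ Z) / n - np.diag(G_prime.mean(axis=0)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when each new row is parallel to the matching old row.
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return Z @ W.T

def equal_width_bins(column, n_bins=5):
    """Stand-in univariate discretizer: equal-width bins labelled 0..n_bins-1."""
    edges = np.linspace(column.min(), column.max(), n_bins + 1)
    return np.digitize(column, edges[1:-1])

def ica_discretize(X, n_bins=5):
    """Transform attributes with ICA, then discretize each component separately."""
    S = fast_ica(X)
    return np.column_stack(
        [equal_width_bins(S[:, j], n_bins) for j in range(S.shape[1])]
    )
```

Calling `ica_discretize(X, n_bins=4)` on an array of shape `(n_samples, n_features)` returns an integer array of the same shape whose columns are bin labels for the estimated components. A supervised discretizer could replace `equal_width_bins` without changing the rest of the pipeline.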


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kang, Y., Wang, S., Liu, X., Lai, H., Wang, H., Miao, B. (2006). An ICA-Based Multivariate Discretization Algorithm. In: Lang, J., Lin, F., Wang, J. (eds) Knowledge Science, Engineering and Management. KSEM 2006. Lecture Notes in Computer Science, vol 4092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811220_47

  • DOI: https://doi.org/10.1007/11811220_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37033-8

  • Online ISBN: 978-3-540-37035-2

  • eBook Packages: Computer Science, Computer Science (R0)