Testing Component Independence Using Data Compressors

  • Conference paper
Artificial Neural Networks – ICANN 2007 (ICANN 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4669)

Abstract

We propose a new nonparametric test for component independence based on the application of data compressors to ranked data. For a two-component data sample, the idea is to break the sample into two parts and permute one of the components in the second part, while leaving the first part intact. The resulting two samples are then jointly ranked, and a data compressor is applied to the resulting (binary) data string. The components are deemed independent if the string cannot be compressed. This procedure gives a provably valid test against all possible alternatives (that is, the test is distribution-free), provided the data compressor is ideal.

This research was supported by the Swiss NSF grants 200020-107616 and 200021-113364.
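
To make the procedure concrete, below is a minimal sketch of such a test in Python. It is an illustration under assumptions, not the paper's construction: zlib stands in for the (ideal) data compressor, the "joint ranking" is realised by ordering the pooled sample by the gap between the pooled ranks of the two components (a simplification that is mainly sensitive to monotone dependence), and the function name, parameters, and compression-ratio threshold are all hypothetical.

import zlib
import numpy as np

def compression_independence_test(x, y, threshold=0.95, seed=None):
    # Illustrative sketch of the permute-rank-compress idea; zlib stands in
    # for an ideal compressor, and the particular ranking below is an
    # assumed simplification, not taken from the paper.
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) // 2
    x, y = x[:2 * n], y[:2 * n].copy()

    # Keep the first part of the sample intact; in the second part permute
    # the second component, which destroys any dependence between the
    # components there.
    y[n:] = rng.permutation(y[n:])

    # Label 0 = point from the intact part, label 1 = from the permuted part.
    labels = np.concatenate([np.zeros(n, dtype=np.uint8),
                             np.ones(n, dtype=np.uint8)])

    # Pooled ranks of each component; order all 2n points by how far apart
    # their two ranks are.  Under independence this ordering carries no
    # information about the labels, so the label sequence looks like fair
    # coin flips; under (monotone) dependence the intact points have closer
    # ranks, the 0s and 1s separate, and the sequence becomes compressible.
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    order = np.argsort(np.abs(rank_x - rank_y))

    # Pack the 0/1 labels into a genuine binary string and compress it.
    packed = np.packbits(labels[order]).tobytes()
    ratio = len(zlib.compress(packed, 9)) / len(packed)
    return ratio >= threshold  # incompressible -> components deemed independent

For example, with x uniform on [0, 1] and y = x plus small noise, the packed label string should compress well below the threshold for moderately large samples, whereas for independent components it should not. With a practical compressor such as zlib, header overhead makes the sample size and threshold matter, which is one reason the distribution-free guarantee above is stated for an ideal compressor.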

Editor information

Joaquim Marques de Sá, Luís A. Alexandre, Włodzisław Duch, Danilo Mandic

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ryabko, D. (2007). Testing Component Independence Using Data Compressors. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74695-9_83

  • DOI: https://doi.org/10.1007/978-3-540-74695-9_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74693-5

  • Online ISBN: 978-3-540-74695-9

  • eBook Packages: Computer Science, Computer Science (R0)
