Abstract
Normalized Information Distance (NID) [1] is a general-purpose similarity metric based on the concept of Kolmogorov complexity. We have developed this notion into a valid kernel distance, called the LZ78-based string kernel [2], and have shown that it can be used effectively for a variety of 1D sequence classification tasks [3]. In this paper, we further demonstrate its applicability to 2D images. We report experiments with our technique on two real datasets: (i) a collection of real-life photographs and (ii) a collection of medical diagnostic images from Magnetic Resonance (MR) data. The classification results are compared with those of the original similarity metric (i.e., NID) and several conventional classification algorithms. In all cases, the proposed kernel approach performs as well as or better than the other candidate methods, with lower computational overhead.
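As a rough illustration of the compression-based similarity idea behind NID, the sketch below approximates it with the Normalized Compression Distance, using zlib as a stand-in for the uncomputable Kolmogorov complexity, and shows the incremental phrase parsing that LZ78 is built on. It is a minimal sketch under stated assumptions, not the kernel described in this paper: the function names `compressed_size`, `ncd`, and `lz78_phrases` are hypothetical, and the choice of zlib is an assumption made only for illustration.

```python
import zlib
from typing import List


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, a computable approximation of NID.

    Assumption: zlib stands in for an ideal compressor.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def lz78_phrases(s: bytes) -> List[bytes]:
    """Split s into LZ78 phrases: each phrase is the shortest prefix of the
    remaining input that has not been seen as a phrase before."""
    seen = set()
    phrases = []
    current = b""
    for i in range(len(s)):
        current += s[i:i + 1]
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = b""
    if current:
        # Trailing fragment that duplicates an earlier phrase.
        phrases.append(current)
    return phrases
```

A 2D image would first have to be serialized into a byte sequence (for example, row by row) before either function applies; how that serialization is done is left open here.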
References
Li, M., Chen, X., Ma, B., Vitanyi, P.: The similarity metric. In: Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms, pp. 863–872 (2003)
Li, M., Sleep, R.M.: A LZ78-based string kernel. In: Li, X., Wang, S., Dong, Z.Y. (eds.) ADMA 2005. LNCS (LNAI), vol. 3584, pp. 678–689. Springer, Heidelberg (2005)
Li, M., Sleep, R.M.: A robust approach to sequence classification. In: Proceedings of the 17th IEEE Conference on Tools with Artificial Intelligence, Hong Kong, China (2005)
Li, M., Vitanyi, P.: An Introduction to Kolmogorov Complexity and Its Applications. Springer, Heidelberg (1997)
Teahan, W.J., Harper, D.J.: Using compression-based language models for text categorization. In: Workshop on Language Modeling and Information Retrieval, Carnegie Mellon University, pp. 83–88 (2001)
Benedetto, D., Caglioti, E., Loreto, V.: Language trees and zipping. Physical Review Letters 88 (2000)
Cilibrasi, R., Vitanyi, P.: Clustering by compression. IEEE Transactions on Information Theory 51, 1523–1545 (2005)
Lan, Y., Harvey, R.: Image classification using compression distance. In: Proceedings of the 2nd International Conference on Vision, Video and Graphics, Edinburgh (2005)
Platt, J.: Sequential minimal optimization: A fast algorithm for training support vector machines. Microsoft Research Technical Report MSR-TR-98-14 (1998), available at http://research.microsoft.com/users/jplatt/smo.html
Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory 24, 530–536 (1978)
Cleary, J., Witten, I.: Data compression using adaptive coding and partial string matching. IEEE Transactions on Communications COM-32, 396–402 (1984)
Zhu, Y., Williams, S., Fisher, M., Zwiggelaar, R.: The use of grey-level profiles for detection of extracapsular extension of prostate cancer from MRI. In: Proceedings of Medical Image Understanding and Analysis, pp. 215–218 (2005)
Bangham, A.J., Harvey, R., Ling, P., Aldridge, R.: Morphological scale-space preserving transforms in many dimensions. Journal of Electronic Imaging 5, 283–299 (1996)
Keogh, E., Lonardi, S., Ratanamahatana, C.A.: Toward parameter free data mining. In: Proceedings of the 10th ACM SIGKDD, Seattle, Washington, USA, pp. 206–215 (2004)
Burrows, M., Wheeler, D.J.: A block-sorting lossless data compression algorithm. SRC Research Report 124 (1994)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, M., Zhu, Y. (2006). Image Classification Via LZ78 Based String Kernel: A Comparative Study. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_81
DOI: https://doi.org/10.1007/11731139_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33206-0
Online ISBN: 978-3-540-33207-7
eBook Packages: Computer Science, Computer Science (R0)