Improving Evolved Alphabet Using Tabu Set

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7208)

Abstract

Data compression is very important today, and it will be even more important in the future. Textual data use only a limited alphabet, that is, a limited set of symbols (letters, numbers, diacritics, dots, spaces, etc.). In most languages, letters are joined into syllables and words. Compression over syllables and compression over words both have pros and cons, and neither is the best for every file. This paper describes a variant of an algorithm for evolving an alphabet from characters and 2-grams that is optimal for the text file being compressed. The efficiency of the new variant is tested on three compression algorithms, and a new compression algorithm based on LZ77 is also used with this new approach.

This work was supported by the Grant Agency of the Czech Republic, under the grant no. P202/11/P142.
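
The idea of an alphabet built from characters and 2-grams can be illustrated with a small sketch. This page does not give the paper's genetic algorithm or its tabu rule, so everything below is an assumption made for illustration: the function names `tokenize`, `entropy_bits` and `improve_alphabet` are hypothetical, the search is a plain greedy loop rather than an evolutionary one, and the cost is a zeroth-order entropy estimate that ignores the space needed to store the alphabet itself.

```python
import math
from collections import Counter


def tokenize(text, bigrams):
    """Greedy left-to-right parse: emit a 2-gram if it is in the
    alphabet, otherwise emit a single character."""
    tokens = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in bigrams:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def entropy_bits(tokens):
    """Zeroth-order entropy estimate of the token stream, in bits.
    Assumption: used here as a stand-in for real compressed size."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c * math.log2(c / n) for c in counts.values())


def improve_alphabet(text, max_steps=10):
    """Toy greedy search (not the paper's method): add a 2-gram when it
    lowers the estimated size; 2-grams that fail go into a tabu set and
    are never retried."""
    candidates = {text[i:i + 2] for i in range(len(text) - 1)}
    chosen, tabu = set(), set()
    best = entropy_bits(tokenize(text, chosen))
    for _ in range(max_steps):
        improved = False
        for bigram in sorted(candidates - chosen - tabu):
            cost = entropy_bits(tokenize(text, chosen | {bigram}))
            if cost < best:
                chosen.add(bigram)
                best = cost
                improved = True
                break
            tabu.add(bigram)  # rejected once, skipped from now on
        if not improved:
            break
    return chosen, best


if __name__ == "__main__":
    alphabet, bits = improve_alphabet("abababab")
    print(alphabet, bits)
```

The tabu set here only prevents repeated evaluation of 2-grams that have already been rejected; a real implementation would also have to account for the cost of transmitting the chosen alphabet to the decoder.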

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Platos, J., Kromer, P. (2012). Improving Evolved Alphabet Using Tabu Set. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_59

  • DOI: https://doi.org/10.1007/978-3-642-28942-2_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28941-5

  • Online ISBN: 978-3-642-28942-2

  • eBook Packages: Computer Science, Computer Science (R0)
