A bit-level text compression scheme based on the ACW algorithm

International Journal of Automation and Computing

Abstract

This paper describes and evaluates a new bit-level, lossless, adaptive, and asymmetric data compression scheme based on the adaptive character wordlength (ACW(n)) algorithm. The proposed scheme enhances the compression ratio of the ACW(n) algorithm by dividing the binary sequence into a number of subsequences (s), each satisfying the condition that the number of decimal values (d) of its n-bit characters is equal to or less than 256. The new scheme is therefore referred to as ACW(n, s), where n is the adaptive character wordlength and s is the number of subsequences. The scheme was used to compress a number of text files from standard corpora. The results demonstrate that the ACW(n, s) scheme achieves a higher compression ratio than many widely used compression algorithms and performs competitively with state-of-the-art compression tools.
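The subsequence condition can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how subsequence boundaries are chosen, so the greedy rule below, the zero-padding of the final character, the reading of d as the count of distinct values, and the names `split_into_subsequences` and `max_distinct` are all assumptions made for illustration.

```python
def split_into_subsequences(bits: str, n: int, max_distinct: int = 256) -> list[list[int]]:
    """Split a bit string into runs of n-bit characters (as decimal values)
    such that each run holds at most max_distinct distinct values.

    Assumption: boundaries are chosen greedily; the paper may do otherwise.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Assumption: pad the tail with zeros so the length is a multiple of n.
    padded = bits + "0" * (-len(bits) % n)
    values = [int(padded[i:i + n], 2) for i in range(0, len(padded), n)]

    subsequences: list[list[int]] = []
    current: list[int] = []
    seen: set[int] = set()
    for value in values:
        # Start a new subsequence if this value would exceed the limit.
        if value not in seen and len(seen) == max_distinct:
            subsequences.append(current)
            current, seen = [], set()
        seen.add(value)
        current.append(value)
    if current:
        subsequences.append(current)
    return subsequences


# Small demonstration with a limit of 2 instead of 256 so the split is visible.
parts = split_into_subsequences("00011011" * 2, n=2, max_distinct=2)
print(len(parts), parts)
```

In this sketch, s is simply `len(parts)`. With the default limit of 256, any choice of n up to 8 yields a single subsequence, since an n-bit character can take at most 2^n values.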

Author information

Corresponding author

Correspondence to Hussein Al-Bahadili.

Additional information

Hussein Al-Bahadili received the B.Sc. degree in engineering from the University of Baghdad, Iraq, in 1986, and the M.Sc. and Ph.D. degrees in engineering from the University of London, UK, in 1988 and 1991, respectively. He is currently an associate professor at the Arab Academy for Banking and Financial Sciences (AABFS). He is a visiting researcher at the Wireless Networks and Communications Centre (WNCC) at the University of Brunel, UK, and at the Centre of Osmosis Research and Applications (CORA), University of Surrey, UK.

His research interests include parallel and distributed computing, wireless communications, computer networks, cryptography and network security, data compression, image processing, and artificial intelligence and expert systems.

Shakir M. Hussain received the B.A. degree in statistics from the University of Al-Mustansiriyah, Iraq, in 1976 and the M.Sc. degree in computing and information science from Oklahoma State University, USA, in 1984. In 1997, he received the Ph.D. degree in computer science from the University of Technology, Iraq. From 1997 to 2008, he was a faculty member at Applied Science University, Jordan. He is currently the head of the Computer Science Department at Petra University, Jordan. He is a member of ACM.

His research interests include block ciphers, key generation, authentication, and data compression.

About this article

Cite this article

Al-Bahadili, H., Hussain, S.M. A bit-level text compression scheme based on the ACW algorithm. Int. J. Autom. Comput. 7, 123–131 (2010). https://doi.org/10.1007/s11633-010-0123-6

  • DOI: https://doi.org/10.1007/s11633-010-0123-6