Abstract
Data compression and decompression are widely used in modern communication and data transmission. However, decompressing corrupted lossless compressed files remains a challenge. This paper presents a fault-tolerant decompression (FTD) method for corrupted Huffman files that combines source prior information with a heuristic method. We propose to model the source prior with Huffman coding rules and grammar rules. From this prior information, the range of the error bits can be roughly estimated. For error correction, a heuristic algorithm is developed to determine the exact positions of the error bits and correct them. The experimental results demonstrate that the proposed FTD method achieves a correction rate of 96.84% for corrupted Huffman files when the source prior information is accurate. More importantly, the proposed method is a general model that can be applied to decompress various types of lossless compressed files whose original files are natural language texts.
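To make the idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a single flipped bit, takes the candidate error range as a given input (`window`) instead of estimating it from the prior, and replaces the grammar rules with a simple vocabulary check. All function names (`build_code`, `plausibility`, `repair`, and so on) are hypothetical and introduced only for illustration.

```python
import heapq
from collections import Counter


def build_code(text):
    """Build a Huffman code table {symbol: bitstring} from symbol frequencies."""
    heap = [(freq, i, {sym: ""})
            for i, (sym, freq) in enumerate(sorted(Counter(text).items()))]
    heapq.heapify(heap)
    tie = len(heap)  # unique tie-breaker so the dicts are never compared
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, code):
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Return (decoded text, True if the stream ends on a codeword boundary)."""
    inverse = {b: s for s, b in code.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return "".join(out), cur == ""


def flip(bits, i):
    """Return a copy of the bit string with bit i inverted."""
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]


def plausibility(decoded, complete, vocab):
    """Assumed stand-in for the source prior: prefer streams that end on a
    codeword boundary, then texts with more known words and fewer unknown ones."""
    words = decoded.split()
    hits = sum(w in vocab for w in words)
    return (complete, hits - (len(words) - hits))


def repair(bits, code, vocab, window):
    """Try every single-bit flip inside `window` and keep the most plausible decode."""
    best = None
    for i in window:
        decoded, complete = decode(flip(bits, i), code)
        candidate = (plausibility(decoded, complete, vocab), decoded, i)
        if best is None or candidate[0] > best[0]:
            best = candidate
    score, decoded, position = best
    return decoded, position, score


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog"
    code = build_code(text)
    corrupted = flip(encode(text, code), 10)  # simulate one channel error
    print(decode(corrupted, code)[0])
    print(repair(corrupted, code, set(text.split()), range(len(corrupted))))
```

The exhaustive single-flip search only illustrates the principle of ranking candidate corrections by how well the decoded text matches the source prior. Ties between equally plausible candidates are resolved by taking the first one found, so the sketch is not guaranteed to return the original text.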




Cite this article
Wang, D., Zhao, X. & Sun, Q. Novel Fault-Tolerant Decompression Method of Corrupted Huffman Files. Wireless Pers Commun 102, 2555–2574 (2018). https://doi.org/10.1007/s11277-018-5277-5
DOI: https://doi.org/10.1007/s11277-018-5277-5