Abstract
An information entropy graph represents, as entropy values, the probability of each piece of information occurring in a dataset. Well-known file types exhibit distinct entropy graph characteristics and can therefore be detected and differentiated by them. This paper proposes a method that detects damaged files using information entropy graphs. The method extends conventional approaches, which rely on entropy values alone, so that file types with the same entropy value can still be differentiated. In experiments, the information entropy graphs of well-known file types showed patterns that are significant for analysis and detection. In addition, when the header, footer, or body region of a file was damaged, the graph pattern remained similar even though the entropy values differed. The proposed method also enables quantitative comparison of damaged files with their original versions through graph pattern similarity tests.
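The idea of an entropy graph and a graph pattern similarity test can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the window size, the windowing scheme, or the similarity measure, so the 256-byte non-overlapping windows and the Pearson correlation used here are assumptions, and the function names `shannon_entropy`, `entropy_graph`, and `pattern_similarity` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    # sum of p * log2(1/p) over the byte values that occur in the block
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_graph(data: bytes, window: int = 256) -> List[float]:
    """One entropy value per consecutive, non-overlapping window.

    The window size and the non-overlapping layout are assumptions.
    """
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data), window)
    ]


def pattern_similarity(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two entropy graphs (assumed similarity test).

    Graphs of unequal length are truncated to the shorter one.
    """
    n = min(len(a), len(b))
    if n < 2:
        raise ValueError("need at least two points per graph")
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a flat graph")
    return cov / math.sqrt(var_a * var_b)
```

Calling `entropy_graph` on the bytes of an original file and of a damaged copy, then passing both graphs to `pattern_similarity`, yields a single score in the range -1 to 1. In this sketch, overwriting one region changes the entropy values of the affected windows while the overall shape of the graph, and hence the correlation, can remain high, which is the kind of behavior the abstract describes.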
Acknowledgements
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2016-0-00304) supervised by the IITP (Institute for Information & communications Technology Promotion).
Cite this article
Cho, C., Chung, K. & Won, Y. Detection of damaged files and measurement of similarity to originals using entropy graph characteristics. J Supercomput 74, 6719–6728 (2018). https://doi.org/10.1007/s11227-017-2121-8
DOI: https://doi.org/10.1007/s11227-017-2121-8