ABSTRACT
File carving is a data recovery technique used in many digital forensics investigations, but it has limitations. JPEG files in particular are difficult to recover when fragmented, because they consist almost entirely of large blobs of highly compressed, entropy-coded data with no clearly discernible structure.
This paper describes an approach that leverages two observations about many JPEG files in practice. First, the Huffman tables used to decode a large proportion of the entropy-coded data often do not use all possible code values at their longest code length, offering possibilities to detect errors when invalid codes are encountered. Second, after translating Huffman codes to symbols, the next step in decoding involves filling quantization arrays with exactly 64 values, offering another possibility to detect errors when an overflow is encountered.
This paper presents an algorithm that validates the entropy-coded data using these two observations and finds that the odds of detecting fragmentation points are quite high, especially through invalid Huffman codes. The algorithm works with the example Huffman tables provided by the JPEG standard, which many digital cameras use, and also with many optimized Huffman tables generated by specialized applications.
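The overflow check from the second observation might be sketched as follows. This is an assumption-laden illustration rather than the paper's algorithm: it assumes baseline sequential JPEG, takes the AC run/size symbols of one block after Huffman decoding as plain integers, and uses a hypothetical function name. Each symbol carries a zero run in its high nibble and a coefficient size in its low nibble; 0x00 is end-of-block and 0xF0 stands for sixteen zeros.

```python
def ac_symbols_fit_block(ac_symbols):
    """Return True if the run/size symbols stay within one 64-value block.

    Sketch under the assumption of baseline sequential JPEG. The DC
    coefficient occupies the first of the 64 positions, so at most 63
    positions remain for AC coefficients and zero runs.
    """
    filled = 1  # the DC coefficient
    for rs in ac_symbols:
        run, size = rs >> 4, rs & 0x0F
        if size == 0:
            if rs == 0x00:      # EOB: the rest of the block is zero
                return True
            if rs == 0xF0:      # ZRL: a run of sixteen zeros
                filled += 16
            else:
                # Not defined for baseline JPEG; treated as an error here.
                return False
        else:
            filled += run + 1   # `run` zeros followed by one coefficient
        if filled > 64:
            return False        # overflow: likely corruption or fragmentation
        if filled == 64:
            return True         # block is exactly full
    return True                 # symbols ran out before the block ended


print(ac_symbols_fit_block([0x00]))
print(ac_symbols_fit_block([0xF0, 0xF0, 0xF0, 0xE1]))
print(ac_symbols_fit_block([0xF0, 0xF0, 0xF0, 0xF1]))
```

In the last call, three sixteen-zero runs plus a run of fifteen zeros and a coefficient would need 65 positions, so the block is rejected. A fuller validator would combine this check with Huffman code validation while walking the bit stream.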