Abstract
Storage drives are huge reservoirs of digital evidence. The acquisition and examination of storage drives for evidentiary artifacts require enormous amounts of manual effort and computing resources, leading to huge case backlogs. This chapter describes a forensic triage methodology that leverages random sampling and unsupervised clustering to provide insights about the regions of interest on a storage drive. The number of sector samples to be evaluated during triage for legitimate inferences to be drawn about drive content is also discussed. Experiments involving storage drives of various capacities illustrate the effectiveness and utility of the extracted patterns for rapid drive triage.
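The triage idea summarized above (draw a random sample of sectors, compute a feature per sector, cluster the features) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the chapter's implementation: the abstract does not name the features, the clustering algorithm or the sample-size rule, so the sketch assumes Cochran's sample-size formula with a finite population correction, Shannon byte entropy as the single per-sector feature, a plain one-dimensional k-means and 512-byte sectors. All function names and parameters are hypothetical.

```python
import math
import random
from collections import Counter

SECTOR_SIZE = 512  # assumed sector size in bytes


def cochran_sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Cochran's sample size with finite population correction (assumed rule)."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def byte_entropy(data):
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in Counter(data).values())


def sample_sectors(image, total_sectors, n, seed=None):
    """Read n distinct, uniformly chosen sectors; return (index, entropy) pairs."""
    rng = random.Random(seed)
    picks = sorted(rng.sample(range(total_sectors), n))  # sorted for sequential seeks
    results = []
    for idx in picks:
        image.seek(idx * SECTOR_SIZE)
        results.append((idx, byte_entropy(image.read(SECTOR_SIZE))))
    return results


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly across the observed range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


if __name__ == "__main__":
    import io

    # Synthetic 1,000-sector "drive": zero-filled first half, varied bytes after.
    blank = bytes(SECTOR_SIZE)
    dense = bytes(range(256)) * (SECTOR_SIZE // 256)
    image = io.BytesIO(blank * 500 + dense * 500)

    n = cochran_sample_size(1000)
    samples = sample_sectors(image, 1000, n, seed=7)
    centroids, labels = kmeans_1d([e for _, e in samples], k=2)
    print(n, centroids)
```

On a real image, the cluster label of each sampled sector index could be mapped back to its drive offset to flag candidate regions of interest. A production version would likely use richer per-sector features and a library clusterer rather than this single-feature toy.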
Copyright information
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Bharadwaj, N., Singh, U., Gupta, G. (2020). Resident Data Pattern Analysis Using Sector Clustering for Storage Drive Forensics. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XVI. DigitalForensics 2020. IFIP Advances in Information and Communication Technology, vol 589. Springer, Cham. https://doi.org/10.1007/978-3-030-56223-6_8
DOI: https://doi.org/10.1007/978-3-030-56223-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-56222-9
Online ISBN: 978-3-030-56223-6
eBook Packages: Computer Science (R0)