short-paper

Machine learning in computer forensics (and the lessons learned from machine learning in computer security)

Authors:

Giorgio Giacinto,

Fabio Roli

AISec '11: Proceedings of the 4th ACM workshop on Security and artificial intelligence

Pages 99 - 104

https://doi.org/10.1145/2046684.2046700

Published: 21 October 2011

Abstract

In this paper, we discuss the role that machine learning can play in computer forensics. We begin by considering the role that machine learning has gained in computer security applications, with the aim of helping the computer forensics community learn from the experience of the computer security community. We then present a brief literature review that illustrates the areas of computer forensics where machine learning techniques have been used so far. Next, we outline the technical requirements that tools for computer security and computer forensics applications should meet, with the goal of showing how machine learning algorithms can be of practical help. We intend this paper to foster applications of machine learning in computer forensics, and we hope that the ideas presented here represent promising directions in the quest for more efficient and effective computer forensics tools.

Published In

AISec '11: Proceedings of the 4th ACM workshop on Security and artificial intelligence

October 2011

124 pages

ISBN:9781450310031

DOI:10.1145/2046684

General Chair: Yan Chen (Northwestern University, USA)

Program Chairs: Alvaro A. Cárdenas (Fujitsu Laboratories of America, USA), Rachel Greenstadt (Drexel University, USA), Ben Rubinstein (Microsoft Research, USA)

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CCS'11

Sponsor:

SIGSAC

CCS'11: the ACM Conference on Computer and Communications Security

October 21, 2011

Chicago, Illinois, USA

Acceptance Rates

Overall Acceptance Rate 94 of 231 submissions, 41%
