DOI: 10.1145/2046684.2046700
short-paper

Machine learning in computer forensics (and the lessons learned from machine learning in computer security)

Published: 21 October 2011

ABSTRACT

In this paper, we discuss the role that machine learning can play in computer forensics. We begin by considering the role that machine learning has gained in computer security applications, with the aim of helping the computer forensics community learn from the experience of the computer security community. We then present a brief literature review illustrating the areas of computer forensics where machine learning techniques have been used so far. Next, we outline the technical requirements that tools for computer security and computer forensics applications should meet, with the goal of showing how machine learning algorithms can be of practical help. We intend this paper to foster applications of machine learning in computer forensics, and we hope that the ideas presented here point to promising directions in the quest for more efficient and effective computer forensics tools.


Published in

AISec '11: Proceedings of the 4th ACM workshop on Security and artificial intelligence
October 2011
124 pages
ISBN: 9781450310031
DOI: 10.1145/2046684

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall acceptance rate: 94 of 231 submissions, 41%
