Skip to main content

Abstract

In the last decades digital forensics have become a prominent activity in modern investigations. Indeed, an important data source is often constituted by information contained in devices on which investigational activity is performed. Due to the complexity of this inquiring activity, the digital tools used for investigation constitute a central concern. In this paper a clustering-based text mining technique is introduced for investigational purposes. The proposed methodology is experimentally applied to the publicly available Enron dataset that well fits a plausible forensics analysis context.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. U.S. Department of Justice, Electronic Crime Scene Investigation: A Guide for First Responders, I Edition, NCJ 219941 (2008), http://www.ncjrs.gov/pdffiles1/nij/219941.pdf

  2. Chen, H., Chung, W., Xu, J.J., Wang, G., Qin, Y., Chau, M.: Crime data mining: a general framework and some examples. IEEE Trans. Computer 37, 50–56 (2004)

    Google Scholar 

  3. Seifert, J.W.: Data Mining and Homeland Security: An Overview. CRS Report RL31798 (2007), www.epic.org/privacy/fusion/crs-dataminingrpt.pdf

  4. Mena, J.: Investigative Data Mining for Security and Criminal Detection. Butterworth-Heinemann (2003)

    Google Scholar 

  5. Sullivan, D.: Document warehousing and text mining. John Wiley and Sons, Chichester (2001)

    Google Scholar 

  6. Fan, W., Wallace, L., Rich, S., Zhang, Z.: Tapping the power of text mining. Comm. of the ACM 49, 76–82 (2006)

    Article  Google Scholar 

  7. Decherchi, S., Gastaldo, P., Redi, J., Zunino, R.: Hypermetric k-means clustering for content-based document management. In: First Workshop on Computational Intelligence in Security for Information Systems, Genova (2008)

    Google Scholar 

  8. The Enron Email Dataset, http://www-2.cs.cmu.edu/~enron/

  9. Carrier, B.: File System Forensic Analysis. Addison-Wesley, Reading (2005)

    Google Scholar 

  10. Popp, R., Armour, T., Senator, T., Numrych, K.: Countering terrorism through information technology. Comm. of the ACM 47, 36–43 (2004)

    Article  Google Scholar 

  11. Zanasi, A. (ed.): Text Mining and its Applications to Intelligence, CRM and KM, 2nd edn. WIT Press (2007)

    Google Scholar 

  12. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)

    MATH  Google Scholar 

  13. Baeza-Yates, R., Ribiero-Neto, B.: Modern Information Retrieval. ACM Press, New York (1999)

    Google Scholar 

  14. Salton, G., Wong, A., Yang, L.S.: A vector space model for information retrieval. Journal Amer. Soc. Inform. Sci. 18, 613–620 (1975)

    MATH  Google Scholar 

  15. Linde, Y., Buzo, A., Gray, R.M.: An algorithm for vector quantizer design. IEEE Trans. Commun. COM 28, 84–95 (1980)

    Article  Google Scholar 

  16. Bekkerman, R., McCallum, A., Huang, G.: Automatic Categorization of Email into Folders: Benchmark Experiments on Enron and SRI Corpora. CIIR Technical Report IR-418 (2004), http://www.cs.umass.edu/~ronb/papers/email.pdf

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Decherchi, S., Tacconi, S., Redi, J., Leoncini, A., Sangiacomo, F., Zunino, R. (2009). Text Clustering for Digital Forensics Analysis. In: Herrero, Á., Gastaldo, P., Zunino, R., Corchado, E. (eds) Computational Intelligence in Security for Information Systems. Advances in Intelligent and Soft Computing, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04091-7_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-04091-7_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04090-0

  • Online ISBN: 978-3-642-04091-7

  • eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics