DOI: 10.1145/2467696.2467716
research-article

Redeye: a digital library for forensic document triage

Published: 22 July 2013

ABSTRACT

Forensic document analysis has become an important part of investigating many kinds of crime, from money laundering and fraud to cybercrime and smuggling. The current analyst workflow includes powerful tools, such as Palantir and Analyst's Notebook, for moving from evidence to actionable intelligence, and tools such as Forensic Toolkit (FTK) for finding documents among the millions of files on a hard disk. However, sorting through a collection of seized documents to separate actual evidence from noise is often left to highly labor-intensive manual effort. This paper presents the Redeye Analysis Workbench, a tool that helps analysts move from manually sorting a collection of documents to performing intelligent document triage over a digital library. We discuss the tools and techniques we build upon, describe our tool in depth, and show how it addresses two major use cases we observed analysts performing. Finally, we present a new layout algorithm for radial graphs, which our system uses to visualize clusters of documents.

          • Published in

            JCDL '13: Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
            July 2013
            480 pages
            ISBN: 9781450320771
            DOI: 10.1145/2467696

            Copyright © 2013 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Acceptance Rates

            JCDL '13 Paper Acceptance Rate: 28 of 95 submissions, 29%
            Overall Acceptance Rate: 415 of 1,482 submissions, 28%
