ABSTRACT
Forensic document analysis has become an important part of investigating many kinds of crime, from money laundering and fraud to cybercrime and smuggling. Analysts' current workflow includes powerful tools, such as Palantir and Analyst's Notebook, for moving from evidence to actionable intelligence, as well as tools, such as Forensic Toolkit (FTK), for finding documents among the millions of files on a hard disk. However, sorting through collections of seized documents to separate actual evidence from noise is often left to highly labor-intensive manual effort. This paper presents the Redeye Analysis Workbench, a tool that helps analysts move from manually sorting a collection of documents to performing intelligent document triage over a digital library. We discuss the tools and techniques we build upon, describe our tool in depth, and show how it addresses two major use cases we observed analysts performing. Finally, we present a new layout algorithm for radial graphs, which our system uses to visualize clusters of documents.
Index Terms
- Redeye: a digital library for forensic document triage