ABSTRACT
Electronic mail poses a number of unusual challenges for the design of information retrieval systems and test collections, including informal expression, conversational structure, variable document granularity (e.g., messages, threads, or longer-term interactions), a naturally occurring integration between free text and structural metadata, and incompletely characterized user needs. This paper reports on initial experiments with a large collection of public mailing lists from the World Wide Web Consortium that will be used for the TREC 2005 Enterprise Search Track. Automatic subject-line threading and removal of duplicated text were found to have little effect in a small pilot study. Those observations motivated development of a question typology and more detailed analysis of collection characteristics; preliminary results for both are reported.
Index Terms
- Indexing emails and email threads for retrieval