ABSTRACT
The Live Nugget Extractor system provides users with a method of efficiently and accurately collecting relevant information for any web query, rather than providing a simple ranked list of documents. The system uses an online learning procedure to infer the relevance of unjudged documents while extracting and ranking information from judged documents. This produces a set of judged and inferred relevance scores for both documents and text fragments, which can be used for test collections, summarization, and other tasks that need high accuracy and large collections with minimal human effort.
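The abstract does not spell out the learning procedure, so the following is only an illustrative sketch of the general idea: judgments arrive one at a time, sentences from relevant documents become candidate nuggets, and unjudged documents are scored by the nuggets they match. The `NuggetModel` class, the word-overlap matching rule, the 0.6 threshold, and the +1/-1 weight update are all assumptions made for this example, not the system's actual method.

```python
import re

def words(text):
    """Lowercased word set used for a crude overlap match (assumed rule)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(nugget, doc, threshold=0.6):
    """Treat a nugget as matching a document if most of its words occur there."""
    n = words(nugget)
    return bool(n) and len(n & words(doc)) / len(n) >= threshold

class NuggetModel:
    """Hypothetical online model: nugget text -> weight."""

    def __init__(self):
        self.weights = {}

    def judge(self, doc, relevant):
        """Update the model from a single human judgment of one document."""
        # Re-weight existing nuggets that this document matches.
        for nugget in self.weights:
            if matches(nugget, doc):
                self.weights[nugget] += 1.0 if relevant else -1.0
        # Sentences of a relevant document become new candidate nuggets.
        if relevant:
            for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
                if sentence and sentence not in self.weights:
                    self.weights[sentence] = 1.0

    def score(self, doc):
        """Inferred relevance of an unjudged document: sum of matched positive weights."""
        return sum(w for n, w in self.weights.items() if w > 0 and matches(n, doc))

    def ranked_nuggets(self):
        """Nuggets ordered from highest to lowest weight."""
        return sorted(self.weights, key=self.weights.get, reverse=True)
```

In this sketch, each call to `judge` plays the role of one assessor decision, while `score` and `ranked_nuggets` give the inferred document scores and the ranked text fragments that the abstract mentions.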
Index Terms
- Live nuggets extractor: a semi-automated system for text extraction and test collection creation