Abstract
In this chapter, we report our experiences from attempting to measure the effectiveness of large e-Discovery result sets in the TREC Legal Track campaigns of 2007–2009. For effectiveness measures, we have focused on recall, precision, and F1. We state the estimators that we have used for these measures, and we outline both the rank-based and set-based approaches to sampling that we have taken. We share our experiences with the sampling error in the resulting estimates of absolute performance on individual topics, relative performance on individual topics, mean performance across topics, and relative performance across topics. Finally, we discuss our experiences with assessor error, which we have found often has a larger impact than sampling error.
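As a point of reference, the standard set-based definitions of these three measures can be sketched as follows (a minimal sketch only; the chapter's actual stratified-sampling estimators depend on the sample design and are not reproduced here). Here Rel denotes the set of documents relevant to a topic and Ret the set retrieved:

\[
\mathrm{Recall} = \frac{|\mathrm{Rel} \cap \mathrm{Ret}|}{|\mathrm{Rel}|}, \qquad
\mathrm{Precision} = \frac{|\mathrm{Rel} \cap \mathrm{Ret}|}{|\mathrm{Ret}|}, \qquad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.
\]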
Acknowledgements
We thank Doug Oard, William Webber, Jason Baron, and the two anonymous reviewers for their helpful remarks on drafts of this chapter. We also thank Jason Baron, Doug Oard, Ian Soboroff, and Ellen Voorhees for their support and advice in undertaking the various challenges of measuring effectiveness in the TREC Legal Track, as well as all of the track contributors and participants, without whom the track would not have been possible.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tomlinson, S., Hedin, B. (2011). Measuring Effectiveness in the TREC Legal Track. In: Lupu, M., Mayer, K., Tait, J., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 29. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19231-9_8
DOI: https://doi.org/10.1007/978-3-642-19231-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19230-2
Online ISBN: 978-3-642-19231-9