DOI: 10.1145/2600428.2609506

Poster

Assessing the reliability and reusability of an E-discovery privilege test collection

Published: 03 July 2014

Abstract

In some jurisdictions, parties to a lawsuit can request documents from each other, but documents subject to a claim of privilege may be withheld. The TREC 2010 Legal Track developed what is presently the only public test collection for evaluating privilege classification. This paper examines the reliability and reusability of that collection. For reliability, the key question is the extent to which privilege judgments correctly reflect the opinion of the senior litigator whose judgment is authoritative. For reusability, the key question is the degree to which systems whose results contributed to creation of the test collection can be fairly compared with other systems that use those privilege judgments in the future. These correspond to measurement error and sampling error, respectively. The results indicate that measurement error is the larger problem.
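The abstract's contrast between sampling error and measurement error can be made concrete with a small Monte-Carlo simulation. The sketch below is purely illustrative and is not the paper's actual estimation procedure: the collection size, privilege rate, system recall, sample size, and assessor error rate are all made-up parameters. It estimates a system's recall two ways: judging a random subset with perfect judgments (sampling error only) versus judging everything with judgments that flip with some probability (measurement error only).

```python
import random
import statistics

def simulate(n_docs=5000, privilege_rate=0.05, system_recall=0.8,
             sample_size=500, assessor_error=0.15, trials=200, seed=0):
    """Monte-Carlo sketch of the two error sources the abstract contrasts:
    sampling error (judging only a random subset) versus measurement error
    (judging everything, but with imperfect assessors)."""
    rng = random.Random(seed)
    # Ground truth: which documents are privileged, and which of those
    # a hypothetical review system correctly flags.
    truth = [rng.random() < privilege_rate for _ in range(n_docs)]
    flagged = [t and rng.random() < system_recall for t in truth]
    true_recall = sum(flagged) / sum(truth)

    sampled_ests, noisy_ests = [], []
    for _ in range(trials):
        # Sampling error only: perfect judgments on a random subset.
        sample = rng.sample(range(n_docs), sample_size)
        rel = [i for i in sample if truth[i]]
        if rel:
            sampled_ests.append(sum(flagged[i] for i in rel) / len(rel))
        # Measurement error only: every document judged, but each
        # judgment is flipped with probability assessor_error.
        judged = [t ^ (rng.random() < assessor_error) for t in truth]
        rel = [i for i in range(n_docs) if judged[i]]
        noisy_ests.append(sum(flagged[i] for i in rel) / len(rel))

    return {
        "true_recall": true_recall,
        "sampling_mean": statistics.mean(sampled_ests),
        "sampling_sd": statistics.stdev(sampled_ests),
        "measurement_mean": statistics.mean(noisy_ests),
    }
```

Under these assumed parameters, the sampled estimates scatter around the true recall (variance but no bias), while assessor error drags the estimate far from the truth — a bias that no amount of extra judging removes. That asymmetry is the intuition behind the abstract's conclusion that measurement error is the larger problem.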


Cited By

  • An Empirical Comparison of DistilBERT, Longformer and Logistic Regression for Predictive Coding. 2022 IEEE International Conference on Big Data (Big Data), pp. 3336-3340, 17 Dec 2022. DOI: 10.1109/BigData55660.2022.10020486
  • Toward Cranfield-inspired reusability assessment in interactive information retrieval evaluation. Information Processing and Management 59(5), 1 Sep 2022. DOI: 10.1016/j.ipm.2022.103007
  • Comparing Intrinsic and Extrinsic Evaluation of Sensitivity Classification. Advances in Information Retrieval, pp. 215-222, 5 Apr 2022. DOI: 10.1007/978-3-030-99739-7_25
  • An Empirical Study on Transfer Learning for Privilege Review. 2021 IEEE International Conference on Big Data (Big Data), pp. 2729-2733, 15 Dec 2021. DOI: 10.1109/BigData52589.2021.9672008


    Published In

    SIGIR '14: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval
    July 2014
    1330 pages
    ISBN:9781450322577
    DOI:10.1145/2600428

    Publisher

    Association for Computing Machinery
    New York, NY, United States


    Author Tags

    1. evaluation
    2. measurement error
    3. sampling

    Qualifiers

    • Poster


    Acceptance Rates

    SIGIR '14 paper acceptance rate: 82 of 387 submissions, 21%.
    Overall acceptance rate: 792 of 3,983 submissions, 20%.

