Abstract
The main objective of an Information Retrieval (IR) system is to provide a user with the documents that are most relevant to the user’s query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but expensive model. However, the success of a re-ranking pipeline is heavily dependent on the performance of the first-stage retrieval, since new documents are not usually identified during the re-ranking stage. Moreover, this dependence can limit the amount of exposure that a particular group of documents, such as documents from a particular demographic group, can receive in the final ranking. For example, the fair allocation of exposure becomes more challenging, or even impossible, if the first-stage retrieval returns too few documents from certain groups, since the number of a group’s documents in the ranking affects the group’s exposure more than the documents’ positions do. With this in mind, it is beneficial to predict the amount of exposure that a group of documents is likely to receive in the results of the first-stage retrieval process, in order to ensure that a sufficient number of documents from each group are included. In this paper, we introduce the novel task of query exposure prediction (QEP). Specifically, we propose the first approach for predicting the distribution of exposure that groups of documents will receive for a given query. Our new approach, called GEP, uses lexical information from individual groups of documents to estimate the exposure the groups will receive in a ranking. Our experiments on the TREC 2021 and 2022 Fair Ranking Track test collections show that our proposed GEP approach results in exposure predictions that are up to \(\sim \)40% more accurate than the predictions of suitably adapted existing query performance prediction (QPP) and resource allocation approaches.
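To make the abstract's claim concrete, the following is a minimal sketch of how the exposure received by each group of documents in a ranking can be computed, assuming a standard logarithmic position discount (as commonly used in the fair-ranking literature); the paper's exact exposure model may differ, and the documents, groups, and function names here are illustrative only.

```python
import math
from collections import defaultdict

def group_exposure(ranking, groups):
    """Share of total exposure each group receives in a ranking,
    with the exposure of rank i modelled as 1 / log2(i + 1)."""
    exposure = defaultdict(float)
    for i, doc in enumerate(ranking, start=1):
        exposure[groups[doc]] += 1.0 / math.log2(i + 1)
    total = sum(exposure.values())
    return {g: e / total for g, e in exposure.items()}

# A ranking with three documents from group A and one from group B:
ranking = ["d1", "d2", "d3", "d4"]
groups = {"d1": "A", "d2": "A", "d3": "B", "d4": "A"}
shares = group_exposure(ranking, groups)
```

In this example, group A receives roughly 80% of the exposure while holding 75% of the ranked documents, illustrating the point made above: the number of documents a group contributes to the ranking dominates its exposure, so a first-stage retrieval that under-represents a group largely determines that group's exposure before re-ranking begins.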
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Jaenich, T., McDonald, G., Ounis, I. (2024). Query Exposure Prediction for Groups of Documents in Rankings. In: Goharian, N., et al. Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14609. Springer, Cham. https://doi.org/10.1007/978-3-031-56060-6_10
Print ISBN: 978-3-031-56059-0
Online ISBN: 978-3-031-56060-6