Abstract
This paper proposes a method for correcting noise in the training data used for Learning to Rank. It is natural to assume that some level of noise seeps in while human evaluators produce query-document relevance labels. These labels, which serve as gold-standard training data for Learning to Rank, can adversely affect the effectiveness of the learning algorithm if they contain errors. Hence, an automated way of reducing noise can be of great advantage. The focus of this paper is on noise correction for the pairwise document preferences used by pairwise Learning to Rank algorithms. The approach represents pairwise document preferences in an intermediate feature space, on which an ensemble-learning-based method is applied to identify and correct the errors. Up to 90% of the errors in the pairwise preferences could be corrected at statistically significant levels, and the approach is robust enough to operate even at high levels of noise.
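The correction idea described in the abstract can be pictured as majority-vote filtering in the spirit of Brodley and Friedl [2]: train an ensemble of learners on bootstrap resamples of the noisy preference data, then relabel any document pair whose label the ensemble majority disagrees with. The sketch below is purely illustrative and not the paper's exact pipeline; the difference-feature representation, the toy one-feature threshold learner, and all function names are assumptions.

```python
import random

def train_stump(pairs):
    """Fit a toy one-feature threshold classifier (illustrative base learner).

    Each element of `pairs` is (x, y): x is a feature-difference vector for a
    document pair, y is 1 if the first document is preferred, else 0.
    """
    best = None
    for f in range(len(pairs[0][0])):
        for thresh in (-0.5, 0.0, 0.5):
            for sign in (1, -1):
                acc = sum(
                    (1 if sign * (x[f] - thresh) > 0 else 0) == y
                    for x, y in pairs
                ) / len(pairs)
                if best is None or acc > best[0]:
                    best = (acc, f, thresh, sign)
    _, f, thresh, sign = best
    return lambda x: 1 if sign * (x[f] - thresh) > 0 else 0

def correct_noise(pairs, n_learners=15, seed=0):
    """Majority-vote filter: train learners on bootstrap resamples of the
    noisy pairs, then flip any preference label the majority votes against."""
    rng = random.Random(seed)
    learners = [
        train_stump([rng.choice(pairs) for _ in pairs])
        for _ in range(n_learners)
    ]
    return [
        (x, 1 if sum(h(x) for h in learners) > n_learners // 2 else 0)
        for x, _ in pairs
    ]
```

On synthetic data where one feature of the difference vector determines the true preference, this filter recovers most labels even after a sizeable fraction have been flipped, which is the behaviour the abstract reports at much larger scale.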
Notes
- 1. The Weka machine learning software was used for classification [4].
- 2. Document pair noise will be referred to as noise henceforth.
References
Bailey, P., et al.: Relevance assessment: are judges exchangeable and does it matter. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 667–674. ACM (2008)
Brodley, C.E., Friedl, M.A.: Identifying mislabeled training data. J. Artif. Intell. Res. 11, 131–167 (1999)
Geng, X., et al.: Selecting optimal training data for learning to rank. Inf. Process. Manag. 47(5), 730–741 (2011)
Hall, M., et al.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newslett. 11(1), 10–18 (2009)
Li, H.: A short introduction to learning to rank. IEICE Trans. Inf. Syst. 94(10), 1854–1862 (2011)
Liu, T.-Y.: Learning to rank for information retrieval. Found. Trends Inf. Retrieval 3(3), 225–331 (2009)
Niu, S., et al.: Which noise affects algorithm robustness for learning to rank. Inf. Retrieval J. 18(3), 215–245 (2015)
Qin, T., et al.: LETOR: a benchmark collection for research on learning to rank for information retrieval. Inf. Retrieval 13(4), 346–374 (2010)
Voorhees, E.M.: Variations in relevance judgments, the measurement of retrieval effectiveness. Inf. Process. Manag. 36(5), 697–716 (2000)
Voorhees, E., Harman, D.: Overview of the fifth text retrieval conference (TREC-5). In: NIST Special Publication SP, pp. 1–28 (1997)
Xu, J., et al.: Improving quality of training data for learning to rank using click-through data. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining, pp. 171–180. ACM (2010)
© 2016 Springer International Publishing AG
Trivedi, H., Majumder, P. (2016). Noise Correction in Pairwise Document Preferences for Learning to Rank. In: Ma, S., et al. Information Retrieval Technology. AIRS 2016. Lecture Notes in Computer Science(), vol 9994. Springer, Cham. https://doi.org/10.1007/978-3-319-48051-0_22
DOI: https://doi.org/10.1007/978-3-319-48051-0_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48050-3
Online ISBN: 978-3-319-48051-0