ABSTRACT
Both the quality and the quantity of training data have a significant impact on the performance of ranking functions in learning to rank for web search. Because of resource constraints, training data for smaller search engine markets are scarce, so existing training data from large markets must be leveraged to improve the ranking functions learned for smaller markets. In this paper, we present a boosting framework for learning to rank in the multi-task learning context for this purpose. In particular, we propose to learn non-parametric common structures adaptively from multiple tasks in a stage-wise way. We develop an algorithm that iteratively discovers super-features that are effective for all the tasks. The function for each task is then estimated as a linear combination of those super-features. We evaluate this multi-task learning method for web search ranking using data from a search engine. Our results demonstrate that multi-task learning methods bring significant relevance improvements over existing baseline methods.
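The stage-wise idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a squared-error regression loss instead of a ranking loss, uses one-split regression stumps as stand-ins for the shared super-features, and all names (`fit_stump`, `multitask_boost`, `predict`, `shrinkage`) are hypothetical. At each stage one stump is fitted to the pooled residuals of all tasks, and each task then fits its own scalar weight on that stump, so every task function is a linear combination of the same shared functions.

```python
def fit_stump(rows, residuals):
    """Fit a one-split regression stump to residuals by exhaustive search."""
    mean = sum(residuals) / len(residuals)
    # Start from the constant predictor; any split must beat its squared error.
    best = (sum((r - mean) ** 2 for r in residuals), 0, float("inf"), mean, mean)
    for j in range(len(rows[0])):
        # Candidate thresholds exclude the maximum so the right side is never empty.
        for t in sorted({x[j] for x in rows})[:-1]:
            left = [r for x, r in zip(rows, residuals) if x[j] <= t]
            right = [r for x, r in zip(rows, residuals) if x[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]  # (feature index, threshold, left value, right value)


def apply_stump(stump, x):
    j, t, lv, rv = stump
    return lv if x[j] <= t else rv


def multitask_boost(tasks, n_stages=20, shrinkage=0.5):
    """tasks maps a task name to (rows, targets); returns (stumps, coefs)."""
    preds = {k: [0.0] * len(y) for k, (_, y) in tasks.items()}
    stumps, coefs = [], {k: [] for k in tasks}
    for _ in range(n_stages):
        # Pool the current residuals of every task and fit one shared stump.
        pooled_rows = [x for rows, _ in tasks.values() for x in rows]
        pooled_res = [yi - pi
                      for k, (_, y) in tasks.items()
                      for yi, pi in zip(y, preds[k])]
        stump = fit_stump(pooled_rows, pooled_res)
        stumps.append(stump)
        # Each task then fits its own scalar weight on the shared stump
        # (least-squares coefficient, damped by the shrinkage factor).
        for k, (rows, y) in tasks.items():
            h = [apply_stump(stump, x) for x in rows]
            num = sum(hi * (yi - pi) for hi, yi, pi in zip(h, y, preds[k]))
            den = sum(hi * hi for hi in h)
            beta = shrinkage * num / den if den > 0 else 0.0
            coefs[k].append(beta)
            preds[k] = [pi + beta * hi for pi, hi in zip(preds[k], h)]
    return stumps, coefs


def predict(stumps, task_coefs, x):
    """Score one feature row as a task-specific linear combination of shared stumps."""
    return sum(b * apply_stump(s, x) for s, b in zip(stumps, task_coefs))
```

In this sketch a small-market task would be passed alongside a large-market task in the `tasks` dictionary; the small task contributes few rows to the pooled fit but still receives its own coefficient for every shared stump.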
Index Terms
Multi-task learning for learning to rank in web search