ABSTRACT
Learning to rank arises in many information retrieval applications, ranging from Web search engines and online advertising to recommendation systems. Traditional ranking mainly focuses on one type of data source, and effective modeling relies on a sufficiently large number of labeled examples, which require an expensive and time-consuming labeling process. However, in many real-world applications, ranking over multiple related heterogeneous domains is common: in some domains we may have a relatively large amount of training data, while in others we can collect very little. Therefore, how to leverage labeled information from a related heterogeneous domain to improve ranking in a target domain has become a problem of great interest. In this paper, we propose a novel probabilistic model, the pairwise cross-domain factor (PCDF) model, to address this problem. The proposed model learns latent factors (features) for multi-domain data in partially overlapping heterogeneous feature spaces. It is capable of learning homogeneous feature correlation, heterogeneous feature correlation, and pairwise preference correlation for cross-domain knowledge transfer. We also derive two PCDF variations to address two important special cases. Under the PCDF model, we derive a stochastic gradient based algorithm, which facilitates distributed optimization and can adopt different loss functions and regularization functions to accommodate different data distributions. Extensive experiments on real-world data sets demonstrate the effectiveness of the proposed model and algorithm.
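The abstract does not give the PCDF objective or its update rules, so the following is only a generic illustration of what learning from pairwise preferences with stochastic gradient steps can look like. It is not the authors' model: it assumes a single domain, a plain linear scorer, a logistic pairwise loss, and L2 regularization, and the names `pairwise_sgd` and `score` are hypothetical.

```python
import math
import random


def score(w, x):
    """Linear relevance score of feature vector x under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_sgd(pairs, dim, lr=0.1, reg=0.01, epochs=50, seed=0):
    """Fit weights so that each preferred item outscores its counterpart.

    pairs: list of (x_pref, x_other) feature vectors, x_pref being preferred.
    Loss per pair (an assumption, not taken from the paper):
        log(1 + exp(-(w.x_pref - w.x_other))) + (reg / 2) * ||w||^2
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)  # visit pairs in random order each epoch
        for x_pref, x_other in pairs:
            diff = [a - b for a, b in zip(x_pref, x_other)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Derivative of log(1 + exp(-margin)) with respect to margin.
            g = -1.0 / (1.0 + math.exp(margin))
            # One stochastic gradient step on this single pair.
            w = [wi - lr * (g * di + reg * wi) for wi, di in zip(w, diff)]
    return w


# Toy preference data: the first feature drives the preference.
toy_pairs = [([1.0, 0.0], [0.0, 1.0]), ([2.0, 1.0], [1.0, 1.0])]
w = pairwise_sgd(toy_pairs, dim=2)
```

Swapping the loss or the regularizer only changes the two lines that compute `g` and the update, which is one way to read the flexibility the abstract attributes to a stochastic gradient formulation.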
Index Terms
- Pairwise cross-domain factor model for heterogeneous transfer ranking