Pairwise cross-domain factor model for heterogeneous transfer ranking

Authors:
Bo Long, Yahoo! Labs, Sunnyvale, CA, USA
Yi Chang, Yahoo! Labs, Sunnyvale, CA, USA
Anlei Dong, Yahoo! Labs, Sunnyvale, CA, USA
Jianzhang He, Yahoo! Labs, Sunnyvale, USA

WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining, February 2012, Pages 113–122
https://doi.org/10.1145/2124295.2124311

Published: 08 February 2012

ABSTRACT

Learning to rank arises in many information retrieval applications, ranging from Web search engines and online advertising to recommendation systems. Traditional ranking mainly focuses on a single type of data source, and effective modeling relies on a sufficiently large number of labeled examples, which require an expensive and time-consuming labeling process. However, in many real-world applications, ranking over multiple related heterogeneous domains is common: in some domains we may have a relatively large amount of training data, while in others we can collect very little. Therefore, how to leverage labeled information from a related heterogeneous domain to improve ranking in a target domain has become a problem of great interest. In this paper, we propose a novel probabilistic model, the pairwise cross-domain factor (PCDF) model, to address this problem. The proposed model learns latent factors (features) for multi-domain data in partially overlapping heterogeneous feature spaces. It is capable of learning homogeneous feature correlation, heterogeneous feature correlation, and pairwise preference correlation for cross-domain knowledge transfer. We also derive two PCDF variations to address two important special cases. Under the PCDF model, we derive a stochastic-gradient-based algorithm, which facilitates distributed optimization and can flexibly adopt different loss functions and regularization functions to accommodate different data distributions. Extensive experiments on real-world data sets demonstrate the effectiveness of the proposed model and algorithm.
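As a rough illustration of the pairwise, stochastic-gradient idea mentioned in the abstract, the sketch below trains a plain linear ranker on preference pairs from two domains whose feature spaces only partially overlap. It is not the PCDF model, whose formulation is not given on this page: it has no latent factors and simply shares the weights of overlapping features. The function names `score` and `sgd_pairwise`, the feature names (`shared:match`, `src:clicks`, `tgt:fresh`), the logistic loss, and all data values are assumptions made for illustration only.

```python
import math
import random

def score(weights, features):
    """Linear score: sum of weight * value over a sparse feature dict."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def sgd_pairwise(pairs, epochs=50, lr=0.1, l2=0.001, seed=0):
    """Fit weights so that score(preferred) > score(other) for each pair.

    pairs: list of (preferred_features, other_features) sparse dicts.
    Loss per pair: log(1 + exp(-margin)) plus an L2 penalty on the weights,
    where margin = score(preferred) - score(other).
    """
    rng = random.Random(seed)
    weights = {}
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for preferred, other in pairs:
            margin = score(weights, preferred) - score(weights, other)
            margin = max(-30.0, min(30.0, margin))  # guard against overflow in exp
            # Derivative of the logistic loss with respect to the margin.
            grad_margin = -1.0 / (1.0 + math.exp(margin))
            for name in set(preferred) | set(other):
                diff = preferred.get(name, 0.0) - other.get(name, 0.0)
                w = weights.get(name, 0.0)
                weights[name] = w - lr * (grad_margin * diff + l2 * w)
    return weights

# Hypothetical data: the source domain has several labeled pairs, the target only one.
# Both domains share "shared:match"; each also has one domain-specific feature.
source_pairs = [
    ({"shared:match": 1.0, "src:clicks": 0.8}, {"shared:match": 0.2, "src:clicks": 0.1}),
    ({"shared:match": 0.9, "src:clicks": 0.3}, {"shared:match": 0.1, "src:clicks": 0.4}),
    ({"shared:match": 0.7, "src:clicks": 0.9}, {"shared:match": 0.3, "src:clicks": 0.2}),
]
target_pairs = [
    ({"shared:match": 0.8, "tgt:fresh": 0.9}, {"shared:match": 0.4, "tgt:fresh": 0.1}),
]

weights = sgd_pairwise(source_pairs + target_pairs)
print(sorted(weights.items()))
```

Because the weight on the overlapping feature is updated by pairs from both domains, the source-domain labels influence how target-domain items are scored. That is the simplest form of the transfer effect the abstract describes; the paper's model additionally learns latent factors and correlations across non-overlapping features.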

Index Terms

Pairwise cross-domain factor model for heterogeneous transfer ranking
1. Information systems


Published in
WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining
February 2012
792 pages
ISBN: 9781450307475
DOI: 10.1145/2124295
General Chairs: Eytan Adar (University of Michigan, USA), Jaime Teevan (Microsoft Research, USA)
Program Chairs: Eugene Agichtein (Emory University, USA), Yoelle Maarek (Yahoo! Research, Israel)
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
heterogeneous transfer ranking
homogeneous transfer ranking
pairwise cross-domain factor model
ranking
source domain
stochastic gradient descent
target domain
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate: 498 of 2,863 submissions, 17%