Query-dependent cross-domain ranking in heterogeneous network

Wang, Bo; Tang, Jie; Fan, Wei; Chen, Songcan; Tan, Chenhao; Yang, Zi

doi:10.1007/s10115-011-0472-7

Regular Paper
Published: 18 January 2012

Volume 34, pages 109–145, (2013)
Knowledge and Information Systems

Bo Wang¹,
Jie Tang²,
Wei Fan³,
Songcan Chen¹,
Chenhao Tan² &
Zi Yang²

Abstract

Traditional learning-to-rank research mainly focuses on a single type of object. However, with the rapid growth of Web 2.0, ranking over multiple interrelated and heterogeneous objects has become common, e.g., in a heterogeneous academic network. In this scenario, one may have abundant training data for some types of objects (e.g., conferences) but very little for the object types of interest (e.g., authors). Two important questions therefore arise: (1) Given a networked data set, how can one borrow supervision from other types of objects to build an accurate ranking model for the objects of interest, for which supervision is insufficient? (2) If there are links between different objects, how can their relationships be exploited to improve ranking performance? In this work, we first propose a regularized framework called HCDRank that simultaneously minimizes two loss functions related to these two domains. We then extend the approach by exploiting the link information between heterogeneous objects. We conduct a theoretical analysis of the proposed approach and derive its generalization bound to demonstrate how the two related domains can help each other in learning ranking functions. Experimental results on three different genres of data sets demonstrate the effectiveness of the proposed approaches.
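
The idea of minimizing two domain losses at once can be illustrated with a small sketch. This is not the HCDRank formulation, which the abstract does not spell out. It assumes that objects from both domains are already described in one shared feature space, and it uses a pairwise hinge loss with an L2 penalty as a stand-in. The names `joint_objective`, `train`, `c`, and `lam`, as well as the toy data, are hypothetical.

```python
# Illustrative sketch only: NOT the HCDRank algorithm from the paper.
# Assumption: source and target objects already share one feature space.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def joint_objective(w, source_pairs, target_pairs, c=1.0, lam=0.01):
    """Pairwise hinge loss on both domains plus an L2 penalty on w."""
    def hinge(pairs):
        return sum(max(0.0, 1.0 - (dot(w, hi) - dot(w, lo))) for hi, lo in pairs)
    return hinge(source_pairs) + c * hinge(target_pairs) + lam * dot(w, w)

def train(source_pairs, target_pairs, dim, c=1.0, lam=0.01, lr=0.05, epochs=200):
    """Plain subgradient descent on the joint objective."""
    w = [0.0] * dim
    # Target pairs are weighted by c, source pairs by 1.
    weighted = [(1.0, p) for p in source_pairs] + [(c, p) for p in target_pairs]
    for _ in range(epochs):
        grad = [2.0 * lam * wi for wi in w]  # gradient of the L2 penalty
        for weight, (hi, lo) in weighted:
            if dot(w, hi) - dot(w, lo) < 1.0:  # margin violated
                for k in range(dim):
                    grad[k] -= weight * (hi[k] - lo[k])
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical toy data: each pair is (preferred object, less preferred object).
source_pairs = [([1.0, 0.2], [0.1, 0.3]),
                ([0.9, 0.8], [0.2, 0.7]),
                ([0.7, 0.1], [0.3, 0.2])]
target_pairs = [([0.8, 0.5], [0.4, 0.5])]

w = train(source_pairs, target_pairs, dim=2)
before = joint_objective([0.0, 0.0], source_pairs, target_pairs)
after = joint_objective(w, source_pairs, target_pairs)
```

In this toy setting the single labeled target pair is ordered correctly largely because the three source pairs push the shared weight vector in a compatible direction, which is the intuition behind borrowing supervision across domains.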

Author information

Authors and Affiliations

Department of Computer Science, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Bo Wang & Songcan Chen
Department of Computer Science, Tsinghua University, 100084, Beijing, China
Jie Tang, Chenhao Tan & Zi Yang
IBM T.J. Watson Research Center, New York, USA
Wei Fan

Corresponding author

Correspondence to Jie Tang.

About this article

Cite this article

Wang, B., Tang, J., Fan, W. et al. Query-dependent cross-domain ranking in heterogeneous network. Knowl Inf Syst 34, 109–145 (2013). https://doi.org/10.1007/s10115-011-0472-7

Received: 31 December 2010
Revised: 28 October 2011
Accepted: 09 December 2011
Published: 18 January 2012
Issue Date: January 2013
DOI: https://doi.org/10.1007/s10115-011-0472-7
