Abstract
Traditional learning-to-rank methods focus mainly on a single type of object. However, with the rapid growth of Web 2.0, ranking over multiple interrelated and heterogeneous objects has become common, e.g., in a heterogeneous academic network. In this scenario, one may have ample training data for some types of objects (e.g., conferences) but very little for the object types of interest (e.g., authors). Two important questions arise: (1) Given a networked data set, how can one borrow supervision from other types of objects to build an accurate ranking model for the objects of interest, for which supervision is insufficient? (2) If there are links between different objects, how can their relationships be exploited to improve ranking performance? In this work, we first propose a regularized framework called HCDRank that simultaneously minimizes two loss functions, one associated with each of the two domains. We then extend the approach by exploiting the link information between heterogeneous objects. We conduct a theoretical analysis of the proposed approach and derive its generalization bound to demonstrate how the two related domains can help each other in learning ranking functions. Experimental results on three different genres of data sets demonstrate the effectiveness of the proposed approaches.
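The idea of minimizing two domain losses under one regularizer can be illustrated with a minimal sketch. This is not the HCDRank formulation, which the abstract does not spell out. It assumes, purely for illustration, that both domains share one feature space and one linear scoring vector, and that each loss is a pairwise hinge loss. The names `score`, `pairwise_hinge`, `joint_objective`, `trade_off`, and `reg` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[Vector, Vector]  # (preferred item, less-preferred item)


def score(w: Vector, x: Vector) -> float:
    """Linear ranking score: the dot product of w and x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_hinge(w: Vector, pairs: List[Pair]) -> float:
    """Mean hinge loss over preference pairs.

    A pair contributes zero loss once the preferred item outscores
    the other item by a margin of at least 1.
    """
    if not pairs:
        return 0.0
    total = sum(max(0.0, 1.0 - (score(w, a) - score(w, b))) for a, b in pairs)
    return total / len(pairs)


def joint_objective(
    w: Vector,
    source_pairs: List[Pair],
    target_pairs: List[Pair],
    trade_off: float = 1.0,  # assumed weight on the target-domain loss
    reg: float = 0.1,        # assumed L2 regularization strength
) -> float:
    """Source loss + weighted target loss + L2 penalty on the shared w."""
    l2 = sum(wi * wi for wi in w)
    return (
        pairwise_hinge(w, source_pairs)
        + trade_off * pairwise_hinge(w, target_pairs)
        + reg * l2
    )


if __name__ == "__main__":
    w = [1.0, 0.0]
    source = [([2.0, 0.0], [0.0, 0.0])]  # well-separated pair, zero loss
    target = [([0.5, 0.0], [0.0, 0.0])]  # margin violated, loss 0.5
    print(joint_objective(w, source, target))
```

Because the abundant source pairs and the scarce target pairs constrain the same `w` in this sketch, supervision in one domain influences the ranking function used in the other. A heterogeneous setting with different feature spaces per object type would need an additional mapping step that this sketch omits.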
Cite this article
Wang, B., Tang, J., Fan, W. et al. Query-dependent cross-domain ranking in heterogeneous network. Knowl Inf Syst 34, 109–145 (2013). https://doi.org/10.1007/s10115-011-0472-7
DOI: https://doi.org/10.1007/s10115-011-0472-7