DOI: 10.1145/2247596.2247618
research-article

Relevance search in heterogeneous networks

Published: 27 March 2012

Abstract

Conventional research on similarity search focuses on measuring the similarity between objects of the same type. However, many real-world applications need to measure the relatedness between objects of different types. For example, in automatic expert profiling, people are interested in finding the objects most relevant to an expert, where the objects can be of various types, such as research areas, conferences, and papers. With the surge of research on heterogeneous networks, measuring the relatedness of objects of different types is becoming increasingly important. In this paper, we study the relevance search problem in heterogeneous networks, where the task is to measure the relatedness of heterogeneous objects (objects of the same type or of different types). We propose a novel measure, called HeteSim, with the following attributes: (1) a path-constrained measure: the relatedness of an object pair is defined based on the search path that connects the two objects by following a sequence of node types; (2) a uniform measure: it can measure the relatedness of objects of the same or different types in a uniform framework; (3) a semi-metric measure: HeteSim has properties (e.g., self-maximum and symmetry) that are crucial to many tasks. Empirical studies show that HeteSim can effectively evaluate the relatedness of heterogeneous objects. Moreover, in query and clustering tasks, it achieves better performance than conventional measures.
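
The abstract does not state the formula, so the following is only a minimal sketch of how a path-constrained, symmetric, self-maximum relevance score could be computed with a pairwise random walk (the term used in the author tags). It assumes the path splits evenly at a middle node type, that walkers move with uniform transition probabilities, and that the two landing distributions are compared by cosine. The function names, the toy author-paper-conference data, and the cosine normalization are illustrative assumptions, not taken from this page.

```python
import numpy as np

def row_normalize(adj):
    """Turn an adjacency matrix into row-stochastic transition probabilities."""
    adj = np.asarray(adj, dtype=float)
    sums = adj.sum(axis=1, keepdims=True)
    return np.divide(adj, sums, out=np.zeros_like(adj), where=sums > 0)

def reach_prob(adjs):
    """Probability of landing on each end node when walking the relations in order."""
    prob = row_normalize(adjs[0])
    for adj in adjs[1:]:
        prob = prob @ row_normalize(adj)
    return prob

def path_relevance(left, right_reversed):
    """Cosine between the landing distributions of source and target walkers.

    left: adjacency matrices from the source type to the middle type.
    right_reversed: adjacency matrices from the target type back to the middle type.
    """
    src = reach_prob(left)            # shape: (#sources, #middle)
    tgt = reach_prob(right_reversed)  # shape: (#targets, #middle)
    dots = src @ tgt.T
    norms = np.outer(np.linalg.norm(src, axis=1), np.linalg.norm(tgt, axis=1))
    return np.divide(dots, norms, out=np.zeros_like(dots), where=norms > 0)

# Hypothetical toy network: 3 authors, 4 papers, 2 conferences.
AP = np.array([[1, 1, 0, 0],   # author 0 wrote papers 0 and 1
               [0, 0, 1, 0],   # author 1 wrote paper 2
               [0, 1, 1, 1]])  # author 2 wrote papers 1, 2 and 3
CP = np.array([[1, 1, 0, 0],   # conference 0 published papers 0 and 1
               [0, 0, 1, 1]])  # conference 1 published papers 2 and 3

# Path author -> paper <- conference: objects of different types.
author_conf = path_relevance([AP], [CP])
# Path author -> paper <- author: objects of the same type, same function.
author_author = path_relevance([AP], [AP])
print(np.round(author_conf, 3))
```

In this sketch the same function scores author-conference pairs and author-author pairs, which mirrors the "uniform measure" attribute. Swapping the two arguments transposes the result (symmetry), and every author scores 1 against themselves on the author-paper-author path (self-maximum).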

    Published In

    EDBT '12: Proceedings of the 15th International Conference on Extending Database Technology
    March 2012
    643 pages
    ISBN:9781450307901
    DOI:10.1145/2247596
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. heterogeneous information network
    2. pairwise random walk
    3. similarity search

    Qualifiers

    • Research-article

    Conference

    EDBT '12

    Acceptance Rates

    Overall Acceptance Rate 7 of 10 submissions, 70%

    Cited By

    • (2024) Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment 17:11 (2946-2959). DOI: 10.14778/3681954.3681975. Online publication date: 30-Aug-2024
    • (2024) Meta-path automatically extracted from heterogeneous information network for recommendation. World Wide Web 27:3. DOI: 10.1007/s11280-024-01265-4. Online publication date: 13-Apr-2024
    • (2024) Managing Traceability for Software Life Cycle Processes. Theoretical Aspects of Software Engineering (428-445). DOI: 10.1007/978-3-031-64626-3_25. Online publication date: 14-Jul-2024
    • (2023) Influential Community Search over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment 16:8 (2047-2060). DOI: 10.14778/3594512.3594532. Online publication date: 1-Apr-2023
    • (2023) Efficient and Effective Academic Expert Finding on Heterogeneous Graphs through (k, 𝒫)-Core based Embedding. ACM Transactions on Knowledge Discovery from Data 17:6 (1-35). DOI: 10.1145/3578365. Online publication date: 22-Mar-2023
    • (2023) TRAVERS: A Diversity-Based Dynamic Approach to Iterative Relevance Search over Knowledge Graphs. Proceedings of the ACM Web Conference 2023 (2560-2571). DOI: 10.1145/3543507.3583429. Online publication date: 30-Apr-2023
    • (2023) Interpretable Clinical Trial Search using Pubmed Citation Network. 2023 IEEE International Conference on Digital Health (ICDH) (328-338). DOI: 10.1109/ICDH60066.2023.00056. Online publication date: Jul-2023
    • (2023) Session-Based Recommendation Along with the Session Style of Explanation. Machine Learning and Knowledge Discovery in Databases (404-420). DOI: 10.1007/978-3-031-26387-3_25. Online publication date: 17-Mar-2023
    • (2022) Leveraging Dynamic Heterogeneous Networks to Study Transnational Issue Publics. The Case of the European COVID-19 Discourse on Twitter. Frontiers in Sociology 7. DOI: 10.3389/fsoc.2022.884640. Online publication date: 30-Jun-2022
    • (2022) Effective community search over large star-schema heterogeneous information networks. Proceedings of the VLDB Endowment 15:11 (2307-2320). DOI: 10.14778/3551793.3551795. Online publication date: 1-Jul-2022
