DOI: 10.1145/2661829.2662040

Ranking-based Clustering on General Heterogeneous Information Networks by Network Projection

Published: 03 November 2014

Abstract

Heterogeneous information network analysis, which models networked data as networks containing different types of objects and relations, has recently attracted increasing attention. Many data mining tasks have been explored in heterogeneous networks, among which clustering and ranking are two basic tasks. These two tasks are usually done separately, although recent research shows that they can mutually enhance each other. Unfortunately, existing work is limited to heterogeneous networks with special structures (e.g., bipartite or star-schema networks). Real data are more complex and irregular, so it is desirable to design a general method to manage objects and relations in heterogeneous networks with arbitrary schema. In this paper, we study the ranking-based clustering problem in a general heterogeneous information network and propose a novel solution, HeProjI. HeProjI projects a general heterogeneous network into a sequence of sub-networks, and an information transfer mechanism is designed to keep the sub-networks consistent. For each sub-network, a path-based random walk model is built to estimate the reachable probability of objects, which can be used for clustering and ranking analysis. Iteratively analyzing each sub-network leads to effective ranking-based clustering. Extensive experiments on three real datasets show that HeProjI achieves better clustering and ranking performance than other well-established algorithms.
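To make the idea of a reachable probability concrete, the following is a minimal sketch of a random walk along a single relation path in a toy author-paper-venue network. It is not the HeProjI algorithm: the network, the matrices, and the function names (`row_normalize`, `matmul`, `reach_probability`) are all hypothetical, and the sketch assumes the reachable probability is the product of row-normalized adjacency matrices along the path.

```python
from typing import List

Matrix = List[List[float]]


def row_normalize(m: Matrix) -> Matrix:
    """Turn each row of an adjacency matrix into transition probabilities."""
    out = []
    for row in m:
        total = sum(row)
        # Rows with no outgoing links stay all-zero.
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, kept dependency-free for the sketch."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def reach_probability(path: List[Matrix]) -> Matrix:
    """Probability of reaching each target object by walking along the path.

    `path` is the list of adjacency matrices for consecutive relations,
    e.g. [author-paper, paper-venue] (an assumed toy schema).
    """
    result = row_normalize(path[0])
    for adjacency in path[1:]:
        result = matmul(result, row_normalize(adjacency))
    return result


# Toy data (hypothetical): 2 authors, 3 papers, 2 venues.
author_paper = [[1, 1, 0],
                [0, 1, 1]]
paper_venue = [[1, 0],
               [1, 0],
               [0, 1]]

reach = reach_probability([author_paper, paper_venue])
for author, row in enumerate(reach):
    print(f"author {author}: venue probabilities {row}")
```

In this toy network each row of `reach` sums to 1, so it can be read as a distribution over venues for one author; a ranking-based clustering method could compare such distributions across objects, though how HeProjI does so is not specified in the abstract.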




    Published In

    CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
    November 2014
    2152 pages
    ISBN: 9781450325981
    DOI: 10.1145/2661829

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. clustering
    2. heterogeneous information network
    3. ranking

    Qualifiers

    • Research-article

    Acceptance Rates

    CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2024) Self-Training GNN-based Community Search in Large Attributed Heterogeneous Information Networks. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2765-2778. DOI: 10.1109/ICDE60146.2024.00216. Online publication date: 13-May-2024
    • (2024) Efficient Core Decomposition Over Large Heterogeneous Information Networks. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 2393-2406. DOI: 10.1109/ICDE60146.2024.00189. Online publication date: 13-May-2024
    • (2024) Attribute-sensitive community search over attributed heterogeneous information networks. Expert Systems with Applications, 235, article 121153. DOI: 10.1016/j.eswa.2023.121153. Online publication date: Jan-2024
    • (2023) Influential Community Search over Large Heterogeneous Information Networks. Proceedings of the VLDB Endowment, 16(8):2047-2060. DOI: 10.14778/3594512.3594532. Online publication date: 1-Apr-2023
    • (2023) Atrapos: Real-time Evaluation of Metapath Query Workloads. Proceedings of the ACM Web Conference 2023, pages 2487-2498. DOI: 10.1145/3543507.3583322. Online publication date: 30-Apr-2023
    • (2022) Effective community search over large star-schema heterogeneous information networks. Proceedings of the VLDB Endowment, 15(11):2307-2320. DOI: 10.14778/3551793.3551795. Online publication date: 1-Jul-2022
    • (2022) Efficient Distributed Clustering Algorithms on Star-Schema Heterogeneous Graphs. IEEE Transactions on Knowledge and Data Engineering, 34(10):4781-4796. DOI: 10.1109/TKDE.2020.3047631. Online publication date: 1-Oct-2022
    • (2022) Personalized paper recommendation for postgraduates using multi-semantic path fusion. Applied Intelligence, 53(8):9634-9649. DOI: 10.1007/s10489-022-04017-x. Online publication date: 10-Aug-2022
    • (2022) Related Work on CSMs and Solutions. Cohesive Subgraph Search Over Large Heterogeneous Information Networks, pages 57-60. DOI: 10.1007/978-3-030-97568-5_6. Online publication date: 23-Feb-2022
    • (2021) Detecting Communities from Heterogeneous Graphs. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 1170-1180. DOI: 10.1145/3459637.3482250. Online publication date: 26-Oct-2021
