DOI: 10.1145/3297280.3297331

Pairwise normalization in SimRank variants: problem, solution, and evaluation

Published: 08 April 2019

Abstract

Despite their success in real-world applications, SimRank and its variants, rvs-SimRank and PRank, suffer from the pairwise normalization problem (PNP), a counterintuitive property hidden in their computation paradigm. JacSim, a state-of-the-art measure, provides an effective solution to PNP; however, it considers PNP, and the effectiveness of its solution, only for SimRank. In this paper, we consider PNP in the SimRank variants, generalize the solution to those measures, conduct extensive experiments with two real-world datasets to tune parameters accurately, and verify the effectiveness of applying the solution to the SimRank variants on both unweighted and weighted graphs.



    Published In

    SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
    April 2019
    2682 pages
    ISBN: 9781450359337
    DOI: 10.1145/3297280
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. JacSim
    2. SimRank
    3. link-based similarity
    4. pairwise normalization problem

    Qualifiers

    • Research-article

    Funding Sources

    • Korea government (MSIT; Ministry of Science and ICT)

    Conference

    SAC '19

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

    Cited By

    • (2022) Scaling High-Quality Pairwise Link-Based Similarity Retrieval on Billion-Edge Graphs. ACM Transactions on Information Systems 40:4 (1-45). DOI: 10.1145/3495209. Online publication date: 11-Jan-2022
    • (2021) AdaSim. Proceedings of the 30th ACM International Conference on Information & Knowledge Management (1528-1537). DOI: 10.1145/3459637.3482316. Online publication date: 26-Oct-2021
    • (2021) Embedding Methods or Link-based Similarity Measures, Which is Better for Link Prediction? 2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC) (378-382). DOI: 10.1109/IC-NIDC54101.2021.9660590. Online publication date: 17-Nov-2021
    • (2020) On Investigating Both Effectiveness and Efficiency of Embedding Methods in Task of Similarity Computation of Nodes in Graphs. Applied Sciences 11:1 (162). DOI: 10.3390/app11010162. Online publication date: 26-Dec-2020
    • (2019) Efficient Pairwise Penetrating-rank Similarity Retrieval. ACM Transactions on the Web 13:4 (1-52). DOI: 10.1145/3368616. Online publication date: 18-Dec-2019
