Relevance search and anomaly detection in bipartite graphs

Published: 01 December 2005

Abstract

Many real applications can be modeled as bipartite graphs: users vs. files in a P2P system, traders vs. stocks in a financial trading system, conferences vs. authors in a scientific publication network, and so on. We introduce two operations on bipartite graphs: 1) identifying similar nodes (relevance search), and 2) finding nodes that connect irrelevant nodes (anomaly detection). We propose algorithms that compute the relevance score of each node using random walk with restarts and graph partitioning, and algorithms that identify anomalies using these relevance scores. We evaluate the quality of relevance search against the semantics of the datasets, and we measure the performance of the anomaly detection algorithm on manually injected anomalies. Experiments on several real datasets confirm both the effectiveness and the efficiency of the methods.
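
The two operations named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (rwr_scores, normality), the restart probability of 0.15, the fixed iteration count, and the toy user/file graph are all assumptions made for illustration, and the graph-partitioning speedup mentioned in the abstract is omitted. Relevance is computed by a plain power-iteration random walk with restart; a node's "normality" is assumed here to be the mean pairwise relevance among its neighbors, so a node that connects otherwise irrelevant nodes scores low.

```python
from itertools import combinations


def rwr_scores(edges, start, restart=0.15, iters=100):
    """Random walk with restart from `start` on an undirected graph.

    edges: iterable of (u, v) pairs. Returns {node: visiting probability}.
    """
    # Build an undirected adjacency list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Start with all probability mass on the query node.
    p = {n: 0.0 for n in adj}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, mass in p.items():
            # Spread (1 - restart) of the mass evenly over the neighbors.
            share = (1.0 - restart) * mass / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        # The remaining mass jumps back to the query node.
        nxt[start] += restart
        p = nxt
    return p


def normality(edges, node, restart=0.15):
    """Mean pairwise relevance among the neighbors of `node` (assumed score).

    Relevance is asymmetric; this sketch uses one direction per pair.
    Returns None when the node has fewer than two neighbors.
    """
    nbrs = sorted({v if u == node else u for u, v in edges if node in (u, v)})
    pairs = list(combinations(nbrs, 2))
    if not pairs:
        return None
    total = 0.0
    for a, b in pairs:
        total += rwr_scores(edges, a, restart)[b]
    return total / len(pairs)


# Hypothetical toy data: two user communities and one file ("fx") bridging them.
edges = [
    ("u1", "f1"), ("u1", "f2"), ("u2", "f1"), ("u2", "f2"),
    ("u3", "f3"), ("u3", "f4"), ("u4", "f3"), ("u4", "f4"),
    ("u1", "fx"), ("u3", "fx"),
]

relevance_to_u1 = rwr_scores(edges, "u1")
print(sorted(relevance_to_u1.items(), key=lambda kv: -kv[1]))
print("normality(f1) =", normality(edges, "f1"))
print("normality(fx) =", normality(edges, "fx"))
```

In this toy graph, users in the same community as u1 rank above users in the other community, and the bridging file fx gets a lower normality score than the in-community file f1, which is the behavior the abstract describes for anomalies.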

Published In

ACM SIGKDD Explorations Newsletter, Volume 7, Issue 2
December 2005
152 pages
ISSN: 1931-0145
EISSN: 1931-0153
DOI: 10.1145/1117454
Publisher

Association for Computing Machinery

New York, NY, United States

Cited By

  • (2022) Privacy-preserving data mining of cross-border financial flows. Cogent Engineering 9:1. DOI: 10.1080/23311916.2022.2046680. Online publication date: 15-Mar-2022
  • (2021) A Stochastic Block Model Based Approach to Detect Outliers in Networks. Database and Expert Systems Applications, pages 149-154. DOI: 10.1007/978-3-030-86472-9_14. Online publication date: 31-Aug-2021
  • (2020) Community Detection for Mobile Money Fraud Detection. 2020 Seventh International Conference on Social Networks Analysis, Management and Security (SNAMS), pages 1-6. DOI: 10.1109/SNAMS52053.2020.9336578. Online publication date: 14-Dec-2020
  • (2019) Supervised and extended restart in random walks for ranking and link prediction in networks. PLOS ONE 14:3, e0213857. DOI: 10.1371/journal.pone.0213857. Online publication date: 20-Mar-2019
  • (2018) TINET. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1890-1899. DOI: 10.1145/3219819.3220003. Online publication date: 19-Jul-2018
  • (2018) Automobile Insurance Fraud Detection Using Social Network Analysis. Applications of Data Management and Analysis, pages 11-16. DOI: 10.1007/978-3-319-95810-1_2. Online publication date: 5-Oct-2018
  • (2017) Smoke Detector. Proceedings of the 33rd Annual Computer Security Applications Conference, pages 200-211. DOI: 10.1145/3134600.3134645. Online publication date: 4-Dec-2017
  • (2017) BiRank. IEEE Transactions on Knowledge and Data Engineering 29:1, pages 57-71. DOI: 10.1109/TKDE.2016.2611584. Online publication date: 1-Jan-2017
  • (2017) Modeling user communities for identifying security risks in an organization. 2017 IEEE International Conference on Big Data (Big Data), pages 4481-4486. DOI: 10.1109/BigData.2017.8258488. Online publication date: Dec-2017
  • (2017) Measuring and modeling bipartite graphs with community structure. Journal of Complex Networks 5:4, pages 581-603. DOI: 10.1093/comnet/cnx001. Online publication date: 26-Mar-2017
