DOI: 10.1145/2783258.2783376

Using Local Spectral Methods to Robustify Graph-Based Learning Algorithms

Published: 10 August 2015

Abstract

Graph-based learning methods go by a variety of names, including semi-supervised and transductive learning. They typically use a diffusion to propagate labels from a small set of nodes with known class labels to the remaining nodes of the graph. While popular, these algorithms, when implemented in a straightforward fashion, are extremely sensitive to the details of the graph construction. Here, we provide four procedures to help make them more robust: recognizing implicit regularization in the diffusion, using a scalable push method to evaluate the diffusion, using rank-based rounding, and densifying the graph through a matrix polynomial. We study robustness with respect to the details of the graph construction, errors in node labeling, degree variability, and a variety of other real-world heterogeneities, analyzing these methods through a precise relationship with mincut problems. For instance, the densification strategy explicitly adds new weighted edges to a sparse graph. We find that this simple densification creates a graph on which multiple diffusion methods are robust to several types of errors. This is demonstrated by a study of predicting product categories from an Amazon co-purchasing network.
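
Two of the abstract's four ingredients are easy to see in miniature. The Python sketch below is not the authors' code: every function name, parameter default, and the adjacency-list input format are illustrative assumptions. It combines a push-style evaluation of a personalized-PageRank diffusion seeded at the labeled nodes with a simple form of rank-based rounding that assigns each node the class in whose diffusion it ranks highest; implicit regularization enters only through truncating the push early (a looser tolerance), and the densification step is omitted.

from collections import defaultdict

def push_ppr(adj, seeds, alpha=0.85, tol=1e-4):
    """Approximate personalized PageRank via a push/bookmark-coloring sweep.

    adj   : dict mapping node -> list of neighbours (undirected graph)
    seeds : labeled nodes used as the teleportation set for this class
    alpha : probability the walk continues (1 - alpha restarts at the seeds)
    tol   : per-degree residual tolerance; larger values truncate the
            diffusion earlier, which acts as an implicit regularizer
    """
    x = defaultdict(float)   # approximate diffusion vector
    r = defaultdict(float)   # residual mass still waiting to be pushed
    for s in seeds:
        r[s] = 1.0 / len(seeds)
    queue = [s for s in seeds if r[s] > tol * len(adj[s])]
    while queue:
        u = queue.pop()
        du = len(adj[u])
        if du == 0 or r[u] <= tol * du:
            continue
        x[u] += (1 - alpha) * r[u]     # retire part of the residual at u
        share = alpha * r[u] / du      # spread the rest over u's neighbours
        r[u] = 0.0
        for v in adj[u]:
            r[v] += share
            if r[v] > tol * len(adj[v]):
                queue.append(v)
    return x

def predict_labels(adj, labeled):
    """Rank-based rounding: run one diffusion per class from its seeds,
    rank nodes within each diffusion, and give every node the class in
    which it achieves the best rank."""
    ranks = {}
    for label, seeds in labeled.items():
        x = push_ppr(adj, seeds)
        order = sorted(x, key=lambda u: x[u] / max(len(adj[u]), 1), reverse=True)
        ranks[label] = {u: i for i, u in enumerate(order)}
    return {u: min((ranks[l].get(u, float("inf")), l) for l in labeled)[1]
            for u in adj}

With a graph stored as adj = {node: [neighbours]} and seeds as labeled = {class_label: [nodes]}, predict_labels(adj, labeled) returns a predicted class for every node; how those predictions shift as alpha, tol, and the graph construction vary is exactly the kind of sensitivity the paper sets out to tame.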

Supplementary Material

MP4 File (p359.mp4)

    Published In

    KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    August 2015
    2378 pages
    ISBN:9781450336642
    DOI:10.1145/2783258
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. clustering
    2. diffusions
    3. robustifying
    4. semi-supervised learning

    Qualifiers

    • Research-article

    Funding Sources

    • Army Research Office
    • DARPA
    • NSF

    Conference

    KDD '15

    Acceptance Rates

    KDD '15 paper acceptance rate: 160 of 819 submissions (20%)
    Overall acceptance rate: 1,133 of 8,635 submissions (13%)

