
Online Partitioning of Multi-Labeled Graphs

Published: 31 May 2015

Abstract

Graph partitioning is an old problem that is finding renewed interest in the era of big, complex datasets and parallel computing frameworks, which can benefit from a proper partitioning of big graph data across multiple nodes in a cluster. In this paper we look into a specific instance of the problem, termed online graph partitioning, that addresses the need to partition large graphs that do not fit in main memory. A neglected aspect of modern graph datasets is that real graphs have labels. Node labels may, for instance, correspond to categorical attributes (such as country, profession, or participating groups) of the entities depicted by the vertices of the graph. Edge labels may represent different relationship types (e.g. "friend-of", "likes"). In this work we first revisit the formulation of the graph partitioning problem for graphs with labels on both nodes and edges. We introduce "relation-cut", a new metric that extends the traditional "edge-cut" metric used in graph partitioning to take into account the existence of different edge types. Then, we combine this metric with a novel "label-cut" metric that takes into consideration the displacement of related nodes with similar labels across partitions. In our experiments we adapt two recent online partitioning algorithms to the newly proposed metric and provide a thorough evaluation on a variety of real and synthetic graphs. Our experiments demonstrate that the proposed technique balances the generated cuts on both relations and labels across the resulting partitions.
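
As a point of reference for the traditional "edge-cut" metric mentioned above, the following minimal sketch counts cut edges in total and broken down by edge label. It is an illustration only: the abstract does not give the exact definitions of "relation-cut" or "label-cut", so the per-label breakdown is an assumed starting point rather than the paper's metric, and the function names, the toy graph, and the partition assignment are all hypothetical.

```python
from collections import Counter


def edge_cut(edges, assignment):
    """Count edges whose two endpoints are placed in different partitions."""
    return sum(1 for u, v, _ in edges if assignment[u] != assignment[v])


def cut_by_edge_type(edges, assignment):
    """Count cut edges separately for each edge label.

    Assumption: this per-label breakdown only illustrates the idea of
    distinguishing edge types; it is not the paper's relation-cut formula.
    """
    cuts = Counter()
    for u, v, label in edges:
        if assignment[u] != assignment[v]:
            cuts[label] += 1
    return dict(cuts)


# Hypothetical toy graph: (source, target, edge label).
edges = [
    ("ann", "bob", "friend-of"),
    ("bob", "cat", "friend-of"),
    ("ann", "cat", "likes"),
    ("cat", "dan", "likes"),
]

# Hypothetical assignment of vertices to two partitions.
assignment = {"ann": 0, "bob": 0, "cat": 1, "dan": 1}

total = edge_cut(edges, assignment)
per_type = cut_by_edge_type(edges, assignment)
print(total, per_type)
```

A plain edge-cut treats the two cut edges here as interchangeable, while the per-label view shows that one "friend-of" edge and one "likes" edge cross the partition boundary, which is the kind of distinction a metric aware of edge types can exploit.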

Published In

GRADES'15: Proceedings of the GRADES'15
May 2015
54 pages
ISBN:9781450336116
DOI:10.1145/2764947
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGMOD/PODS'15: International Conference on Management of Data
May 31 - June 4, 2015
Melbourne, VIC, Australia

Acceptance Rates

Overall Acceptance Rate 29 of 61 submissions, 48%

Cited By

  • (2021) Distributed Storage and Query for Domain Knowledge Graphs. Web and Big Data. APWeb-WAIM 2020 International Workshops, pages 116-128. DOI: 10.1007/978-981-16-0479-9_10. Online publication date: 1-Apr-2021
  • (2020) Dynamic Partition of Large Graphs Combining Local Nodes Exchange with Directed Dynamic Maintenance. Web Information Systems and Applications, pages 441-453. DOI: 10.1007/978-3-030-60029-7_40. Online publication date: 23-Sep-2020
  • (2016) Digree: A middleware for a graph databases polystore. 2016 IEEE International Conference on Big Data (Big Data), pages 2580-2589. DOI: 10.1109/BigData.2016.7840900. Online publication date: Dec-2016
  • (2016) Effective and efficient graph augmentation in large graphs. 2016 IEEE International Conference on Big Data (Big Data), pages 875-880. DOI: 10.1109/BigData.2016.7840681. Online publication date: Dec-2016
