DOI: 10.1145/3289430.3289468
research-article

Parallel Edge Contraction for Large Nonplanar Graph Clustering

Published: 24 October 2018

Abstract

With the growth of graph mining and computation technology, graph clustering has become a popular research field. In particular, clustering increasingly massive data represented as graphs has become common. There have been many research efforts on graph clustering algorithms, as well as reported applications. However, graph clustering is highly application-specific and computationally expensive, especially when applied to huge data sets. In this paper, we propose an efficient graph contraction algorithm for speeding up graph clustering. We aim to parallelize edge contraction to reduce the computing time of graph mining, and we adopt massive multithreading for edge contraction on huge nonplanar graphs. In our experiments, the algorithm achieves good parallel efficiency for the computation time of both edge contraction and graph clustering as the number of threads increases. Our edge contraction algorithm achieves a speedup ranging from 3 to 6 when contracting 10,000 to 30,000 edges. Finally, we observe that with edge contraction, parallel clustering is 8 to 80 times faster than without edge contraction.
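
For readers unfamiliar with the basic operation, the following is a minimal sequential sketch of edge contraction on an undirected simple graph. It is an illustration only and not the algorithm proposed in the paper: the adjacency-set representation and the names `build_graph` and `contract_edge` are assumptions made for this example, and the multithreaded scheme described in the abstract is not shown.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected simple graph as a dict of adjacency sets."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def contract_edge(adj, u, v):
    """Contract edge (u, v) in place by merging vertex v into vertex u.

    Neighbors of v become neighbors of u. Parallel edges collapse
    automatically because adjacency is stored as sets, and the
    contracted edge itself does not survive as a self-loop.
    """
    if u not in adj or v not in adj[u]:
        raise ValueError(f"({u}, {v}) is not an edge of the graph")
    for w in adj.pop(v):      # remove v and walk its former neighbors
        adj[w].discard(v)     # w no longer points at v
        if w != u:            # skip u itself to avoid a self-loop
            adj[w].add(u)
            adj[u].add(w)
    return adj


# K5, the smallest nonplanar complete graph: 5 vertices, 10 edges.
k5_edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
graph = build_graph(k5_edges)
contract_edge(graph, 0, 1)  # merging 1 into 0 leaves K4 on {0, 2, 3, 4}
```

Each contraction removes one vertex and at least one edge, so a clustering step run on the contracted graph has less work to do. A parallel version would additionally need to coordinate contractions that touch the same neighborhoods, which this sketch does not attempt.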

    Published In

    BDIOT '18: Proceedings of the 2018 2nd International Conference on Big Data and Internet of Things
    October 2018
    217 pages
    ISBN:9781450365192
    DOI:10.1145/3289430

    In-Cooperation

    • Deakin University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Nonplanar graph
    2. edge contraction
    3. graph clustering
    4. parallelism

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    BDIOT 2018

    Acceptance Rates

    Overall Acceptance Rate 75 of 136 submissions, 55%
