Graph Compression with Stars

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11440)

Abstract

Making massive graph data easily understandable to people is a demanding task in many real applications. Graph compression is an effective way to reduce both the size of graph data and its structural complexity. This paper proposes a simple yet effective method called star-based graph compression, which compresses a graph by shrinking a collection of disjoint subgraphs called stars. Compressing a graph into the optimal star-based compressed graph, the one with the highest compression ratio, is shown to be NP-complete. We propose a greedy compression algorithm called StarZip. We experimentally verify that StarZip achieves compression ratios of 3.8–45.7 in terms of vertex count and 2.9–241.6 in terms of edge count. We also study shortest path queries on compressed graphs. On real graphs, StarSSSP, the algorithm for processing shortest path queries on compressed graphs, is 4X–20X faster than Dijkstra’s algorithm running on the original graphs, and the average absolute error between the query results of StarSSSP and the exact shortest distances is about 1. On synthetic graphs, StarSSSP is up to 313X faster than Dijkstra’s algorithm, and the average absolute error is also about 1.
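To make the idea of shrinking disjoint stars concrete, here is a minimal Python sketch. It is not the StarZip algorithm, whose details the abstract does not give. It assumes a simple greedy rule (repeatedly take the vertex with the most unassigned neighbours as a star centre), an undirected graph stored as a dictionary of neighbour sets, and a compression ratio defined as original count divided by compressed count. The names `greedy_star_compress`, `star_of`, and `super_edges` are hypothetical.

```python
def greedy_star_compress(adj):
    """Partition the vertices into disjoint stars and contract each star.

    adj maps every vertex to the set of its neighbours (undirected graph).
    Returns (star_of, super_edges): star_of maps each vertex to the centre
    of its star, and super_edges holds one edge per pair of adjacent stars.
    Illustrative greedy rule only, not the paper's StarZip.
    """
    unassigned = set(adj)
    star_of = {}
    while unassigned:
        # Assumed heuristic: the centre is the vertex with the most
        # still-unassigned neighbours; sorting makes ties deterministic.
        centre = max(sorted(unassigned), key=lambda v: len(adj[v] & unassigned))
        members = {centre} | (adj[centre] & unassigned)
        for v in members:
            star_of[v] = centre
        unassigned -= members

    # Two stars are joined in the compressed graph if any original edge
    # runs between them; edges inside a star disappear.
    super_edges = set()
    for u, neighbours in adj.items():
        for v in neighbours:
            if star_of[u] != star_of[v]:
                super_edges.add(frozenset((star_of[u], star_of[v])))
    return star_of, super_edges


# Toy graph: two stars (centres 0 and 4) linked by the edge 3-4.
adj = {
    0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4},
    4: {3, 5, 6}, 5: {4}, 6: {4},
}
star_of, super_edges = greedy_star_compress(adj)

n_vertices = len(adj)
n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
n_super_vertices = len(set(star_of.values()))

# Assumed definition of the ratios: original count / compressed count.
vertex_ratio = n_vertices / n_super_vertices
edge_ratio = n_edges / len(super_edges)
```

On this toy graph the seven vertices collapse into two stars joined by one edge, giving a vertex ratio of 3.5 and an edge ratio of 6.0 under the assumed definition. Distances measured on such a contracted graph are approximate, because paths inside a star are hidden; this is consistent with the abstract reporting a small absolute error for queries on compressed graphs.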

Notes

  1. http://www.worldwidewebsize.com

  2. http://www.statisticbrain.com/facebook-statistics/

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (No. 61532015, No. 61672189, No. 61732003 and No. 61872106) and the National Science Foundation of the USA (No. 1741277 and No. 1829674).

Corresponding author

Correspondence to Zhaonian Zou.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, F., Zou, Z., Li, J., Li, Y. (2019). Graph Compression with Stars. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol 11440. Springer, Cham. https://doi.org/10.1007/978-3-030-16145-3_35

  • DOI: https://doi.org/10.1007/978-3-030-16145-3_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16144-6

  • Online ISBN: 978-3-030-16145-3

  • eBook Packages: Computer Science, Computer Science (R0)
