Skip to main content

A Comparison of Three Algorithms for Approximating the Distance Distribution in Real-World Graphs

  • Conference paper
Theory and Practice of Algorithms in (Computer) Systems (TAPAS 2011)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 6595))

Abstract

The distance for a pair of vertices in a graph G is the length of the shortest path between them. The distance distribution for G specifies how many vertex pairs are at distance h, for all feasible values h. We study three fast randomized algorithms to approximate the distance distribution in large graphs. The Eppstein-Wang (ew) algorithm exploits sampling through a limited (logarithmic) number of Breadth-First Searches (bfses). The Size-Estimation Framework (sef) by Cohen employs random ranking and least-element lists to provide several estimators. Finally, the Approximate Neighborhood Function (anf) algorithm by Palmer, Gibbons, and Faloutsos makes use of the probabilistic counting technique introduced by Flajolet and Martin, in order to estimate the number of distinct elements in a large multiset. We investigate how good is the approximation of the distance distribution, when the three algorithms are run in similar settings. The analysis of anf derives from the results on the probabilistic counting method, while the one of sef is given by Cohen. For what concerns ew (originally designed for another problem), we extend its simple analysis in order to bound its error with high probability and to show its convergence. We then perform an experimental study on 30 real-world graphs, showing that our implementation of ew combines the accuracy of sef with the performance of anf.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Blondel, V., Guillaume, J.L., Hendrickx, J., Jungers, R.: Distance Distribution in Random Graphs and Applications to Network Exploration. Phys. Rev. E 76 (2007)

    Google Scholar 

  2. Boldi, P., Vigna, S.: The WebGraph framework I: Compression techniques. In: Proc. of the 13th International World Wide Web Conference, pp. 595–601 (2004)

    Google Scholar 

  3. Cohen, E.: Estimating the size of the transitive closure in linear time. In: Annual IEEE Symposium on Foundations of Computer Science, pp. 190–200 (1994)

    Google Scholar 

  4. Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. J. Comput. Syst. Sci. 55(3), 441–453 (1997)

    Article  MathSciNet  MATH  Google Scholar 

  5. Cohen, E., Kaplan, H.: Bottom-k sketches: better and more efficient estimation of aggregates. In: ACM SIGMETRICS, pp. 353–354. ACM, New York (2007)

    Google Scholar 

  6. Cohen, E., Kaplan, H.: Spatially-decaying aggregation over a network. J. Comput. Syst. Sci. 73(3), 265–288 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  7. Cohen, E., Kaplan, H.: Summarizing data using bottom-k sketches. In: ACM PODC, pp. 225–234 (2007)

    Google Scholar 

  8. Cohen, E., Kaplan, H.: Tighter estimation using bottom k sketches. PVLDB 1(1), 213–224 (2008)

    Google Scholar 

  9. Crescenzi, P., Grossi, R., Imbrenda, C., Lanzi, L., Marino, A.: Finding the Diameter in Real-World Graphs: Experimentally Turning a Lower Bound into an Upper Bound. In: de Berg, M., Meyer, U. (eds.) ESA 2010. LNCS, vol. 6346, pp. 302–313. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  10. Eppstein, D., Wang, J.: Fast approximation of centrality. In: ACM/SIAM SODA, pp. 228–229 (2001)

    Google Scholar 

  11. Flajolet, P., Martin, G.N.: Probabilistic Counting Algorithms for Data Base Applications. Journal of Computer Systems Science 31(2), 182–209 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  12. Latapy, M., Magnien, C.: Measuring Fundamental Properties of Real-World Complex Networks. CoRR abs/cs/0609115 (2006)

    Google Scholar 

  13. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph Evolution: Densification and Shrinking Diameters. ACM Trans. Knowl. Discov. Data 1(1) (2007)

    Google Scholar 

  14. Lipton, R.J., Naughton, J.F.: Query size estimation by adaptive sampling. J. Comput. Syst. Sci. 51(1), 18–25 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  15. Lynch, N.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1996)

    MATH  Google Scholar 

  16. Mehlhorn, K., Meyer, U.: External-memory breadth-first search with sublinear I/O. In: Möhring, R.H., Raman, R. (eds.) ESA 2002. LNCS, vol. 2461, pp. 723–735. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  17. Palmer, C.R., Gibbons, P.B., Faloutsos, C.: ANF: a Fast and Scalable Tool for Data Mining in Massive Graphs. In: ACM SIGKDD, pp. 81–90 (2002)

    Google Scholar 

  18. Wang, L., Subramanian, S., Latifi, S., Srimani, P.: Distance Distribution of Nodes in Star Graphs. Applied Mathematics Letters 19(8), 780–784 (2006)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crescenzi, P., Grossi, R., Lanzi, L., Marino, A. (2011). A Comparison of Three Algorithms for Approximating the Distance Distribution in Real-World Graphs. In: Marchetti-Spaccamela, A., Segal, M. (eds) Theory and Practice of Algorithms in (Computer) Systems. TAPAS 2011. Lecture Notes in Computer Science, vol 6595. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19754-3_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-19754-3_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19753-6

  • Online ISBN: 978-3-642-19754-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics