Merging Adjacency Lists for Efficient Web Graph Compression

  • Conference paper

Part of the book series: Advances in Intelligent and Soft Computing ((AINSC,volume 103))

Abstract

Analysing Web graphs is hampered by the need to keep a major part of a huge graph in external memory, which prevents efficient random access to edge (hyperlink) lists. A number of algorithms based on compression techniques have therefore been proposed to represent Web graphs succinctly while still providing random access. Our algorithm belongs to this category. It works on contiguous blocks of adjacency lists, and its key mechanism is merging each block into a single ordered list. The method achieves compression ratios much better than those of most methods known from the literature, at fairly competitive access times.
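The merging idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the merged list records which original list each element came from, so the per-element bitmask used here is an assumption made for illustration, as are the function names `merge_block`, `extract_list` and `to_gaps`. The gap transform at the end is a common preprocessing step for sorted integer lists and is likewise not claimed to be the authors' exact encoding.

```python
from typing import Dict, List, Tuple


def merge_block(block: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Merge a block of adjacency lists into one ordered list.

    Returns (merged, masks). Bit i of masks[k] is set when merged[k]
    occurs in block[i]. The bitmask layout is an illustrative assumption.
    """
    membership: Dict[int, int] = {}
    for i, adjacency in enumerate(block):
        for target in adjacency:
            membership[target] = membership.get(target, 0) | (1 << i)
    merged = sorted(membership)
    masks = [membership[target] for target in merged]
    return merged, masks


def extract_list(merged: List[int], masks: List[int], i: int) -> List[int]:
    """Recover the i-th adjacency list of the block from the merged form."""
    return [target for target, mask in zip(merged, masks) if mask & (1 << i)]


def to_gaps(merged: List[int]) -> List[int]:
    """Replace each value by its difference from the previous one."""
    gaps = []
    previous = 0
    for target in merged:
        gaps.append(target - previous)
        previous = target
    return gaps


# A block of three adjacency lists that share several targets.
block = [[1, 3, 7], [1, 4, 7], [3, 4, 9]]
merged, masks = merge_block(block)
gaps = to_gaps(merged)
restored = [extract_list(merged, masks, i) for i in range(len(block))]
```

Under these assumptions, a target shared by several lists in the block is stored once in `merged`, and random access to a single list means decoding only its block and filtering by the corresponding bit.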

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grabowski, S., Bieniecki, W. (2011). Merging Adjacency Lists for Efficient Web Graph Compression. In: Czachórski, T., Kozielski, S., Stańczyk, U. (eds) Man-Machine Interactions 2. Advances in Intelligent and Soft Computing, vol 103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23169-8_42

  • DOI: https://doi.org/10.1007/978-3-642-23169-8_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23168-1

  • Online ISBN: 978-3-642-23169-8

  • eBook Packages: Engineering, Engineering (R0)
