Using the Client Cache for Content Encoding: Shared Dictionary Compression for the Web

  • Conference paper
Service-Oriented Computing (SummerSOC 2023)

Abstract

As different approaches have demonstrated in the past, delta encoding and shared dictionary compression can significantly reduce the payload of websites. However, choosing a good dictionary or delta source is still a challenge and has kept delta encoding from becoming practically relevant for today’s web. In this work, we demonstrate that the often prohibitive costs of dictionary generation exhibited by earlier approaches can be avoided by simply using cache entries for content encoding: We divide web pages into different page types and use one actual page of every type as a dictionary to encode pages of the same type. In an experimental evaluation, we show that our approach outperforms current industry standards by a factor of 5 in terms of compression ratio. We discuss optimization and content normalization strategies as well as application scenarios that are possible with our approach, but infeasible with the current state of the art.
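The core idea, encoding a page against a cached page of the same type, can be illustrated with a minimal sketch. It assumes Deflate with a preset dictionary (Python's `zlib` and its `zdict` parameter) as a stand-in, since the abstract does not name the codec used in the evaluation. The template, page contents, and function names are hypothetical. Note that Deflate only looks back 32 KiB, so this stand-in would benefit little from larger dictionaries.

```python
import zlib

# Hypothetical template shared by all pages of one page type ("product").
TEMPLATE = (
    "<!doctype html><html><head><title>{name} | Example Shop</title>"
    '<link rel="stylesheet" href="/assets/shop.css"></head><body>'
    '<nav><a href="/">Home</a><a href="/category/kitchen">Kitchen</a>'
    '<a href="/cart">Cart</a><a href="/account">Account</a></nav>'
    '<main><h1 class="product-title">{name}</h1>'
    '<p class="product-price">Price: {price} EUR</p>'
    '<button class="add-to-cart" data-action="add">Add to cart</button>'
    "</main><footer>Example Shop, free returns within 30 days</footer>"
    "</body></html>"
)


def compress_with_dictionary(page: bytes, dictionary: bytes) -> bytes:
    """Compress `page`, using `dictionary` as a Deflate preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(page) + compressor.flush()


def decompress_with_dictionary(payload: bytes, dictionary: bytes) -> bytes:
    """Reverse compress_with_dictionary; needs the identical dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(payload) + decompressor.flush()


# The page already in the client cache acts as the dictionary.
cached_page = TEMPLATE.format(name="Red Mug", price="9.99").encode("utf-8")
# A different page of the same type is requested next.
requested_page = TEMPLATE.format(name="Blue Teapot", price="24.50").encode("utf-8")

baseline = zlib.compress(requested_page, 9)
encoded = compress_with_dictionary(requested_page, cached_page)

print("raw bytes:            ", len(requested_page))
print("deflate, no dictionary:", len(baseline))
print("deflate, cached page:  ", len(encoded))
```

Both sides must hold a byte-identical copy of the dictionary page, which is why a cache entry is a natural choice: the client already has it, and the server only needs to know which entry that is.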

Notes

  1. Content delivery networks (CDNs) accelerate content delivery by caching resources that are requested by multiple clients [6]. This is not possible for deltas that are computed for individual users.

  2. https://github.com/google/open-vcdiff.

  3. https://github.com/gtoubassi/femtozip.

  4. https://analytics.google.com/analytics/web/provision.

  5. The categorization into page types is implemented by the website owner, e.g. by URL regex.

  6. https://handlebarsjs.com/.

  7. https://twig.symfony.com/.

  8. We only use pages within the same website as a dictionary, since browsers would prohibit sharing content across different domains.

  9. A journey describes multiple consecutive page visits of one user.

  10. https://www.similarweb.com/top-websites/e-commerce-and-shopping/.
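Note 5 states that page types are assigned by the website owner, e.g. by URL regex. A minimal sketch of such a categorization follows. The patterns, type names, and the `page_type` helper are hypothetical and only illustrate the idea; a real site would define its own rules.

```python
import re
from typing import Optional

# Hypothetical URL patterns that a website owner might configure.
# The first matching entry wins, so more specific rules should come first.
PAGE_TYPE_PATTERNS = [
    ("home", re.compile(r"^/$")),
    ("product", re.compile(r"^/products?/[^/]+/?$")),
    ("category", re.compile(r"^/category/[^/]+/?$")),
]


def page_type(path: str) -> Optional[str]:
    """Return the page type for a URL path, or None if no rule matches."""
    for name, pattern in PAGE_TYPE_PATTERNS:
        if pattern.match(path):
            return name
    return None


for path in ["/", "/product/red-mug", "/category/kitchen", "/about"]:
    print(path, "->", page_type(path))
```

A page whose type is `None` has no dictionary candidate in this sketch and would fall back to ordinary compression.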

References

  1. Alakuijala, J., et al.: Brotli: a general-purpose data compressor. ACM TOIS 37(1), 1–30 (2018)

  2. Collet, Y., Kucherawy, M. (ed.): Zstandard Compression and the 'application/zstd' Media Type. RFC 8878, February 2021

  3. Korn, D., MacDonald, J., Mogul, J., Vo, K.: The VCDIFF Generic Differencing and Compression Data Format. RFC 3284, June 2002

  4. McQuade, B., Mixter, K., Lee, W.H., Butler, J.: A proposal for shared dictionary compression over HTTP (2016)

  5. Mogul, J., et al.: Delta Encoding in HTTP. RFC 3229, January 2002

  6. Pathan, M., Buyya, R.: A Taxonomy of CDNs, pp. 33–77. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-77887-5_2

  7. Shapira, O.: SDCH at LinkedIn (2015). https://engineering.linkedin.com/shared-dictionary-compression-http-linkedin. Accessed Mar 2023

  8. Sleevi, R.: Intent to Unship: SDCH (2016). https://groups.google.com/a/chromium.org/d/msg/blink-dev/nQl0ORHy7sw/HNpR96sqAgAJ. Accessed Mar 2023

  9. Weiss, Y., Meenan, P.: Compression dictionary transport (2023). https://github.com/WICG/compression-dictionary-transport. Accessed Mar 2023

  10. Wingerath, W., et al.: Speed Kit: A Polyglot & GDPR-Compliant Approach For Caching Personalized Content. In: ICDE, Dallas, Texas (2020)

  11. Wingerath, W., et al.: Beaconnect: continuous web performance A/B-testing at scale. In: Proceedings of the 48th International Conference on Very Large Data Bases (2022)

  12. Wollmer, B., Wingerath, W., Ferrlein, S., Panse, F., Gessert, F., Ritter, N.: The case for cross-entity delta encoding in web compression. In: Proceedings of the 22nd International Conference on Web Engineering (ICWE) (2022)

  13. Wollmer, B., Wingerath, W., Ferrlein, S., Panse, F., Gessert, F., Ritter, N.: The case for cross-entity delta encoding in web compression (extended). J. Web Eng. 22(01), 131–146 (2023)

Author information

Corresponding author

Correspondence to Benjamin Wollmer.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wollmer, B. et al. (2023). Using the Client Cache for Content Encoding: Shared Dictionary Compression for the Web. In: Aiello, M., Barzen, J., Dustdar, S., Leymann, F. (eds) Service-Oriented Computing. SummerSOC 2023. Communications in Computer and Information Science, vol 1847. Springer, Cham. https://doi.org/10.1007/978-3-031-45728-9_3

  • DOI: https://doi.org/10.1007/978-3-031-45728-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-45727-2

  • Online ISBN: 978-3-031-45728-9

  • eBook Packages: Computer Science, Computer Science (R0)
