Abstract
Previous work has shown that delta encoding and shared dictionary compression can significantly reduce the payload of websites. However, choosing a good dictionary or delta source remains a challenge and has kept delta encoding from becoming practically relevant for today’s web. In this work, we demonstrate that the often prohibitive cost of dictionary generation in earlier approaches can be avoided by simply using cache entries for content encoding: we divide web pages into page types and use one actual page of each type as a dictionary to encode other pages of the same type. In an experimental evaluation, we show that our approach outperforms current industry standards by a factor of 5 in terms of compression ratio. We discuss optimization and content normalization strategies as well as application scenarios that are possible with our approach but infeasible with the current state of the art.
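The core idea can be illustrated with a minimal sketch: a page that is already in the client cache serves as a preset dictionary for another page of the same type. The sketch assumes zlib's preset-dictionary support as a stand-in because it ships with the Python standard library; it is not necessarily the codec evaluated in the paper, and Deflate only references the last 32 KB of the dictionary. The URL patterns, the `render_product` template, and all function names are hypothetical and chosen for illustration only.

```python
import re
import zlib
from typing import Optional

# Hypothetical URL patterns; in practice the website owner defines the page types.
PAGE_TYPES = {
    "product": re.compile(r"^/product/\d+$"),
    "category": re.compile(r"^/category/[\w-]+$"),
}


def page_type(path: str) -> Optional[str]:
    """Return the first page type whose pattern matches the URL path."""
    for name, pattern in PAGE_TYPES.items():
        if pattern.match(path):
            return name
    return None


def compress(body: bytes, dictionary: bytes = b"") -> bytes:
    """Deflate `body`, optionally primed with a cached page as preset dictionary."""
    if dictionary:
        comp = zlib.compressobj(level=9, zdict=dictionary)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(body) + comp.flush()


def decompress(payload: bytes, dictionary: bytes = b"") -> bytes:
    """Inverse of `compress`; the client must hold the same dictionary page."""
    if dictionary:
        decomp = zlib.decompressobj(zdict=dictionary)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(payload) + decomp.flush()


def render_product(name: str, price: str) -> bytes:
    """Hypothetical product template: pages of one type share most of their markup."""
    html = (
        "<!doctype html><html lang='en'><head><meta charset='utf-8'>"
        "<title>" + name + " | Example Shop</title>"
        "<link rel='stylesheet' href='/assets/shop.css'></head><body>"
        "<nav><a href='/'>Home</a><a href='/category/kitchen'>Kitchen</a>"
        "<a href='/cart'>Cart</a><a href='/account'>Account</a></nav>"
        "<main><h1>" + name + "</h1><p class='price'>EUR " + price + "</p>"
        "<button id='add-to-cart'>Add to cart</button></main>"
        "<footer>Free shipping on orders over EUR 50. Imprint. Privacy.</footer>"
        "</body></html>"
    )
    return html.encode("utf-8")


# One product page is already cached; another page of the same type is requested.
cached_page = render_product("Blue Kettle", "24.99")
requested_page = render_product("Red Toaster", "39.50")

plain = compress(requested_page)
with_dict = compress(requested_page, cached_page)
print(len(requested_page), len(plain), len(with_dict))
```

Because both pages share their template, most of the requested page can be expressed as back-references into the cached page, so the dictionary-based payload is smaller than the plain one. The client can only decode it if it still holds the exact same cache entry, which is why the dictionary choice is tied to the page type here.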
Notes
- 1. Content delivery networks (CDNs) accelerate content delivery by caching resources that are requested by multiple clients [6]. This is not possible for deltas that are computed for individual users.
- 5. The categorization into these page types is implemented by the website owner, e.g. by URL regex.
- 8. We only use pages within the same website as a dictionary, since browsers would prohibit sharing content across different domains.
- 9. A journey describes multiple consecutive page visits of one user.
References
Alakuijala, J., et al.: Brotli: a general-purpose data compressor. ACM TOIS 37(1), 1–30 (2018)
Collet, Y., Kucherawy, M. (ed.): Zstandard Compression and the 'application/zstd' Media Type. RFC 8878, February 2021
Korn, D., MacDonald, J., Mogul, J., Vo, K.: The VCDIFF Generic Differencing and Compression Data Format. RFC 3284, June 2002
McQuade, B., Mixter, K., Lee, W.H., Butler, J.: A proposal for shared dictionary compression over HTTP (2016)
Mogul, J., et al.: Delta Encoding in HTTP. RFC 3229, January 2002
Pathan, M., Buyya, R.: A Taxonomy of CDNs, pp. 33–77. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-77887-5_2
Shapira, O.: SDCH at LinkedIn (2015). https://engineering.linkedin.com/shared-dictionary-compression-http-linkedin. Accessed Mar 2023
Sleevi, R.: Intent to Unship: SDCH (2016). https://groups.google.com/a/chromium.org/d/msg/blink-dev/nQl0ORHy7sw/HNpR96sqAgAJ. Accessed Mar 2023
Weiss, Y., Meenan, P.: Compression dictionary transport (2023). https://github.com/WICG/compression-dictionary-transport. Accessed Mar 2023
Wingerath, W., et al.: Speed Kit: A Polyglot & GDPR-Compliant Approach For Caching Personalized Content. In: ICDE, Dallas, Texas (2020)
Wingerath, W., et al.: Beaconnect: continuous web performance A/B-testing at scale. In: Proceedings of the 48th International Conference on Very Large Data Bases (2022)
Wollmer, B., Wingerath, W., Ferrlein, S., Panse, F., Gessert, F., Ritter, N.: The case for cross-entity delta encoding in web compression. In: Proceedings of the 22nd International Conference on Web Engineering (ICWE) (2022)
Wollmer, B., Wingerath, W., Ferrlein, S., Panse, F., Gessert, F., Ritter, N.: The case for cross-entity delta encoding in web compression (extended). J. Web Eng. 22(01), 131–146 (2023)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wollmer, B. et al. (2023). Using the Client Cache for Content Encoding: Shared Dictionary Compression for the Web. In: Aiello, M., Barzen, J., Dustdar, S., Leymann, F. (eds) Service-Oriented Computing. SummerSOC 2023. Communications in Computer and Information Science, vol 1847. Springer, Cham. https://doi.org/10.1007/978-3-031-45728-9_3
DOI: https://doi.org/10.1007/978-3-031-45728-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45727-2
Online ISBN: 978-3-031-45728-9
eBook Packages: Computer Science, Computer Science (R0)