Abstract
A wide range of techniques have been proposed, implemented, and even standardized for improving the performance of Web content delivery. However, previous work has found that many Web sites either do not take advantage of such techniques or unknowingly inhibit their use. In this paper, we present the design of a tool called Cassandra that addresses these problems. Web site developers can use Cassandra to achieve three goals: (i) to identify protocol correctness and conformance problems; (ii) to identify content delivery performance problems; and (iii) to evaluate the potential benefits of using content delivery optimizations. Cassandra combines performance and behavioral data, together with an extensible simulation architecture, to identify content delivery problems and predict optimization benefits. We describe the architecture of Cassandra and demonstrate its use to evaluate the potential benefits of a CDN on a large Web server farm.
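As an illustration of the kind of content delivery problem the abstract refers to, the following is a minimal sketch of a check that flags HTTP response headers likely to inhibit caching. It is not Cassandra's implementation, which the abstract does not detail; the function name `find_delivery_issues`, the three rules, and the warning strings are assumptions chosen for this example.

```python
from typing import Dict, List


def find_delivery_issues(headers: Dict[str, str]) -> List[str]:
    """Return warnings for one HTTP response's headers (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # Split Cache-Control into its comma-separated directives.
    directives = {
        d.strip().lower()
        for d in h.get("cache-control", "").split(",")
        if d.strip()
    }

    issues: List[str] = []

    # Assumed rule 1: directives that forbid storing, or reuse without revalidation.
    if "no-store" in directives or "no-cache" in directives:
        issues.append("Cache-Control forbids storing or reuse without revalidation")

    # Assumed rule 2: no explicit freshness lifetime, so caches must guess.
    has_freshness = "expires" in h or any(
        d.startswith("max-age") or d.startswith("s-maxage") for d in directives
    )
    if not has_freshness:
        issues.append("no explicit freshness lifetime (Expires or max-age)")

    # Assumed rule 3: no validator, so conditional requests are not possible.
    if "last-modified" not in h and "etag" not in h:
        issues.append("no validator (Last-Modified or ETag)")

    return issues


if __name__ == "__main__":
    sample = {"Content-Type": "text/html", "Cache-Control": "no-store"}
    for warning in find_delivery_issues(sample):
        print(warning)
```

A real tool would apply such checks to captured traffic rather than to a hand-built dictionary, and would need many more rules than the three assumed here.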
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bent, L., Rabinovich, M., Voelker, G.M., Xiao, Z. (2004). Towards Informed Web Content Delivery. In: Chi, CH., van Steen, M., Wills, C. (eds) Web Content Caching and Distribution. WCW 2004. Lecture Notes in Computer Science, vol 3293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30471-5_18
DOI: https://doi.org/10.1007/978-3-540-30471-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23516-3
Online ISBN: 978-3-540-30471-5
eBook Packages: Springer Book Archive