ABSTRACT
One role of workload generation is to help understand how servers and networks respond to variation in load. That understanding enables management and capacity planning based on current and projected usage. This paper applies a number of observations of Web server usage to create a realistic Web workload generation tool that mimics a set of real users accessing a server. The tool, called Surge (Scalable URL Reference Generator), generates references matching empirical measurements of 1) server file size distribution; 2) request size distribution; 3) relative file popularity; 4) embedded file references; 5) temporal locality of reference; and 6) idle periods of individual users. This paper reviews the essential elements required to generate a representative Web workload. It also addresses the technical challenges of satisfying this large set of simultaneous constraints on the properties of the reference stream, the solutions we adopted, and their associated accuracy. Finally, we present evidence that Surge exercises servers in a manner significantly different from other Web server benchmarks.
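To make the idea of a constrained reference stream concrete, the following minimal sketch draws a short request sequence that respects three of the six properties listed above: file size distribution, relative file popularity, and per-user idle periods. It is not Surge's implementation. The distribution families (Pareto for sizes and idle times, a Zipf-like 1/rank law for popularity), every parameter value, and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def zipf_cdf(n_files):
    """Cumulative popularity over ranks 1..n_files, weight proportional to 1/rank."""
    weights = [1.0 / rank for rank in range(1, n_files + 1)]
    total = sum(weights)
    return [c / total for c in accumulate(weights)]


def pareto_sample(rng, scale, shape):
    """Inverse-transform sample from a Pareto distribution (result >= scale)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return scale / (u ** (1.0 / shape))


def generate_requests(n_files=1000, n_requests=10, seed=0):
    """Return a list of (file_index, file_size_bytes, idle_seconds) tuples."""
    rng = random.Random(seed)
    # Assumed heavy-tailed file sizes: at least 1000 bytes, shape 1.2.
    sizes = [int(pareto_sample(rng, 1000.0, 1.2)) for _ in range(n_files)]
    cdf = zipf_cdf(n_files)
    requests = []
    for _ in range(n_requests):
        # Pick a file by popularity rank; index 0 is the most popular file.
        idx = min(bisect_left(cdf, rng.random()), n_files - 1)
        # Assumed heavy-tailed idle period before the next request.
        idle = pareto_sample(rng, 1.0, 1.5)
        requests.append((idx, sizes[idx], idle))
    return requests


if __name__ == "__main__":
    for file_index, size, idle in generate_requests(n_requests=5):
        print(f"file={file_index} size={size}B idle={idle:.2f}s")
```

The sketch samples each property independently, which is the easy part. Matching request size distribution, embedded references, and temporal locality at the same time is where the simultaneous constraints described in the abstract become difficult.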
Index Terms
- Generating representative Web workloads for network and server performance evaluation