Empirical Characterization of Session–Based Workload and Reliability for Web Servers

  • Special Issue Paper
Empirical Software Engineering

Abstract

The growing availability of Internet access has led to a significant increase in the use of the World Wide Web. To design dependable Web–based systems that cope with a growing number of clients and a highly variable workload, we need to describe the Web workload and errors accurately. In this paper we present a detailed empirical analysis of session–based workload and reliability, using data extracted from the actual Web logs of eleven Web servers. First, we introduce and rigorously analyze several intra–session and inter–session metrics that collectively describe the Web workload in terms of user sessions. Then, we analyze Web error characteristics and estimate the request–based and session–based reliability of the Web servers. Finally, we identify the invariants of the Web workload and reliability that hold across all data sets considered. The results presented in this paper show that session–based workload and reliability are better indicators of the users' perception of Web quality than the request–based metrics.
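The difference between request–based and session–based reliability can be illustrated with a minimal Python sketch. It is not the authors' method: the 30-minute inactivity threshold, the treatment of any status code of 400 or above as an error, and the function names `split_sessions`, `request_reliability`, and `session_reliability` are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed inactivity threshold in seconds; the abstract does not state one.
SESSION_TIMEOUT = 30 * 60


def split_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (client, timestamp, status) tuples into per-client sessions.

    A new session starts when the gap between two consecutive requests
    from the same client exceeds `timeout` seconds.
    """
    by_client = defaultdict(list)
    for client, ts, status in requests:
        by_client[client].append((ts, status))

    sessions = []
    for entries in by_client.values():
        entries.sort()
        current = [entries[0]]
        for prev, cur in zip(entries, entries[1:]):
            if cur[0] - prev[0] > timeout:
                sessions.append(current)
                current = []
            current.append(cur)
        sessions.append(current)
    return sessions


def is_error(status):
    # Assumption: 4xx and 5xx responses count as errors.
    return status >= 400


def request_reliability(requests):
    """Fraction of requests that completed without an error."""
    failed = sum(1 for _, _, status in requests if is_error(status))
    return 1 - failed / len(requests)


def session_reliability(sessions):
    """Fraction of sessions in which no request ended in an error."""
    failed = sum(
        1 for session in sessions
        if any(is_error(status) for _, status in session)
    )
    return 1 - failed / len(sessions)


# Hypothetical log entries: (client, timestamp in seconds, HTTP status).
log = [("a", 0, 200), ("a", 10, 404), ("a", 5000, 200), ("b", 0, 200)]
sessions = split_sessions(log)
```

In this toy log one of four requests fails, but that single failure spoils one of three sessions, so the session–based figure is lower than the request–based one. This is the kind of gap that motivates measuring reliability per session.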


Author information

Corresponding author

Correspondence to Katerina Goševa-Popstojanova.

About this article

Cite this article

Goševa-Popstojanova, K., Singh, A.D., Mazumdar, S. et al. Empirical Characterization of Session–Based Workload and Reliability for Web Servers. Empir Software Eng 11, 71–117 (2006). https://doi.org/10.1007/s10664-006-5966-7

  • DOI: https://doi.org/10.1007/s10664-006-5966-7
