Another viewpoint on “evaluating web software reliability based on workload and failure data extracted from server logs”

Empirical Software Engineering

Abstract

This paper evaluates an approach to determining a website’s reliability. The technique extracts workload measures and error codes from the server’s logs and uses this information to calculate the reliability of a particular website. The study follows on from a previous study and hence can be regarded as a “partial replication” of it. (Technically, this description is inaccurate, because both studies are case studies rather than formal experiments; since no corresponding definition exists for case studies, the term is used to convey a general sense of purpose.) Although the method proposed by the original study is feasible, the effectiveness of using only one specific error type and one specific workload measure to estimate website reliability is questionable. This study examines and discusses different error types and their usefulness for reliability analysis. After a thorough investigation, we believe that reliability analysis for websites must be based on more specific error definitions, as they can provide a superior reliability estimate for today’s highly dynamic websites.
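The general idea of log-based reliability estimation can be illustrated with a minimal sketch. It is not the procedure used in either study: the abstract does not specify the log format, the workload measure, or the formula. The sketch assumes Common Log Format entries, uses the number of hits as the workload measure, and applies a Nelson-style estimate R = 1 - f/n, where f is the number of failed requests and n is the total number of requests. The names `LOG_LINE`, `summarize`, `reliability`, and the `SAMPLE` entries are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Assumption: Common Log Format, i.e.
#   host ident user [timestamp] "request line" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\S+)')


def summarize(lines):
    """Return (hits, bytes transferred, Counter of responses per status code)."""
    hits = 0
    total_bytes = 0
    by_status = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        _host, status, size = match.groups()
        hits += 1
        by_status[int(status)] += 1
        if size.isdigit():  # a "-" in the size field means no body was sent
            total_bytes += int(size)
    return hits, total_bytes, by_status


def reliability(hits, by_status, error_codes):
    """Nelson-style estimate R = 1 - f/n, with hits as the workload n."""
    if hits == 0:
        raise ValueError("no workload recorded")
    failures = sum(by_status[code] for code in error_codes)
    return 1.0 - failures / hits


# Hypothetical log entries, invented for this example only.
SAMPLE = [
    '10.0.0.1 - - [18/May/2008:10:00:00 -0600] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.2 - - [18/May/2008:10:00:05 -0600] "GET /missing.html HTTP/1.1" 404 312',
    '10.0.0.1 - - [18/May/2008:10:00:09 -0600] "GET /cart.php HTTP/1.1" 500 -',
    '10.0.0.3 - - [18/May/2008:10:00:12 -0600] "GET /about.html HTTP/1.1" 200 2048',
]

hits, total_bytes, by_status = summarize(SAMPLE)
only_404 = reliability(hits, by_status, {404})
all_errors = reliability(hits, by_status, {c for c in by_status if c >= 400})
print(f"hits={hits} bytes={total_bytes} R(404 only)={only_404} R(all 4xx/5xx)={all_errors}")
```

On this invented sample, counting only 404 responses as failures gives a higher estimate than counting every 4xx and 5xx response, which shows why the choice of error definition matters for the resulting reliability figure.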

Author information

Corresponding author

Correspondence to James Miller.

Additional information

Responsible editor: Laurie Williams

About this article

Cite this article

Huynh, T., Miller, J. Another viewpoint on “evaluating web software reliability based on workload and failure data extracted from server logs”. Empir Software Eng 14, 371–396 (2009). https://doi.org/10.1007/s10664-008-9084-6

  • DOI: https://doi.org/10.1007/s10664-008-9084-6
