Abstract
This paper evaluates an approach to determining a website’s reliability. The technique extracts workload measures and error codes from the server’s logs and uses this information to calculate the reliability of a particular website. The study follows on from a previous study and can therefore be regarded as a “partial replication” of the original. (Strictly, this description is inaccurate, because both studies are case studies rather than formal experiments; as no corresponding term exists for case studies, it is used here only to convey a general sense of purpose.) Although the method proposed by the original study is feasible, the effectiveness of using only a single error type and a single workload measure to estimate the reliability of websites is questionable. In this study, different error types and their usefulness for reliability analysis are examined and discussed. After a thorough investigation, we believe that reliability analysis for websites must be based on more specific error definitions, as these can provide a superior reliability estimate for today’s highly dynamic websites.
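As a rough illustration of the kind of log-based estimate described above, the following Python sketch counts hits and error responses in a server access log and derives a reliability figure. It is not the procedure used in either study. It assumes that the log is in Common Log Format, that workload is measured in hits, and that reliability is estimated by the simple ratio R = 1 - f/n (f failures over n workload units). The names `estimate_reliability`, `is_failure`, and `SAMPLE_LOG` are hypothetical, and the sample log lines are invented for the example.

```python
import re
from collections import Counter

# Assumed input format (Common Log Format):
# host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_status_codes(lines):
    """Yield the integer HTTP status code of each well-formed log line."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            yield int(match.group("status"))


def estimate_reliability(lines, is_failure=lambda code: code >= 400):
    """Return (hits, failures_by_code, reliability) for an iterable of log lines.

    Assumptions: workload is counted in hits, and reliability is
    estimated as R = 1 - f / n. The is_failure predicate is the
    "error definition": changing it changes the estimate.
    """
    hits = 0
    failures = Counter()
    for code in parse_status_codes(lines):
        hits += 1
        if is_failure(code):
            failures[code] += 1
    if hits == 0:
        raise ValueError("no parsable log lines")
    reliability = 1 - sum(failures.values()) / hits
    return hits, failures, reliability


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2007:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2007:13:55:40 -0700] "GET /missing.html HTTP/1.0" 404 209',
    '192.0.2.1 - - [10/Oct/2007:13:56:02 -0700] "POST /cart HTTP/1.0" 500 312',
    '192.0.2.3 - - [10/Oct/2007:13:56:10 -0700] "GET /about.html HTTP/1.0" 200 1020',
]

hits, failures, reliability = estimate_reliability(SAMPLE_LOG)
print(hits, dict(failures), reliability)
```

Passing a narrower predicate, for example `lambda code: code >= 500`, counts only server-side errors and yields a different estimate from the same log, which is the sensitivity to error definition that the abstract raises.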
Additional information
Responsible editor: Laurie Williams
About this article
Cite this article
Huynh, T., Miller, J. Another viewpoint on “evaluating web software reliability based on workload and failure data extracted from server logs”. Empir Software Eng 14, 371–396 (2009). https://doi.org/10.1007/s10664-008-9084-6
DOI: https://doi.org/10.1007/s10664-008-9084-6