Profit-aware overload protection in E-commerce Web sites

https://doi.org/10.1016/j.jnca.2008.02.020

Abstract

Overload protection is critical to E-commerce Web sites. This paper presents a profit-aware admission control mechanism for overload protection in E-commerce Web sites. Motivated by the observation [Measure Twice, Cut Once—Metrics For Online Retailers, 2006, (http://www.techexchange.com/thelibrary/online_retail_metrics.html)] that once a client has made an initial purchase, the buy-to-visit ratio of the client escalates from less than 1% to nearly 21%, the proposed mechanism keeps track of the purchase records of clients and uses them to make admission control decisions. We build two hash tables, keyed by full IP address and by network ID prefix, which maintain the purchase records of clients in fine-grained and coarse-grained manners, respectively. We classify clients who have made purchases before as premium customers and clients without prior purchase behavior as basic customers. Under overload conditions, our mechanism differentiates premium customers from basic customers based on the record hash tables, and admits premium customers with much higher probability than basic customers. By favoring premium customers, our mechanism maximizes the revenues of E-commerce Web sites. We evaluate the efficacy of the profit-aware mechanism using the industry-standard TPC-W workloads. Our experimental results demonstrate that under overload conditions, the profit-aware admission control mechanism not only achieves higher throughput and lower response time, but also dramatically increases the revenue received by E-commerce Web sites.

Introduction

E-commerce has been growing rapidly in recent years. More and more people now purchase items online. In 2006, online spending in the US increased 24% over 2005 (Online spending tops $100 billion, 2007), and online spending in China surged 47% over 2005 (China Is No. 2 Online, 2007). This rapid growth of E-commerce has imposed an ever-increasing workload on E-commerce Web sites, leading to a great demand for overload protection. An overloaded E-commerce Web site is swamped with Web requests that are well beyond the system capacity. Without proper protection, system throughput drops quickly and the response time of already-admitted requests increases dramatically to an unacceptable level. This in turn results in significant revenue loss to the overloaded E-commerce Web site. Recent studies have shown that 75% of visitors to a slow E-commerce site will never shop on that site again (E-Commerce facts posted by ZDNet Research, 2007). As a temporary solution, simple over-provisioning mitigates the negative effects of overload, but at very high cost. Moreover, simple over-provisioning cannot cope with flash crowds, the typical events that often overload Web sites (Arlitt and Jin, 2000, Jung et al., 2002).

As effective approaches to overload protection in E-commerce Web sites, several admission control mechanisms have been proposed and developed (Chen and Mohapatra, 2002, Cherkasova and Phaal, 2002, Elnikety et al., 2004). The Web traffic of E-commerce is session based, where a session is defined as a sequence of temporally and logically related requests originating from the same client. Session integrity requires that once a request is admitted for processing, all the following requests within the same session should be accepted. The importance of session integrity has been studied in Chen and Mohapatra (2002) and Cherkasova and Phaal (2002), and session-based admission control (SBAC) mechanisms have been proposed in Cherkasova and Phaal (2002). SBAC improves the revenue of an E-commerce Web site by honoring the completion of longer sessions, since longer sessions are typically the ones that result in purchases (Cherkasova and Phaal, 2002). However, the current session-based mechanisms (Chen and Mohapatra, 2002, Cherkasova and Phaal, 2002) focus only on the inter-request relationship within a session, and none of them attempts to use the inter-session purchase record of a client for making admission decisions. Recently, Totok and Karamcheti (2006) proposed a reward-driven request prioritization (RDRP) mechanism that gives higher execution priority to the requests whose sessions are likely to bring more profit. However, the focus of RDRP also lies in the inter-request structure rather than the inter-session relationship.
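The session integrity requirement described above can be illustrated with a minimal sketch. It is not the SBAC algorithm of Cherkasova and Phaal (2002); the class name, the method names, and the fixed `capacity` on concurrent sessions are all hypothetical and chosen only to show that the admission decision is made once, at the first request of a session.

```python
class SessionAdmission:
    """Toy admission controller that preserves session integrity.

    The capacity-based rule is an assumption made for illustration;
    the text above does not specify how overload is detected.
    """

    def __init__(self, capacity: int):
        self.capacity = capacity          # hypothetical limit on concurrent sessions
        self.active: set[str] = set()     # IDs of sessions already admitted

    def on_request(self, session_id: str) -> bool:
        # Requests of an already-admitted session are always accepted.
        if session_id in self.active:
            return True
        # A new session is admitted only while there is spare capacity.
        if len(self.active) < self.capacity:
            self.active.add(session_id)
            return True
        return False

    def on_session_end(self, session_id: str) -> None:
        # Free the slot once the session completes or times out.
        self.active.discard(session_id)
```

In this sketch a rejected client never gets a partially served session: the only point at which a request can be refused is the first request of a new session.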

In this paper, we present a profit-aware admission control mechanism for overload protection in E-commerce Web sites. The key feature of our mechanism is to keep track of inter-session purchase records of clients and use them for admitting new sessions. The primary e-metric we use is the buy-to-visit (B2V) ratio, which captures the critical customer behavior of an E-commerce Web site (Measure Twice, Cut Once—Metrics For Online Retailers, 2006). The study in Measure Twice, Cut Once—Metrics For Online Retailers (2006) shows that buyers without a prior purchase record, i.e., those who are making their first purchase, have a B2V ratio of less than 1%; however, once a client has made an initial purchase, the B2V ratio of the client escalates to nearly 21%, or approximately one purchase for every five visits. This more than 20-fold difference in B2V ratio motivates us to design a profit-aware admission control mechanism, which differentiates premium customers, who have made at least one purchase before, from basic customers, who have never made a purchase, and admits premium customers with much higher probability than basic customers under overload conditions.
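To make the effect of the B2V gap concrete, the following back-of-the-envelope sketch compares the expected number of purchasing sessions for two admission mixes. The two ratios are the figures quoted above (with 1% taken as the upper bound for basic customers); the session counts and the function name are hypothetical.

```python
# Illustrative arithmetic only; the session counts below are made up.
B2V_BASIC = 0.01     # upper bound quoted for clients without a prior purchase
B2V_PREMIUM = 0.21   # approximate ratio quoted for clients with a prior purchase


def expected_purchases(premium_sessions: int, basic_sessions: int) -> float:
    """Expected number of admitted sessions that end in a purchase."""
    return premium_sessions * B2V_PREMIUM + basic_sessions * B2V_BASIC


# Suppose the site can admit 100 sessions while overloaded.
mixed = expected_purchases(premium_sessions=20, basic_sessions=80)     # about 5
favoring = expected_purchases(premium_sessions=60, basic_sessions=40)  # about 13

print(f"mixed admission:    {mixed:.1f} expected purchases")
print(f"favoring premium:   {favoring:.1f} expected purchases")
```

Under these assumed numbers, shifting admitted capacity toward premium customers raises the expected number of purchases from about 5 to about 13 for the same number of admitted sessions.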

The major challenge in designing such a profit-aware admission control mechanism is how to classify customers in an efficient and reliable manner. If clients were always required to explicitly authenticate themselves by username and password before shopping online, or if persistent cookies (Fu et al., 2001) were ubiquitously used in client browsers, customers would be easily identified and a B2V ratio-based profit-aware admission control mechanism would be easy to implement. However, as we will discuss in Section 2, quite often neither password-based explicit authentication nor cookie-based implicit authentication is available, and we must rely on a more generic method to classify customers as a complement to these unreliable schemes.

Since each Internet host has an IP address, it is straightforward to use the IP address for customer classification. This approach is more efficient and generic for two reasons: (1) no user interaction or extra information (e.g., cookies) is needed; (2) a Web request cannot reach the server without an IP address. Although the use of Dynamic Host Configuration Protocol (DHCP) and Network Address Translation (NAT) may reduce accuracy in customer identification, the actual effect on the overall revenue of an E-commerce Web site is minor. We will further discuss the inaccuracy induced by DHCP and NAT in Section 2.

Using the IP address as a complementary but efficient customer classification method, we build two hash tables, keyed by full IP address and by network ID prefix, to maintain the purchase records of customers in fine-grained and coarse-grained ways, respectively. Under overload conditions, our profit-aware admission control mechanism gives much higher admission probabilities to premium customers than to basic customers. By favoring premium customers, we can maximize the revenue of an overloaded E-commerce Web site. Based on the industry-standard TPC-W workloads, we evaluate the performance of the proposed profit-aware mechanism in our testbed. To highlight the importance of admission control and to compare with current admission control mechanisms, we design and conduct four sets of experiments. The experimental results show that under overload conditions, the profit-aware mechanism not only achieves higher throughput and lower response time, but also dramatically increases the revenue received by E-commerce Web sites.
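The two-table idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in the paper: the /24 prefix length, the rule that a match in either table counts as premium, the admission probabilities, and all names (`PurchaseRecords`, `network_prefix`, `admit`) are hypothetical.

```python
import ipaddress
import random

PREFIX_LEN = 24  # assumption: the text does not state the network ID prefix length


def network_prefix(ip: str) -> str:
    """Return the network address of `ip` under the assumed prefix length."""
    net = ipaddress.ip_network(f"{ip}/{PREFIX_LEN}", strict=False)
    return str(net.network_address)


class PurchaseRecords:
    """Purchase counts kept at two granularities (Python dicts are hash tables)."""

    def __init__(self) -> None:
        self.by_ip: dict[str, int] = {}      # fine-grained: full IP address
        self.by_prefix: dict[str, int] = {}  # coarse-grained: network prefix

    def record_purchase(self, ip: str) -> None:
        self.by_ip[ip] = self.by_ip.get(ip, 0) + 1
        prefix = network_prefix(ip)
        self.by_prefix[prefix] = self.by_prefix.get(prefix, 0) + 1

    def is_premium(self, ip: str) -> bool:
        # Assumption: check the fine-grained table first, then fall back
        # to the coarse-grained one.
        return ip in self.by_ip or network_prefix(ip) in self.by_prefix


def admit(records: PurchaseRecords, ip: str, rng: random.Random,
          p_premium: float = 0.9, p_basic: float = 0.1) -> bool:
    """Probabilistic admission under overload; both probabilities are made up."""
    p = p_premium if records.is_premium(ip) else p_basic
    return rng.random() < p
```

The coarse-grained table is what lets a returning buyer whose address changed within the same network still be recognized, at the cost of occasionally treating a neighbor on that network as premium.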

The remainder of this paper is structured as follows. Section 2 discusses the issues related to customer classification. Section 3 details the proposed profit-aware admission control mechanism. Section 4 presents the experimental design and results based on the TPC-W workloads. Section 5 surveys related work. Finally, Section 6 concludes the paper.

Customer classification

In general, E-commerce Web sites rely on login and password authentication as their primary method to explicitly identify a customer, and use cookies stored in a customer's Web browser as their secondary method to implicitly identify a customer. However, both methods have their own limitations. On the one hand, the explicit customer authentication method may incur a high shopping cart abandonment rate. One of the top E-commerce strategies of reducing shopping cart abandonment suggests to make

Profit-aware admission control

The proposed profit-aware admission control mechanism consists of two major modules: the ID profiling module and the admission decision module. The ID profiling module is always turned on to record customer purchase behaviors. The admission decision module is in action only when the Web site is under an overload condition.
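The division of labor between the two modules can be sketched as below. This is only an outline under assumptions: the overload signal is passed in by the caller because the text above does not say how overload is detected, a plain set of IP addresses stands in for the purchase records, and the class name, method names, and probabilities are hypothetical.

```python
import random
from typing import Optional


class ProfitAwareController:
    """Toy controller with an always-on profiling path and a gated admission path."""

    def __init__(self, p_premium: float = 0.9, p_basic: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        self.buyers: set[str] = set()   # stand-in for the purchase records
        self.p_premium = p_premium      # hypothetical admission probabilities
        self.p_basic = p_basic
        self.rng = rng or random.Random()

    def on_purchase(self, client_ip: str) -> None:
        # ID profiling module: runs regardless of the current load.
        self.buyers.add(client_ip)

    def on_new_session(self, client_ip: str, overloaded: bool) -> bool:
        # Admission decision module: consulted only under overload.
        if not overloaded:
            return True
        p = self.p_premium if client_ip in self.buyers else self.p_basic
        return self.rng.random() < p
```

Keeping profiling active during normal load means the records are already populated by the time an overload begins, which is when the admission decision needs them.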

When a new session request arrives from a reliable IP address,

Performance evaluation

In this section, we first describe the experimental setup and the experimental design for evaluating the profit-aware admission control mechanism. Then, we present the experimental results based on the TPC-W workloads.

Related work

Web server workload characterization has been studied in Arlitt (2000), Menascé et al. (1999) and Vallamsetty et al. (2002). In Arlitt (2000), user session characteristics such as requests per session, session length, and inter-session times are thoroughly examined by analyzing the workload of the 1998 World Cup Web site. In Menascé et al. (1999), the Customer Behavior Model Graph (CBMG) is introduced to characterize customer navigational patterns as viewed from the server side. In Vallamsetty et al. (2002), the authors

Conclusions

E-commerce Web sites perform poorly during overload periods. Not only does system performance degrade significantly, but the payments received by E-commerce Web sites also drop rapidly. Existing SBAC mechanisms focus only on the inter-request relationship within a session, and none of them attempts to use the inter-session purchase record of a client for making admission decisions. Therefore, these session-based mechanisms fall short in protecting against revenue losses of E-commerce Web sites under overload

References (40)

  • Abdelzaher TF, et al. Performance guarantees for web server end-systems: a control-theoretical approach. IEEE Trans Parallel Distrib Syst 2002.
  • Arlitt M. Characterizing web user sessions. ACM SIGMETRICS Perform Eval Rev 2000.
  • Arlitt M, et al. Workload characterization of the 1998 world cup web site. IEEE Network 2000.
  • Blanquer JM, Batchelli A, Schauser K, Wolski R. Quorum: flexible quality of service for internet services. In:...
  • Brik V, Stroik J, Banerjee S. Debugging DHCP performance. In: Proceedings of the 4th ACM SIGCOMM IMC, October...
  • Carlstrom J, Rom R. Application-aware admission control and scheduling in web servers. In: Proceedings of the IEEE...
  • Casado M, Freedman MJ. Peering through the shroud: the effect of edge opacity on IP-based client identification. In:...
  • CERT Advisory CA-1996-21 TCP SYN Flooding and IP Spoofing Attacks 〈http://www.cert.org/advisories/CA-1996-21.html〉;...
  • Chen H, Mohapatra P. Session-based overload control in QoS-aware web servers. In: Proceedings of the IEEE INFOCOM’2002,...
  • Cherkasova L, et al. Session-based admission control: a mechanism for peak load management of commercial web sites. IEEE Trans Comput 2002.
  • China Is No. 2 Online, 2007 〈http://www.forbes.com/〉, January...
  • Corbato FJ. A paging experiment with the Multics system. MIT Project MAC Report MAC-M-384,...
  • DHCP Leases, Lease Length Policies and Management, 2006...
  • Elnikety S, Nahum E, Tracey J, Zwaenepoel W. A method for transparent admission control and request scheduling in...
  • E-Commerce facts posted by ZDNet Research, 2007...
  • Fu K, Sit E, Smith K, Feamster N. Dos and don’ts of client authentication on the web. In: Proceedings of the 10th...
  • Harchol-Balter M, et al. Size-based scheduling to improve web performance. ACM Trans Comput Syst 2003.
  • Heiss H-U, Wagner R. Adaptive load control in transaction processing systems. In: Proceedings of the 17th international...
  • Jin C, Wang H, Shin KG. Hop-count filtering: an effective defense against spoofed DDoS traffic. In: Proceedings of the...
  • Jetty Java HTTP Servlet Server, 2006...