Weblins: A scalable WWW cluster-based server
Introduction
Clusters of workstations are becoming an increasingly popular hardware platform for cost-effective high performance network servers. Typically, a cluster-based web server consists of a front-end server, responsible for request distribution, and a number of back-end servers, responsible for request processing. Back-end servers can handle several incoming requests concurrently. However, to ensure high performance, cluster-based web servers should satisfy two requirements: load balancing and a high cache hit rate. Load balancing solutions can be classified as DNS-based approaches and IP/TCP/HTTP redirection-based approaches. The second approach employs a specialized front-end node and a load balancer, which traditionally determines the least loaded server to which the packet has to be sent [12], [13]. Previous request distribution methods, such as round robin, have focused mainly on load balancing to maximize the utilization of the cluster. When the front-end server distributes incoming requests in this manner, each back-end server is likely to cache an identical set of data in its main memory. If the size of the working set exceeds the size of the main memory cache, back-end servers tend to suffer from expensive disk I/O.
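The cache-duplication effect of content-blind round-robin distribution can be illustrated with a small sketch (the file names and server count are purely illustrative): because the dispatcher ignores content, every back-end server eventually touches, and therefore caches, the whole working set.

```python
from itertools import cycle

# Illustrative sketch: round-robin dispatch makes every back-end server
# cache (roughly) the entire working set, since any server may receive
# a request for any document.
requests = ["a.html", "b.html", "c.html", "d.html"] * 3   # working set of 4 docs
servers = {0: set(), 1: set(), 2: set()}                   # per-server caches
dispatch = cycle(servers)                                  # round-robin over servers

for doc in requests:
    servers[next(dispatch)].add(doc)   # each server caches what it serves

# Each of the 3 caches now holds all 4 documents: 3x duplication.
```

If the 4 documents together exceed one node's main memory, every node misses in memory and falls back to disk, which is the problem the content-aware policies below address.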
Recent work has focused on the content or type of requests sent to servers [8], [9], [10], [11]. These request distribution methods aim to achieve both load balancing and a high cache hit rate. While taking load balancing into account, these methods attempt to dispatch the same kinds of requests to only one back-end server in order to reduce the number of disk accesses. Three main components comprise a cluster configuration with a content-aware request distribution strategy: (a) a dispatcher, which specifies which web server will process a given request, (b) a distributor, which interfaces with the client and implements the mechanism that distributes client requests to a specific web server, and (c) a web server, which processes HTTP requests. In order to distribute requests based on the requested content, the distributor must implement a mechanism such as TCP handoff [9] or TCP splicing [7].
There are three typical cluster configurations: with single front-end distributor, co-located distributor and server, and co-located dispatcher, distributor and server [11]. Under a load balancing policy in a k-node cluster, each node will statistically serve only 1/k of the incoming requests locally and will forward (k−1)/k of the requests to the other nodes using the TCP handoff mechanism. TCP handoff is an expensive operation. This could lead to a significant forwarding overhead, decreasing the potential performance benefits of the proposed solution [14]. On the other hand, when a back-end server becomes overloaded, it moves some contents to an under-loaded back-end server. That is, some incoming requests are re-assigned to another less busy back-end server. Then, the re-assigned back-end server will process the subsequent requests. Locality-aware request distribution (LARD) [9] has been proposed as one framework of the content-based strategy. Other research has focused on the high cache hit rate, where cooperative caching [5], [6] may be applied to cluster-based web servers. Cooperative caching treats all back-end server memory systems as a large file cache. When a back-end server misses some data, it first searches the main memory cache of other servers before it accesses the corresponding hard disk. One problem is that most of existing cooperating caching algorithms have been developed to provide remote users with file sharing in traditional Unix network file systems and do not consider the specific characteristics of cluster web servers.
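The LARD re-assignment idea described above can be sketched as follows. This is a simplified reading of the basic LARD algorithm from [9], not Weblins's policy; the threshold values and class names are illustrative assumptions.

```python
# Hedged sketch of a LARD-like dispatcher [9]: a target's first request goes
# to the least loaded node; later requests stick to that node unless it is
# overloaded while another node is idle. T_LOW / T_HIGH are illustrative
# thresholds (active connections), not values from the paper.
T_LOW, T_HIGH = 25, 65

class LardDispatcher:
    def __init__(self, n_servers):
        self.load = [0] * n_servers   # active connections per back-end server
        self.assignment = {}          # target URL -> back-end server index

    def least_loaded(self):
        return min(range(len(self.load)), key=lambda s: self.load[s])

    def dispatch(self, target):
        server = self.assignment.get(target)
        if server is None:
            server = self.least_loaded()      # first request for this target
        elif (self.load[server] > T_HIGH and
              self.load[self.least_loaded()] < T_LOW):
            server = self.least_loaded()      # re-assign away from overloaded node
        self.assignment[target] = server
        self.load[server] += 1                # connection stays open on `server`
        return server
```

The sticky target-to-server mapping is what yields cache locality; the re-assignment branch is what preserves load balance.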
In this paper, we present a cluster-based web server system, called Weblins, which exploits the single system image properties of the Gobelins cluster operating system [4]. Gobelins provides global management of all resources. Higher-level operating system services such as a distributed shared memory system, a distributed file system, and a cooperative file cache can be easily implemented in Gobelins. It has primarily been designed to support the execution of high performance parallel applications. However, its mechanisms are also suitable for the implementation of efficient cluster web servers, even if resource management policies designed for parallel applications are not adequate for data server applications. Weblins integrates new memory and file management policies to efficiently support the execution of a web server. Weblins incorporates a mixed strategy that combines a content-aware policy and a cooperative caching strategy. Our main goal is to minimize the overhead of TCP handoff on one hand, and on the other hand to take advantage of cooperative caching to obtain a high cache hit rate. Our specific contributions in this paper include: (a) adapting the traditional greedy dual-size frequency (GDSF) replacement algorithm to suit cluster environments, (b) developing a new request distribution policy for obtaining load balancing and a high cache hit rate, (c) developing a new distribution algorithm for web documents across the disks of all nodes to uniformly distribute the load, and (d) comparing the performance of Weblins with that of cooperative caching and content-aware servers for serving static Web content.
This paper is organized as follows. Section 2 presents the overall architecture of Weblins. Sections 3, 4, 5, and 6 describe the cache replacement policy, the distribution of web documents, the flow of requests through the system, and the distributed vs. centralized dispatcher, respectively. Section 7 presents the simulation results. Section 8 concludes the paper.
Section snippets
Weblins architecture
Weblins aims to provide: (a) a scalable distributed architecture, (b) a high request throughput by balancing the load on the cluster nodes, and (c) a high cache hit rate by exploiting Gobelins features. The main features of Weblins are: it can use a content-aware request distribution policy, relying on TCP handoff whenever doing so is profitable, and it can improve the cache hit rate by dynamically constructing a set of popular files. This set is replicated on all nodes by the use of the
Cache replacement policy
The cache replacement policy is used for choosing the file(s) to be evicted, in order to make room for new inserted file(s). This choice is important for obtaining a high cache hit rate. The best choice is to evict the files that have low file values. Several parameters constitute the value of a file. Among these parameters are the size, the popularity, the time of reference and the cost of bringing the file into the cache. The GDSF algorithm seems the best in the literature for web caching
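The GDSF policy combines the parameters listed above (size, popularity, cost) into a single file value. Following the formulation in [2], the value of file f is H(f) = L + freq(f) · cost(f) / size(f), where L is an "inflation" value set to the H of the last evicted file so that recently inserted files are not immediately evicted by long-resident popular ones. A minimal sketch (assuming a uniform miss cost of 1; this is an illustration of classic GDSF, not of Weblins's cluster-adapted variant):

```python
# Minimal GDSF (greedy dual-size frequency) cache sketch.
# Eviction removes the file with the smallest H; L is inflated to the
# victim's H value so aging is implicit and cheap.
class GDSFCache:
    def __init__(self, capacity):
        self.capacity = capacity   # bytes
        self.used = 0
        self.L = 0.0               # inflation value
        self.files = {}            # name -> (size, freq, H)

    def access(self, name, size, cost=1.0):
        if name in self.files:                    # hit: bump frequency, recompute H
            sz, freq, _ = self.files[name]
            freq += 1
            self.files[name] = (sz, freq, self.L + freq * cost / sz)
            return True
        while self.used + size > self.capacity and self.files:
            victim = min(self.files, key=lambda f: self.files[f][2])
            self.L = self.files[victim][2]        # inflate L to victim's value
            self.used -= self.files.pop(victim)[0]
        self.files[name] = (size, 1, self.L + cost / size)
        self.used += size
        return False                              # miss
```

Note how a small, frequently accessed file accumulates a high H and survives, while a large file accessed once is the preferred eviction victim.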
Web documents distribution
When we consider locally distributed Web systems that do not use a content-aware dispatching mechanism, any server node should be able to respond to client requests for any part of the provided content tree. This means that each server owns or can access a replicated copy of the Web site content unless internal re-routing mechanisms are employed. There are essentially two mechanisms for distributing static information among the Web servers of the cluster. One is to replicate the content tree
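The second mechanism, partitioning the content tree across the nodes, can be sketched with a simple hash-based placement rule (the function below is an illustrative assumption; Weblins's actual placement algorithm is the one this section goes on to describe):

```python
import hashlib

# Illustrative hash-partitioning: each document is stored on exactly one
# node, chosen deterministically from its URL. No replication, so any node
# can compute the owner of any document without a directory lookup.
def owner(url: str, n_servers: int) -> int:
    h = int(hashlib.md5(url.encode()).hexdigest(), 16)
    return h % n_servers
```

Replication trades disk space for local availability; partitioning saves space but forces internal re-routing (or cooperative cache lookups) whenever the receiving node is not the owner.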
Request flows through the system
A back-end server that initially receives a request from the front-end is referred to as the first member. A server member hit occurs when the first member receiving a request from the front-end caches a copy of the requested object. Likewise, a server member miss indicates the case when the first member does not contain a copy of the requested object. If no replication is used, the probability of a server member hit is roughly 1/n, where n is the number of servers in the cluster array. The exact
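The 1/n estimate follows from independence: with documents partitioned uniformly and the front end dispatching without regard to content, the first member coincides with the owning node with probability 1/n. A quick Monte Carlo check (an illustration, not the paper's simulator):

```python
import random

# Empirical check of the 1/n server-member-hit estimate: the first member
# and the document's owner are both (modeled as) uniform over n servers,
# so they coincide with probability 1/n.
def member_hit_rate(n_servers, n_requests=100_000, seed=0):
    rng = random.Random(seed)
    hits = sum(
        rng.randrange(n_servers) == rng.randrange(n_servers)
        for _ in range(n_requests)
    )
    return hits / n_requests
```

For n = 4 the measured rate is close to 0.25, confirming that without replication most requests require a remote fetch or a handoff.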
Distributed vs. centralized dispatcher
A real risk of a system bottleneck exists when the cluster's front end implements the content-aware request distribution policy (layer-7 Web switch) [9], [11]. Indeed, the additional overhead caused by such content-aware routing reduces the system scalability by one order of magnitude with respect to the load balancing request distribution policy (layer-4 Web switches) [9], [11]. To overcome this drawback, Weblins proposes an alternative that combines the two policies. A layer-4 Web switch is
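The combination this section proposes can be sketched as follows: a cheap layer-4 front end that never inspects content, with the content-aware decision made locally on the back end that receives the connection. The function names and returned actions are hypothetical placeholders, not Weblins's actual interfaces.

```python
from itertools import cycle

# Hedged sketch of the hybrid scheme: layer-4 round robin at the front end,
# content-aware handling on the receiving back end. The back end serves from
# its own cache, fetches from a peer's cache (cooperative caching), or falls
# back to TCP handoff / disk on a global miss.
def make_frontend(n_servers):
    rr = cycle(range(n_servers))
    return lambda request: next(rr)      # layer 4: content is never examined

def handle(url, local_cache, peer_caches):
    if url in local_cache:
        return "serve_locally"                       # server member hit
    for peer, cache in peer_caches.items():
        if url in cache:
            return f"fetch_from_peer:{peer}"         # cooperative-cache hit
    return "tcp_handoff_or_disk"                     # miss everywhere
```

Because the front end does no parsing, it no longer limits scalability; the content-aware work is distributed across all back ends.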
Simulation results
To study various request distribution policies for a range of cluster sizes under different assumptions for CPU speed, amount of memory and other parameters, we developed a configurable web server cluster event-driven simulator. The costs for the basic request processing steps used in our simulation were derived from the measures used in [9]; in particular, we use file size threshold, S=100 KB, and server load threshold, T=130 active connections. In addition to Weblins, we simulated two more
Conclusion
In this paper, we present Weblins, a scalable cluster-based WWW server. Weblins includes a new request distribution algorithm that combines the features of content-aware request distribution and the mechanism proposed by its underlying cooperative cache system. Also, a new cache replacement policy and Web database storage are integrated into Weblins. Simulation results show that Weblins gives better throughput and overall performance in comparison with pure content-aware request distribution
Acknowledgements
Part of this work was done by the first author at the IRISA, University of Rennes-1. We thank Christine Morin for her support of this work.
References (14)
- Arlitt M. A performance study of Internet web servers. Master Thesis, University of Saskatchewan;...
- Cherkasova L. Improving WWW proxies performance with greedy-dual-size-frequency caching policy. Hewlett-Packard...
- Arlitt M, Williamson C. Web server workload characterization: the search for invariants. In Proceedings of the ACM...
- Lottiaux R. Gestion globale de la mémoire physique d'une grappe pour un système à image unique. PhD Thesis, Université...
- Sarkar AP, Hartman J. Efficient cooperative caching using hints. In: Second symposium on operating systems design and...
- Dahlin MD, Wang RY, Anderson TE, Patterson DA. Cooperative caching: using remote client memory to improve file system...
- Cohen A, Rangaragan S, Slye H. On the performance of TCP splicing for URL-aware redirection. In: Proceedings of the...
Cited by (4)
- Web cluster load balancing techniques: A survey. International Journal of Applied Engineering Research, 2015.
- A high-concurrency web map tile service built with open-source software. Modern accelerator technologies for geographic information science, 2013.
- An up-to-date survey in web load balancing. World Wide Web, 2011.
- Temporal load-balancing of web-server traffic. Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings, 2006.