Computer Networks

Volume 39, Issue 4, 15 July 2002, Pages 437-455

Prefetching the means for document transfer: a new approach for reducing Web latency

https://doi.org/10.1016/S1389-1286(02)00184-6

Abstract

User-perceived latency is recognized as the central performance problem in the Web. We systematically measure factors contributing to this latency, across several locations. Our study reveals that domain name system (DNS) lookup times, transmission control protocol (TCP) connection-establishment, and start-of-session delays at hypertext transfer protocol (HTTP) servers are major causes of long waits. Moreover, waits due to these factors also afflict high-bandwidth users, who enjoy relatively short transmission times.

We propose simple techniques that address these factors: (i) pre-resolving host names (pre-performing the DNS lookup), (ii) pre-connecting (prefetching TCP connections prior to the issuance of an HTTP request), and (iii) pre-warming (sending a “dummy” HTTP HEAD request to Web servers). Trace-based simulations demonstrate a potential to significantly reduce long waits.

Our techniques are complementary to the more traditional document prefetching techniques. Like document prefetching, deployment of our techniques at Web browsers or proxies requires neither protocol modifications nor Web server cooperation, and the prefetching strategy itself can be based on analysing hyperlinks or request patterns. In contrast to document prefetching, they can be applied to non-prefetchable URLs. Furthermore, their bandwidth overhead is minimal, and they deliver considerably more performance improvement per unit of bandwidth than document prefetching. We propose scalable deployment solutions to control the potential overhead to proxies and, particularly, to Web servers.

Introduction

The central performance problem of the Internet today is user-perceived latency, that is, the period from the time a user issues a request for a document until the time a response is received. A key realization is that latency is often not dominated by document transmission time, but rather by the setup process that precedes it. This realization was a central motivation for HTTP/1.1, which addressed connection-establishment time on subsequent hypertext transfer protocol (HTTP) requests [1], and for proposals such as prefix caching of multimedia streams [2]. Even users enjoying persistent HTTP and high-bandwidth connectivity, however, are still frequently afflicted with annoyingly long setup waits. We propose natural techniques that address the dominant latency causes preceding document transfer.

Background: Communication between Web clients and servers uses HTTP, which in turn utilizes the transmission control protocol (TCP) as the de facto underlying reliable transport protocol. A TCP connection needs to be established and acknowledged prior to transporting HTTP messages. To facilitate connection establishment, the Web server's host name is translated to a numeric Internet protocol (IP) address. This translation is done by querying a domain name system (DNS) server that may consult a hierarchy of DNS servers. The determining factors of user-perceived latency are name-to-address resolution, TCP connection-establishment time, HTTP request-response time, server processing, and finally, transmission time.
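
This decomposition can be made concrete with a small measurement sketch. The following Python snippet (an illustration only; the host name, port, and timeout are arbitrary choices, not part of our study) times the setup stages separately: DNS lookup, TCP connection establishment, and an HTTP request-response.

```python
import socket
import time

def measure_setup_latency(host, port=80, path="/"):
    """Time the three setup stages separately: DNS lookup,
    TCP connection establishment, and HTTP request-response."""
    t0 = time.monotonic()
    info = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    t_dns = time.monotonic() - t0

    t0 = time.monotonic()
    sock = socket.create_connection(info[0][4][:2], timeout=10)
    t_connect = time.monotonic() - t0

    request = (f"HEAD {path} HTTP/1.1\r\n"
               f"Host: {host}\r\nConnection: close\r\n\r\n").encode()
    t0 = time.monotonic()
    sock.sendall(request)
    sock.recv(4096)                # first bytes of the response
    t_response = time.monotonic() - t0
    sock.close()
    return t_dns, t_connect, t_response

print(measure_setup_latency("example.com"))
```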

Web browsing sessions typically consist of many HTTP requests, each for a small document. Practice with HTTP/1.0 was to use a separate TCP connection for each HTTP request and response [3], thereby incurring connection-establishment and slow-start latencies on each request [4]. Persistent connections [1] address this by reusing a single long-lived TCP connection for multiple HTTP requests. Persistent connections became the default with HTTP/1.1 [5], which is increasingly deployed. Deployment of HTTP/1.1 reduces the latency of subsequent requests to a server by utilizing an existing connection, but long perceived latency is still incurred when a request necessitates establishing a new connection.
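
As a brief illustration of why persistent connections help, the following sketch (hypothetical host and paths; Python's http.client speaks HTTP/1.1 and keeps connections alive by default) issues two requests over a single TCP connection, so the second request avoids a fresh DNS lookup, three-way handshake, and slow start.

```python
import http.client

# Both requests reuse one TCP connection; only the first pays
# DNS lookup and connection-establishment time.
conn = http.client.HTTPConnection("example.com")
for path in ["/", "/index.html"]:
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()                  # drain the body before reusing the socket
    print(path, resp.status)
conn.close()
```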

Document caching and prefetching are well-studied techniques for latency reduction. Caching documents at browsers and proxies is an effective way to reduce latency on requests made to cacheable, previously accessed documents. Studies suggest that due to the presence of cookies, CGI scripts, and limited locality of reference, caching is applicable to only about 30–50% of requests [6], [7], [8]. Furthermore, caching is also limited by copyright issues. To avoid serving stale content, cached resources are often validated by contacting the server (e.g., through an If-Modified-Since GET request). Hence, caching eliminates transmission time but, unless pre-validation is used [9], [10], often still incurs considerable latency. Document prefetching reduces latency by predicting requests and initiating document transfer prior to an actual request. The effectiveness of document prefetching is limited by the accuracy of predictions and the availability of lead time and bandwidth [6], [10], [11]. The use of document prefetching is controversial due to its extensive overhead on network bandwidth.
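
To see why a cache hit can remain slow, consider validation by conditional GET: even when the server answers 304 Not Modified and transfers no body, the client pays a full round trip, plus DNS and TCP setup if no connection is open. A minimal sketch, with a hypothetical host and validator date:

```python
import http.client

# Validating a cached copy: a 304 response carries no body, so
# transmission time is saved, but the request-response round trip
# (and, on a fresh connection, DNS and TCP setup) is still paid.
conn = http.client.HTTPConnection("example.com")
conn.request("GET", "/", headers={
    "If-Modified-Since": "Sat, 01 Jun 2002 00:00:00 GMT"})
resp = conn.getresponse()
print(resp.status)               # 304 if the cached copy is still fresh
conn.close()
```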

Our contribution: We systematically measure several latency factors and study their sensitivity to reference locality. We propose techniques that address significant latency factors by prefetching the means for document transfer (rather than the document itself). Finally, we evaluate the performance of our techniques by replaying actual user request sequences. Next we overview the conclusions from our measurements, our proposed techniques, and their performance evaluation.

(1) Measurements. We used a list of about 13,000 Web servers extracted from a proxy log. The measurements were conducted from several locations in order to ensure they are not skewed by unrepresentative local phenomena. Below we summarize our findings.

  • DNS lookups: DNS lookup time (name-to-address translation) exceeded 3 seconds for over 10% of servers. Lookup time is highly dependent on reference locality to the server due to caching of query results at name servers.

  • Cold- and warm-server state: We observed that request-response times of start-of-session HTTP requests are on average significantly longer than those of subsequent requests (even when each request utilizes a separate TCP connection). Presumed start-of-session HTTP request-response time exceeded 1 second for over 10% of servers and exceeded 4 seconds for over 6% of servers. These fractions dropped by more than half for subsequent requests.

  • Connection-establishment time: In agreement with many previous studies (e.g., [1], [8]), we observed that TCP connection-establishment time is significant relative to HTTP request-response times.

  • Cold and warm route states: As we explain later in the paper, the first IP datagram traversing a path to a destination (when the route is cold) is likely to travel longer than subsequent datagrams (when the route is warm). This effect is visible in consecutive TCP connection-establishment times, but our study concludes that it is not as significant a contributor to long latency as other factors.


Furthermore, our study shows that DNS lookup time, connection-establishment time, and HTTP request-response time all exhibit heavy-tail behavior (i.e., the average is considerably larger than the median).
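
As a toy numerical illustration of this heavy-tail effect (the values below are made up, not our measured data):

```python
import statistics

# A heavy-tailed sample: a few very slow lookups dominate the
# average, pulling the mean far above the median.
lookups = [0.02, 0.03, 0.03, 0.05, 0.08, 0.1, 0.4, 1.2, 3.5, 9.0]
print("median:", statistics.median(lookups))   # 0.09
print("mean:  ", statistics.fmean(lookups))    # ~1.44
```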

(2) Solutions. We propose the following techniques to address the three latency factors described above.

  • Pre-resolving: The browser or proxy performs a DNS lookup before a request to the server is issued, thereby eliminating DNS query time from user-perceived latency.

  • Pre-connecting: The browser or proxy establishes a TCP connection to a server prior to the user's request. Pre-connecting addresses connection-establishment time on the first request utilizing a persistent connection.

  • Pre-warming: The browser or proxy sends a “dummy” HTTP HEAD request to the server prior to the actual request. Pre-warming addresses start-of-session latency at the server.


We refer to the three techniques combined as pre-transfer prefetching.
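
The following sketch illustrates all three techniques in Python. It is a minimal illustration rather than our implementation; the host names, the port-80 assumption, and the connection-pool structure are hypothetical.

```python
import socket
import threading

POOL = {}   # hypothetical per-host pool of pre-opened connections

def pre_resolve(host):
    """Pre-resolving: do the DNS lookup ahead of the user's click so
    the resolver cache is warm when the real request arrives."""
    try:
        socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        pass   # a failed speculative lookup costs the user nothing

def pre_connect(host):
    """Pre-connecting: establish a TCP connection before the request
    and park it in the pool."""
    try:
        POOL[host] = socket.create_connection((host, 80), timeout=5)
    except OSError:
        pass

def pre_warm(host):
    """Pre-warming: send a dummy HEAD request on the pre-opened
    connection, moving the server out of its cold start-of-session state."""
    sock = POOL.get(host)
    if sock is None:
        return
    sock.sendall(f"HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode())
    sock.recv(4096)   # read the response headers; connection stays open

# Speculatively prepare servers judged likely to be accessed soon.
for host in ["example.com", "example.org"]:
    threading.Thread(target=lambda h=host:
                     (pre_resolve(h), pre_connect(h), pre_warm(h))).start()
```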

Pre-connecting complements persistent HTTP and caching of TCP connections [8], [12]: it addresses the wait due to connection establishment incurred on the first request utilizing a connection. Pre-transfer prefetching techniques also complement caching of documents since (1) they are more beneficial for requests to servers that were not recently accessed and (2) their effectiveness does not depend on the URL being a cacheable document. They complement document prefetching by providing a range of overhead/benefit tradeoffs and utilizing less bandwidth. Like document prefetching, our pre-transfer techniques require a scheme to predict servers that are likely to be accessed.

(3) Performance. We evaluated the effectiveness of these techniques on two classes of requests across two locations.

The first class of requests was search engine referrals, namely follow-up requests on results returned by search engines or Web portals (directories). We chose to focus on these requests since search engine and Web portal sites typically invest considerable effort in providing low-latency, high-quality service. Performance on follow-up requests is crucial to the overall perception of performance; however, we observed that follow-up requests are subject to considerably longer latencies than average HTTP requests. Perceived latency on AltaVista [13] referrals exceeded 1 second for 22% of requests and exceeded 4 seconds for 12% of requests. With pre-resolving, pre-connecting, and pre-warming (with pre-connecting) applied, latency exceeded 1 second for only 10%, 6%, and 4% of requests, respectively, and exceeded 4 seconds for only 4%, 3%, and 2% of requests, respectively, a dramatic decrease in long wait times.

The second class of requests was considerably wider, containing all requests that were not preceded by a request to the same server in the previous 60 seconds. Latency exceeded 1 second on 14% of requests and exceeded 4 seconds on 7% of requests. With pre-resolving, pre-connecting, and pre-warming, latency exceeded 1 second on only 9%, 5%, and 3% of requests, respectively, and exceeded 4 seconds on 4%, 2.5%, and 1% of requests, respectively. This demonstrates a significant potential improvement in performance.

Outline: The remainder of this paper consists of six sections. Section 2 describes our data and how we performed latency measurements. Section 3 describes our measurements of the different latency factors. Section 4 discusses pre-resolving, pre-connecting, and pre-warming and their deployment. Section 5 lists possible overheads and suggests ways to address them. Section 6 presents the performance evaluation. We conclude in Section 7 and propose directions for further research.

Section snippets

Data

Our user activity data was a log of the AT&T Research proxy server which is described in detail in [14]. The log provided, for each HTTP request, the time of the request, the user's (hashed) IP address, the requested URL and Web server, and the referring URL. The log contained 1.1 million requests issued between the 8th and the 25th of November 1996 by 463 different users (IP addresses). Requests were issued to 17,000 different Web servers, and for 521,000 different URLs. Our use of the log was

Study of latency factors

We measured the different latency factors and studied the effect of reference locality on them. A measurement was taken for each server in a group and results were aggregated. We aggregated the measurements across the complete set of servers occurring in the AT&T Research proxy log described in Section 2, and across the subsets of servers accessed as a result of a referral from a particular search engine or Web portal. The dominating latency factors were the same for both sets of servers so we show

Combating latency

We propose three techniques to address the major latency factors identified in Section 3. In essence, our techniques perform the setup work associated with an HTTP request–response prior to the time the user issues the request. All our proposed techniques are applicable to servers rather than documents, and require minimal bandwidth. The general paradigm is to apply each technique to a set of servers with above-threshold likelihood to be accessed within the effective period of the technique.
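
A policy of this form might be sketched as follows; the technique names, thresholds, and likelihood estimates are purely illustrative (in practice the likelihoods would come from analysing hyperlinks or past request patterns).

```python
# Hypothetical policy sketch: apply each technique to every server
# whose estimated access likelihood exceeds a per-technique threshold.
# Cheaper techniques (pre-resolving) get lower thresholds, and thus
# wider coverage, than costlier ones (pre-warming).
THRESHOLDS = {"pre_resolve": 0.05, "pre_connect": 0.25, "pre_warm": 0.50}

def plan_prefetch(likelihood_by_server):
    """Return, per technique, the servers worth prefetching for."""
    return {tech: sorted(s for s, p in likelihood_by_server.items()
                         if p >= thr)
            for tech, thr in THRESHOLDS.items()}

print(plan_prefetch({"a.example": 0.6, "b.example": 0.3, "c.example": 0.1}))
# {'pre_resolve': ['a.example', 'b.example', 'c.example'],
#  'pre_connect': ['a.example', 'b.example'], 'pre_warm': ['a.example']}
```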

Scalable deployment

Our pre-transfer prefetching techniques have minimal bandwidth overhead, but they do impose other overheads: pre-resolving increases the number of DNS queries, pre-connecting increases the number of connection establishments as well as the number of idle TCP connections, and pre-warming results in additional work for the HTTP application. Whereas the benefit of these techniques accrues to the end users, the cost is incurred mainly in the network and in the servers. This discrepancy between location of cost and

Performance evaluation

We compared perceived latency with and without applications of each of our techniques. We used the request sequence in the AT&T proxy log (see Section 2), and subsets of it consisting of follow-up requests after queries to search engines or Web portals. Our motivation for considering these subsets is described in detail in Section 1.

Conclusion and future research

Due to prevailing long user-perceived latency, the Web is pejoratively dubbed the “World Wide Wait.” We observe that, to a large extent, long waits experienced by users with high-bandwidth connectivity are dominated by the setup process preceding the actual transmission of contents. We propose pre-transfer prefetching techniques as a direct solution, and demonstrate their potential for a significant decrease in long wait times. We view the deployment of pre-transfer prefetching as a natural

Acknowledgements

The pre-connecting technique was conceived in a joint discussion with Uri Zwick. We thank Jeff Mogul and Rob Calderbank for their comments on an early version of this manuscript. We are grateful to Menashe Cohen and Yishay Mansour for many good suggestions and pointers. We also thank Yehuda Afek, Ramon Caceres, Dan Duchamp, Sam Dworetsky, David Johnson, Balachander Krishnamurthy, Ben Lee, and Jennifer Rexford, for interesting discussions on our ideas and their implications.


References

  • J.C. Mogul, The case for persistent-connection HTTP, Computer Communication Review (1995)
  • S. Sen, J. Rexford, D. Towsley, Proxy prefix caching for multimedia streams, in: Proceedings of the IEEE INFOCOM'99...
  • T. Berners-Lee, R. Fielding, H. Frystyk, RFC 1945: hypertext transfer protocol, HTTP/1.0, May 1996
  • H. Frystyk Nielsen, J. Gettys, A. Baird-Smith, E. Prud'hommeaux, H.W. Lie, C. Lilley, Network performance effects of...
  • R. Fielding, J. Gettys, J. Mogul, H. Nielsen, L. Masinter, P. Leach, T. Berners-Lee, RFC 2616: hypertext transfer protocol, HTTP/1.1, June 1999
  • T.M. Kroeger et al., Exploring the bounds of web latency reduction from caching and prefetching

  • F. Douglis, A. Feldmann, B. Krishnamurthy, J. Mogul, Rate of change and other metrics: a live study of the world wide...
  • A. Feldmann, R. Cáceres, F. Douglis, G. Glass, M. Rabinovich, Performance of Web proxy caching in heterogeneous...
  • B. Krishnamurthy, C.E. Wills, Study of piggyback cache validation for proxy caches in the world wide web, in:...
  • E. Cohen, B. Krishnamurthy, J. Rexford, Improving end-to-end performance of the Web using server volumes and proxy...
  • D. Duchamp, Prefetching hyperlinks, in: Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems,...
  • E. Cohen, H. Kaplan, J.D. Oldham, Policies for managing TCP connections under persistent HTTP, in: Proceedings of the...
  • AltaVista....
  • J.C. Mogul, F. Douglis, A. Feldmann, B. Krishnamurthy, Potential benefits of delta encoding and data compression for...
  • Yahoo!....
  • Apache HTTP server project....

Edith Cohen is a Researcher at AT&T Labs-Research. She did her Undergraduate and Masters studies at Tel-Aviv University, and received her Ph.D. in Computer Science from Stanford University in 1991. She joined Bell Laboratories in 1991. During 1997, she was at UC Berkeley as a Visiting Professor. Her research interests include design and analysis of algorithms, combinatorial optimization, Web performance, networking, and data mining.

Haim Kaplan received his Ph.D. degree from Princeton University in 1997. He was a member of technical staff at AT&T Research from 1996 to 1999. Since 1999 he has been an Assistant Professor in the School of Computer Science at Tel-Aviv University. His research interests are the design and analysis of algorithms and data structures.
