Elsevier

Computer Networks

Volume 46, Issue 2, 7 October 2004, Pages 219-235
Optimal sliding-window strategies in networks with long round-trip delays

https://doi.org/10.1016/j.comnet.2004.03.032

Abstract

A method commonly used for packet flow control over connections with long round-trip delays is “sliding windows”. In general, for a given loss rate, a larger window size achieves a higher average throughput, but also a higher rate of spurious packet transmissions, rejected by the receiver merely for arriving out of order. This paper analyzes the problem of optimal flow control quantitatively, for a connection that has a cost per unit time and a cost for every transmitted packet. The optimal strategy is defined as one that minimizes the expected cost/throughput ratio, and is allowed to transmit several copies of a packet within a window. We present an algorithm for computing the optimal strategy and study its properties; in particular, we derive bounds on the cost/throughput performance of the optimal strategy, and show that it increases only logarithmically with the time price, whereas the cost/throughput of the `classic' window scheme is linear in the time price.

Introduction

A common method for packet flow control over network connections, used both in the data-link and the transport layers, is sliding windows [1]. In this method, the receiver regularly reports to the sender the index of the next-expected packet, thereby acknowledging all the packets up to that index. The sender may transmit up to a certain number of packets, called the window size, beyond the last acknowledged packet; if a packet is not acknowledged within a certain `timeout' period (ideally set to the connection round-trip time, or slightly higher), the window is retransmitted from that packet on. In its pure form, this scheme implies that packets must arrive at the destination in order. While the receiver may temporarily keep out-of-order packets in a buffer, this does not affect the connection's performance unless the protocol is extended to allow selective, rather than cumulative, acknowledgments [2], [3]. Such extensions are not universally implemented, and even when they are, the space allocated to hold such out-of-order packets is typically not very large. Therefore, on a coarser level, the packet stream still has to arrive in order, allowing exceptions only to a limited extent.
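The cumulative acknowledgment rule can be illustrated with a minimal sketch of the receiver side in its pure form, that is, with no buffer for out-of-order packets. The function name `cumulative_ack_receiver` and the list-based interface are illustrative assumptions, not taken from any protocol specification.

```python
def cumulative_ack_receiver(arrivals, next_expected=0):
    """Return the acknowledgment sent after each arriving packet index.

    Each acknowledgment carries the index of the next-expected packet,
    which implicitly acknowledges every packet before it.
    """
    acks = []
    for index in arrivals:
        if index == next_expected:
            next_expected += 1  # in-order packet: accept it
        # Any other packet (out-of-order or duplicate) is simply discarded.
        acks.append(next_expected)
    return acks


# Packet 1 is lost on its first transmission, so packets 2 and 3 are rejected
# and must be sent again after packet 1 is retransmitted.
acks = cumulative_ack_receiver([0, 2, 3, 1, 2, 3])
```

In this trace the acknowledgment stays at 1 until packet 1 finally arrives, which is what forces the sender to retransmit the window from that packet on.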

Since a lost packet may trigger a retransmission of up to an entire window, its negative effect on throughput is due not only to the loss itself, but also to the time wasted in waiting for the acknowledgment. This effect is more severe when the connection's round-trip time (more precisely, the timeout) is long compared to the transmission time of a packet; such a connection is said to have a large delay-bandwidth product. A good example is a geostationary satellite link, with a round-trip propagation delay of roughly 0.25 s, used within a high-speed connection where a packet transmission typically takes a fraction of a millisecond; the delay-bandwidth product is then measured in thousands of packets.
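As a rough worked example of the satellite figures, the sketch below expresses the delay-bandwidth product as the number of packet transmission times that fit into one round trip. The 0.25 s round-trip delay is the value quoted in the text; the 0.1 ms packet transmission time is an assumed value, chosen only to stand for `a fraction of a millisecond'.

```python
def delay_bandwidth_product(round_trip_s, packet_time_s):
    """Number of packets that can be transmitted during one round-trip time."""
    return round_trip_s / packet_time_s


round_trip_s = 0.25      # geostationary link, as quoted in the text
packet_time_s = 0.1e-3   # assumed: 0.1 ms per packet transmission

packets_per_round_trip = delay_bandwidth_product(round_trip_s, packet_time_s)
print(f"about {packets_per_round_trip:.0f} packets per round trip")
```

With these assumed numbers the result is on the order of 2500 packets, consistent with a product `measured in thousands'.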

Assuming that packet losses are independent (e.g., caused by white noise or a randomized discarding policy along the connection path, such as RED [4], [5]), and that transmission of a window takes less than the round-trip delay, the throughput can be improved considerably by retransmitting some or all of the packets several times within the window itself (rather than just after a timeout, as in `classic' sliding-window schemes), as this increases their initial probability of successful arrival. For the rest of the paper, we extend the definition of the window size to include all such transmissions, counting each one separately whether it is a new packet or a copy of a previous one. We define a sliding-window strategy to be a rule that specifies how many copies of each packet, relative to the start of the window, are transmitted and in what order; in particular, it also specifies the window size. We mention at this point that alternative methods, such as forward error correction (FEC), can be used within this framework instead of simple retransmissions; we comment more on this later.
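Under the independence assumption, the gain from sending several copies is elementary to quantify: a packet fails to arrive only if every one of its copies is lost. The sketch below computes this probability; the function name and the sample loss rate are illustrative assumptions.

```python
def delivery_probability(loss_rate, copies):
    """Probability that at least one of `copies` independent copies arrives."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - loss_rate ** copies


# With an assumed 10% loss rate, each extra copy shrinks the residual loss
# probability by another factor of ten.
for copies in (1, 2, 3):
    print(copies, delivery_probability(0.1, copies))
```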

In general, for a given packet loss rate, transmitting more packets in a window (whether new ones or more copies of the same) increases the expected number of successful packets in every round-trip period, and, hence, the long-term throughput (at any rate, so long as the total window transmission time remains below the round-trip time). However, a larger window also increases the average rate of duplicate and out-of-order packets, which needlessly contributes to the network load. Thus, selection of a window size constitutes a tradeoff between these conflicting goals. To quantify this tradeoff, we associate with the connection a `cost' per unit time and a `cost' per packet transmission, and define the optimal strategy as one that minimizes the average cost/throughput ratio over time. We point out that these costs can have various interpretations, and should not be taken literally as money charges [6]. For example, the time cost may be associated with the disutility incurred by the application due to increased delay, and the transmission cost may be related to the energy consumption of a mobile device. Similarly, a `social' (e.g., TCP-friendly) sender that refrains from retransmitting to avoid loading the network for others behaves as if it had a high per-transmission cost.
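The tradeoff can be made concrete with a small sketch for the `classic' case in which each packet is sent once. It rests on assumptions that are not spelled out in this excerpt: losses are independent with a fixed rate, one window is sent per round-trip period, and the cost of a period is `time_price * round_trip + packet_price * window_size`. The exact objective used in the paper may differ in form, and all names below are illustrative.

```python
def expected_in_order_classic(window_size, loss_rate):
    """Expected number of packets accepted in order when each is sent once.

    Packet i is accepted only if packets 1..i all arrive, which happens
    with probability (1 - loss_rate) ** i.
    """
    q = 1.0 - loss_rate
    return sum(q ** i for i in range(1, window_size + 1))


def cost_throughput_ratio(window_size, loss_rate, time_price, packet_price,
                          round_trip=1.0):
    """Assumed cost of one round-trip period per packet delivered in it."""
    cost = time_price * round_trip + packet_price * window_size
    return cost / expected_in_order_classic(window_size, loss_rate)


# Scan a few window sizes with assumed prices and a 10% loss rate.
for n in (1, 2, 4, 8, 50):
    print(n, round(cost_throughput_ratio(n, 0.1, 1.0, 1.0), 3))
```

Very small windows waste the time cost and very large ones waste transmissions, so under these assumptions the ratio is minimized at an intermediate window size.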

In `classic' sliding windows, the sender transmits each packet in the window once, and the optimal strategy computation thus reduces to a trivial optimization of a single parameter (the window size). When each packet may be (re)-transmitted several times within a window, the problem becomes much more interesting. Finding the optimal strategy can then be viewed as being composed of two subproblems: an `outer' problem of finding the optimal window size N, depending on the time and packet transmission costs; and an `inner' problem of optimally distributing a total `budget' of N transmissions among the packets in a window, which, for a given N, no longer depends on the costs. A salient feature of the resulting solution is that not all packets are transmitted an equal number of times: earlier packets in every window get more copies transmitted than later ones, in accordance with their `importance' (e.g., the loss of the first packet in a window results in the loss of the entire window even if later packets arrive correctly, while the reverse is not true).
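The unequal `importance' of packets can be checked on small cases. The sketch below evaluates the expected number of in-order packets for a given allocation of copies and finds the best allocation for a small budget by exhaustive enumeration. It assumes independent losses with a fixed rate and a receiver that accepts packets strictly in order; the function names are illustrative, and the brute-force search is used only because it is easy to verify, not because it is the method developed in the paper.

```python
def expected_in_order(copies, loss_rate):
    """Expected number of packets accepted in order.

    copies[i] is the number of copies sent of the i-th packet in the window.
    Packet i counts only if it and all earlier packets have each had at
    least one copy arrive.
    """
    expected = 0.0
    prefix_ok = 1.0  # probability that all packets so far have arrived
    for c in copies:
        prefix_ok *= 1.0 - loss_rate ** c
        expected += prefix_ok
    return expected


def compositions(total):
    """Yield every ordered way to split `total` into positive integers."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


def best_allocation_bruteforce(budget, loss_rate):
    """Allocation of `budget` transmissions with the highest expected count."""
    return max(compositions(budget), key=lambda c: expected_in_order(c, loss_rate))


best = best_allocation_bruteforce(4, 0.5)
front_loaded = expected_in_order((2, 1, 1), 0.5)
back_loaded = expected_in_order((1, 1, 2), 0.5)
```

With the same budget of four transmissions, placing the extra copy on the first packet gives a strictly higher expected count than placing it on the last one, which matches the observation that earlier packets deserve more copies. The enumeration grows exponentially with the budget, so it is practical only for small windows.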

In this paper, we present a detailed analysis of optimal sliding-window strategies, following the above decomposition into the `outer' and `inner' subproblems. It turns out that the inner problem, of deciding which packet copies to transmit for a given window size N, involves a certain combinatorial optimization problem, and we explore in detail its properties, derive bounds on the solution's performance, and suggest an efficient solution algorithm based on dynamic programming. We then proceed to extend it for the outer problem (of finding the optimal window size) as well, thus establishing an integrated solution algorithm for the strategy optimization problem. Finally, we show that the cost/throughput ratio increases only logarithmically in the time price; this is a significant improvement over the linear dependence achievable by `classic' sliding windows.
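The two-level decomposition can be sketched numerically as follows. This is only an illustration under simplifying assumptions: independent losses with a fixed rate, zero transmission time, and a per-window cost of `time_price + packet_price * n`, which is an assumed form of the objective. The recurrence is a straightforward dynamic program written for this sketch and should not be read as the algorithm developed in the paper; all function names are hypothetical.

```python
def inner_table(max_window, loss_rate):
    """best[n] = largest expected number of in-order packets with n transmissions."""
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    best = [0.0] * (max_window + 1)
    for n in range(1, max_window + 1):
        # Give c copies to the first packet. It, and everything after it,
        # counts only if at least one of those c copies arrives.
        best[n] = max((1.0 - loss_rate ** c) * (1.0 + best[n - c])
                      for c in range(1, n + 1))
    return best


def optimal_window(loss_rate, time_price, packet_price, max_window=200):
    """Window size minimizing the assumed cost/throughput ratio, and that ratio."""
    best = inner_table(max_window, loss_rate)  # one pass serves every n

    def ratio(n):
        return (time_price + packet_price * n) / best[n]

    n_opt = min(range(1, max_window + 1), key=ratio)
    return n_opt, ratio(n_opt)


def classic_window(loss_rate, time_price, packet_price, max_window=200):
    """Same minimization when every packet is transmitted exactly once."""
    q = 1.0 - loss_rate

    def ratio(n):
        return (time_price + packet_price * n) / sum(q ** i for i in range(1, n + 1))

    n_opt = min(range(1, max_window + 1), key=ratio)
    return n_opt, ratio(n_opt)


n_opt, r_opt = optimal_window(0.2, 50.0, 1.0)
n_cls, r_cls = classic_window(0.2, 50.0, 1.0)
```

Because a single table holds the inner solution for every window size up to `max_window`, the outer search reduces to a scan over that table. Sending each packet once is one of the allocations the recurrence considers, so the ratio it finds can never be worse than the classic one under the same assumptions.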

Our current study analyzes optimal strategies limited to simple retransmissions only. A potentially better scheme for increasing the success probability of a group of packets is that of forward error correction (FEC) coding; generally, an (n,k) FEC code encodes a group of k packets into n > k `copies', so that any k successful ones allow reconstructing the original data. We wish to emphasize that the ideas presented in this paper are not inconsistent with FEC coding, but rather complement it. If the code parameters are fixed (e.g., in a lower layer), our analysis can be readily applied by treating each encoded block as a “super-packet” with the appropriate loss probability. If the code can be controlled, the problem becomes that of finding an optimal coding strategy, which, though more complex, is based essentially on the methodology introduced here, except that the number of retransmissions is replaced by the notion of coding redundancy. In particular, it is to be expected that the optimal strategy would use higher-redundancy coding for the first packets in every window than for later ones.
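For a fixed (n,k) code, the `appropriate loss probability' of a super-packet follows from the binomial distribution when the n encoded packets are lost independently: the block is recovered if at least k of them arrive. The sketch below computes it; the function names are illustrative assumptions, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def block_success_probability(n, k, loss_rate):
    """Probability that at least k of n independently sent packets arrive."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    q = 1.0 - loss_rate
    return sum(comb(n, i) * q ** i * loss_rate ** (n - i)
               for i in range(k, n + 1))


def super_packet_loss_rate(n, k, loss_rate):
    """Loss probability of the encoded block treated as one super-packet."""
    return 1.0 - block_success_probability(n, k, loss_rate)
```

Simple retransmission is the special case k = 1: `block_success_probability(c, 1, p)` equals 1 - p^c, the probability that at least one of c copies arrives.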

The special concerns raised by connections with large delay-bandwidth products in general, and satellite links in particular, have attracted considerable research in recent years. Most of these studies are in the context of the widely-used TCP protocol and propose how to improve its performance, either by tuning the parameters of existing features like extended windows, slow-start, and congestion avoidance [7], [8], or by introducing extensions, such as explicit congestion notifications [9]. Considerable attention has also been devoted to FEC coding that is able to adapt to higher-layer protocol requirements, partly in the context of multimedia applications with real-time requirements [10], but mostly, again, in conjunction with TCP [11], [12]. None of these works, however, suggested improvements of the sliding-window mechanism itself. In fact, to the best of our knowledge, the idea of basing the number of retransmissions (or the FEC coding redundancy) on the position of the packet within a window, which is central to this paper, has not been suggested before. The approach we follow in the paper is generic, with the goal of discovering fundamental properties of optimal retransmission strategies, and we do not consider specific implementation issues in existing sliding-window protocols, which may require further work.

The rest of the paper is structured as follows. Section 2 describes the model and formally defines the underlying optimization problems. Section 3 describes basic structural properties of the solution and derives bounds on the optimal strategy performance. The solution algorithm and its properties for the `inner' problem are analyzed in Section 4 and incorporated into an overall solution algorithm in Section 5. Finally, Section 6 concludes with a discussion of our methodology and its possible extensions, and outlines directions for further research.

Section snippets

The model

As explained in the Introduction, we are interested in network connections with a high delay-bandwidth product, in which the receiver accepts packets only in order (with only a small buffer space, if at all, to hold a limited number of out-of-order packets). For our analysis, we shall bring these two characteristics to an extreme. That is, we assume that the receiver is unable to accept out-of-order packets at all, and we take the packet transmission time to be zero, which implies that the size

Basic properties and bounds

In this section, we show some basic structural properties of the optimization problems' solutions, and derive important bounds, in particular, on their asymptotic behavior.

Solution of the inner problem

In this section, we present two approaches to the solution of the inner problem. First, we show how to solve it (i.e., compute the value of the function EL(N)) exactly, using a technique of dynamic programming. The corresponding solution algorithm has a complexity of O(N^2). However, it does not provide insight into the structural properties of the solution; therefore, we also consider a similar optimization problem in continuous variables, for which the dependence of the solution on the

Finding the optimal window size

We now turn to discuss the solution of the `outer problem', namely, finding the window size (N) that minimizes the cost/throughput ratio (5). To begin, note that generic search algorithms (e.g., Fibonacci or golden-section search [14]), using algorithm DI as a `subroutine' for computing EL(N), are inefficient, as they neglect the internal redundancy between computations for different N. Indeed, for any N, algorithm DI computes the scores for all window sizes up to N anyway. This raises the idea

Conclusion

We have investigated optimal sliding-window strategies in network connections where the packet transmission time is negligible compared to the round-trip delay. We associated a cost per unit of time and per packet transmission with the connection, and defined the optimal strategy as one that minimizes the expected cost/throughput ratio. We derived several important bounds on the optimal strategy performance; specifically, for a window size of N, we showed the number of successful in-order

References (15)

  • C. Barakat et al., Bandwidth tradeoff between TCP and link-level FEC, Computer Networks (2002)
  • A. Tanenbaum, Computer Networks (2003)
  • ISO/IEC standard 13239:2000 (HDLC procedures), February...
  • M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, RFC 2018: TCP selective acknowledgment options, October...
  • S. Floyd et al., Random early detection gateways for congestion avoidance, IEEE/ACM Transactions on Networking (1993)
  • D. Lin, R. Morris, Dynamics of random early detection, in: Proc. ACM SIGCOMM, Cannes, France, 1997, pp....
  • L. Libman et al., Optimal timeout and retransmission strategies for accessing network resources, IEEE/ACM Transactions on Networking (2002)
There are more references available in the full text version of this article.

Lavy Libman received the B.Sc. degrees (summa cum laude) in Electrical Engineering and in Computer Engineering, and the M.Sc. and Ph.D. degrees in Electrical Engineering, from the Technion – Israel Institute of Technology, Haifa, Israel, in 1992, 1997, and 2003, respectively.

He joined the Networks and Pervasive Computing research program at National ICT Australia, Sydney, in September 2003. He has previously held several visiting and consulting positions, including with Bell Laboratories, NJ, in summer 2002, and with Millimetrix Broadband Networks, Israel, in summer 2000. Between 1993 and 1999, he served as a computer engineer in the Israel Defence Forces.

His research interests include protocols and algorithms for mobile networks, including error control, power control, and QoS, as well as application of game theory to the analysis, design, control and management of large-scale networks with heterogeneous and non-cooperative users. He received the Wolff prize for distinguished Ph.D. students.

He was a member of the Executive Committee of IEEE Infocom'2002, and is currently serving on the Technical Program Committees of IEEE LCN'2004 and ACM SIGCOMM Asia 2005 workshop.

Ariel Orda received the B.Sc. (summa cum laude), M.Sc., and D.Sc. degrees in Electrical Engineering from the Technion – Israel Institute of Technology, Haifa, Israel, in 1983, 1985, and 1991, respectively.

Since 1994, he has been with the Department of Electrical Engineering at the Technion, where he is currently an Associate Professor and the Academic Head of the Computer Networking Laboratory. He has held visiting and research positions at the Center for Telecommunication Research, Columbia University, New York, NY, Bell Laboratories, NJ, and IBM Watson Research Center, NY. In addition, he has held several consulting positions with Israeli industry.

His current research interests include network routing, QoS provisioning, wireless networks, the application of game theory to computer networking and network pricing.

He received the Award of the Chief Scientist in the Ministry of Communication in Israel, a Gutwirth Award for Outstanding Distinction, the Research Award of the Association of Computer and Electronic Industries in Israel, and the Jacknow Award for Excellence in Teaching.

He served as Technical Program co-chair of IEEE Infocom'2002. He is an Editor of Computer Networks and of the IEEE/ACM Transactions on Networking.

A preliminary version of this study appeared in the Proceedings of IEEE Infocom 2003.

This research was performed while L. Libman was a Ph.D. student at the Technion – Israel Institute of Technology, and was supported in part by the Israeli Ministry of Science.
