
Computer Networks

Volume 42, Issue 2, 5 June 2003, Pages 211-229

Connection caching to reduce signaling loads with applications to softswitch telephony

https://doi.org/10.1016/S1389-1286(03)00190-7

Abstract

In order to support quality of service, many network operators continue to rely on connection oriented technologies, even as their networks migrate towards packet switching. Such technologies can allocate resources according to user requirements, with path stability providing low jitter for real-time services. CBR/VBR traffic classes in ATM networks fall within the connection oriented paradigm, as do Traffic Engineering initiatives such as the CR-LDP and RSVP-TE protocols (from the IETF’s MPLS working group). As the capacities of network links increase, network switches/routers will need to support signaling loads that grow in proportion to these increases in bandwidth. However, the capacity of such elements to process connection setup and teardown requests may not grow as quickly as transmission bandwidth. The resulting congestion in the connection control plane leads to delays in connection setup (and, in extreme cases, to setup failures). In response, next-generation switch vendors have implemented various connection-caching schemes. Despite this motivation, the problem of how to design an effective caching scheme appears to be little studied in the literature. In this paper we propose a dynamic connection caching strategy that achieves a trade-off between bandwidth utilization and decreased signaling load. The proposed scheme is based on the optimal policy for a Markov decision process; this Markov decision process models the dynamics of caching policies on a single link. We extend the single-link approach to the network context. Simulations show that the proposed mechanism is robust and effective at reducing the signaling load without significantly decreasing throughput. The simulation scenarios feature network topologies that are appropriate for softswitch telephony, thereby demonstrating the applicability of our approach in this context.

Introduction

We study the effectiveness of connection caching as a means of reducing signaling load in telecommunication networks. Signaling load is our term for the tasks involved in setting up and tearing down connections. If the processing resources that perform these tasks are overloaded, deleterious effects on system performance result:

  • delays in setup/teardown procedures increase to the point that industry-standard delay requirements are not met, and

  • setup/teardown procedures may fail altogether.


While it eases signaling load, caching exacts a toll in terms of throughput; we wish to quantify this cost and to shed light on some interesting trade-offs.

Our model does not explicitly include connection setup delay; in view of the above observations, it uses signaling load as a proxy for delay. Our focus is on bandwidth allocation; the notion of effective bandwidth can be used to treat variable bit rate traffic types within our modeling framework.

It is important to point out a major difference between connection caching and route caching. Although the latter addresses delay associated with route computation, it does nothing to reduce signaling overhead for setup and teardown procedures. Since our main motivation in this paper is to reduce signaling load, we cannot overemphasize this crucial distinction.

Organization of the paper. After detailing our motivation, we formulate a Markov decision process (MDP) model that describes the dynamics of a single link with a state-dependent caching policy. We discuss a numerical implementation of policy iteration; based on our results, we propose a linear dynamic caching heuristic that appears to be near-optimal for the single-link case. Finally, we validate the efficacy of this heuristic with simulation results in the single link and network contexts.

As mentioned in the abstract, operators prefer to employ connection oriented technologies for real-time services: path stability provides low jitter as well as a familiar paradigm for operations, administration and maintenance. Note that connection caching schemes could be combined with arrangements in which there is more than one call per connection, and therefore do not preclude such multi-call arrangements. In the same vein, ATM Adaptation Layer 2 (AAL2) deployments may feature AAL2 switching/subcell multiplexing nodes which experience heavy signaling load even though switches that function only at the ATM layer are unaffected by setups and teardowns happening within the AAL2 layer.

For our purposes, a telecommunication network is a set of switches connected by transmission links. A switch has the capacity to direct traffic from any input link to any output link. A connection is a collection of resources (such as transmission capacity) allocated to a specific user at various points in the network.

In our model, software on the network explicitly allocates/deallocates resources as connection requests arrive, are served, and are then completed. Signaling is the process by which system software accomplishes this goal. Connection setup is the allocation process that takes place when an arriving call request is accepted. Connection teardown is the process of returning allocated resources to the available pool once a call has completed. A schematic representation of a switch appears in Fig. 1. The software processes that manage signaling reside in the controller.

In today’s networks, 10 Gbps transmission links are coming into widespread use. Signaling capacity, however, has not kept pace: for example, a capacity of 3K calls/s (setups and teardowns) is quite good for an ATM switch. A 10 Gbps link can carry 128K simultaneous (uncompressed) telephone conversations; if such a link is run at 80% utilization (with telephone traffic only), there will typically be around 100K calls in progress. With an average call holding time of 5 min (=300 s) per call, this means that a switch will have to process about 330 call setups and teardowns per second for a single link. Thus a switch with a 3K calls/s signaling capacity could not support 10 links, each with 10 Gbps transmission capacity, with the offered load described above. As terabit switching fabrics become available, we expect the signaling capacity of such systems to lag behind.
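
The arithmetic behind this estimate is easy to reproduce. The following minimal Python sketch (variable names are ours) uses the figures quoted above and recovers essentially the same per-link setup rate via Little's law.

    # Back-of-envelope estimate of per-link signaling load, using the figures above.
    max_calls = 128_000           # simultaneous uncompressed calls on a 10 Gbps link
    utilization = 0.80            # link run at 80% utilization
    mean_holding_time_s = 300.0   # average call holding time: 5 min

    calls_in_progress = utilization * max_calls             # ~100K calls in progress
    # Little's law: arrival (and completion) rate = calls in progress / holding time.
    setups_per_s = calls_in_progress / mean_holding_time_s  # roughly 330-340 setups/s

    print(f"{calls_in_progress:.0f} calls in progress, "
          f"about {setups_per_s:.0f} setups per second on one link")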

Naturally, signaling capacity of switches and routers will evolve along with link speeds and switching fabric capacities. However, we see the following motivation for our view of call processing capacity as a scarce resource:

  • Migration of traditional telephony services towards softswitch architectures promises to place new demands on broadband switches and routers. Control-plane interworking of softswitch components with traditional telephone equipment is complex and inevitably consumes time. Thus acceptable call setup latencies will be difficult to maintain.

  • Compression schemes for delay-sensitive traffic types (such as voice and video) are improving and are becoming more widely deployed. Of course, compressed streams require less transmission bandwidth than uncompressed streams. But the per-connection signaling requirement may not decrease (indeed, it may increase because of added sophistication).

  • Effective techniques for reducing signaling load could extend the useful lifetime of older switches and routers whose signaling capacity is inadequate for the emerging mix of services.


In this paper, connection caching will refer to delayed connection teardown: when a call completes, the connection is retained. Network resources associated with this cached connection are still reserved, but are unused. A new call request can be bound to a cached connection if (and only if)

  • (1)

    the network entry and exit points are the same as those of the initial request (i.e. the request for which the connection was originally set up), and

  • (2)

    the resource requirements are the same as those of the initial request on each of the switches and links involved.


In this paper, we will assume that any two connections that traverse a given link or switch have identical resource requirements on that link or switch. Thus, in our model, requirement 1 is the only prerequisite for reuse of a cached connection.
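
For concreteness, the reuse test can be written as a small predicate. This is an illustrative sketch only; the class and field names (CachedConnection, entry_point, exit_point, requirements) are ours, not taken from the paper.

    from dataclasses import dataclass

    @dataclass
    class CachedConnection:
        entry_point: str     # network entry point of the initial request (e.g. switch "A")
        exit_point: str      # network exit point of the initial request (e.g. switch "B")
        requirements: dict   # per-switch/per-link resource requirements of the initial request

    def can_reuse(conn: CachedConnection, entry_point: str, exit_point: str,
                  requirements: dict) -> bool:
        # Conditions (1) and (2) above: same entry/exit points, identical requirements.
        return (conn.entry_point == entry_point
                and conn.exit_point == exit_point
                and conn.requirements == requirements)

Under the simplifying assumption just stated (identical requirements for any two connections sharing a link or switch), the requirements comparison always succeeds, so only the entry/exit test matters.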

We will refer to the reuse of a cached connection to carry a new call as a cache hit. Each cache hit avoids the effort of a teardown and subsequent setup procedure. Referring to the network of Fig. 2, suppose a user at switch A calls a user at switch B and the associated connection is cached when the call completes. The next time a user at A calls a user at B, the call request will be bound to the existing connection. Establishing this binding requires some effort on the part of switches A and B (although arguably less than that involved in teardown and subsequent setup). However, toggling an A–B connection between active and cached states involves no state change on the link connecting switches C and D and therefore places no processing load on the controllers at C and D. This is the main benefit of caching: signaling load is reduced.

The main drawback of caching is that it tends to decrease throughput. Throughput is the expected number of calls accepted per unit time. A connection request that cannot be bound to any cached connection (e.g. an E–F call request in the network of Fig. 2) may be blocked because of the resources consumed by cached connections.

To summarize, the fundamental trade-off of connection caching is reduced signaling load (which is of course desirable) at the expense of reduced throughput.

A connection is an end-to-end allotment of network resources. When a connection is cached, these resources remain reserved, and associated with one another as an end-to-end entity. Thus connection caching is very different than other caching schemes, such as route caching (which is studied in [1], [8], for example). In route caching, a network device stores results of routing calculations in an internal table. Route caching schemes try to make intelligent decisions about when to (re-)compute routes and/or which routes will be discarded when storage constraints force such discards. Whether a network element keeps or discards a routing table entry has no effect on other network elements (at least, it has no direct effect). Tearing down a connection, on the other hand, typically changes the state of many network elements. Note also that route caching does nothing to reduce signaling overhead for setup and teardown procedures, but only addresses delay associated with route computation.

Our work is also significantly different from web caching. As is the case with connection caching, web caching schemes aim to reduce overhead by achieving high hit ratios. Moreover, many well-known schemes provide effective caching on the World Wide Web [4]. Thus one may ask whether we can, by treating each connection as an object, apply a web caching scheme to achieve our goal of reduced signaling load. That is, based on some metric, why not bump the lowest-valued connection from the pool of cached connections and cache a new connection with higher expected value? Web caching schemes (as exemplified by LRU and generalizations such as the LFU, GD-size and Hybrid policies [3], [5]) could conceivably be applied in the connection caching context. Substantial modification would be required, however.

First, unmodified, the web caching replacement policies mentioned above do not reduce signaling load at all, because of frequent bumping and replacing actions; this point stands out when we examine an LRU-like policy in Section 1.3. Second, current web caching schemes provide no intuition on how to determine the number of each connection type that should be cached: in web caching the cache buffer is fixed and an object has at most one copy in the cache. In connection caching, the “number to cache” is critical: each additional cached connection of one type means less available bandwidth (and thus potentially higher blocking probability and signaling rate) for other types of connections. The dynamic decision on how many connections to cache is therefore one of the keys to the trade-off between blocking and signaling. It is thus clear that connection caching and web caching address different scenarios. Our simulations show that our scheme has the same desirable properties in the connection caching scenario as LFU has in web caching, even though the starting points for the two schemes are very different. Specifically, our approach favors connection types that have higher arrival rates. Moreover, our scheme could easily be modified to take the size and value of connections into account, in the spirit of the GD-size and Hybrid policies for web caching.

To the best of our knowledge, connection caching has received very little attention in the literature. In a recent paper [11], Serbest et al. study the performance of a connection caching scheme with adaptive timeouts.

Lippmann’s 1975 paper [7] stimulated interest in the MDP framework as a tool in the analysis of queueing systems. Our analysis is similar to that supporting the use of trunk reservation in network routing mechanisms to increase revenue [6], [10].

Having motivated the use of caching, we turn to the question of what caching policy to employ. To that end, we ask

  • (1)

    Under what circumstances should an “outgoing” connection be cached?

  • (2)

    How long should an unused cached connection be retained?


In the next two sections, we will consider general state-dependent caching policies, analyze these policies using MDP machinery, and demonstrate that there are “good” policies with some simple structural properties. Before doing so, however, we discuss candidate caching schemes and argue that our study of state-dependent caching is worthwhile.

We emphasize that none of the schemes considered here set up cached connections prior to experiencing demand. That is, the only sort of decision that leads to the existence of a cached connection is a decision not to tear down a connection upon completion of the associated call.

  • (1)

    Always cache+LRU bumping of cached connections. As the nomenclature suggests, whenever a call completes, its connection is cached. For arrivals, the “flowchart” is as follows.

    • (a)

      An incoming call request is served using a cached connection with appropriate network entry and exit points if one is available.

    • (b)

      Otherwise, if sufficient idle capacity is available, a connection setup is performed in order to carry the call.

    • (c)

      Otherwise, we try to bump cached connections in least recently used (LRU) fashion to free sufficient resources for the incoming call request. If this last option fails, the call is blocked.

  • Note in particular that we do not preempt active connections in this policy (or in any other policy studied in this article).

  • On the surface, the “Always cache with LRU bumping” scheme has an appealing simplicity, and there is no throughput penalty under this policy. The problem with this policy is that it is ill suited to networks with distributed control. Referring to the network of Fig. 2, suppose an E–F connection request wants to “bump” an A–B cached connection. Then E must communicate its desire to A. Furthermore, A must notify E once the connection teardown procedure is complete. In a sense, this re-creates the signaling overhead that we hope to ameliorate with caching. Therefore we eliminate this approach from further consideration.

  • (2)

    Limit cache size. In this policy, we cache outgoing connections subject to a per origin–destination pair (per O–D pair) limit on the number of cached connections. This should limit the overall throughput penalty due to caching, while still reducing signaling load. The problem with this scheme is twofold:

    • (a)

      One must find appropriate settings for each of the O–D pair limits.

    • (b)

      This scheme is not responsive to changes in the distribution of traffic load among O–D pairs (unless the O–D pair limits are updated).

  • (3)

    Timeout-only policy. Always cache but place a timeout on each cached connection. This approach also limits the throughput penalty due to caching. We see three problems with this approach:

    • (a)

      Finding appropriate settings for timeout(s).

    • (b)

      Unless the timeouts are very well tuned, low-demand O–D pairs can experience disproportionately high blocking rates.

    • (c)

      Cache sizes for individual O–D pairs are allowed to vary widely, potentially increasing blocking and reducing stability.

  • (4)

    State-dependent caching with timeouts. Here we make each cache/release decision based on system state. The system state will include the number of cached and active connections for each O–D pair. In a complex network, the paths followed by these existing connections may also need to be included in the state information.

  • By introducing this class of policies, we are attempting to combine the virtues of policies 2 and 3 and generalize the framework for additional flexibility. The problem with this approach is that, due to the size of the state space, state-dependent policies can potentially be extremely complex. Our goal is to find simple design heuristics that produce good performance.
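
To make the fourth class of policies concrete, here is a minimal sketch of a state-dependent cache/release decision taken at call completion, with a timeout attached to anything that is cached. All names are ours, and the threshold function is deliberately left abstract: it stands for whatever state-dependent rule is chosen (the single-link analysis later in the paper leads to a specific linear rule).

    from collections import defaultdict
    from dataclasses import dataclass, field

    @dataclass
    class LinkState:
        # Per-type counts that, per the text, belong in the system state.
        n_active: dict = field(default_factory=lambda: defaultdict(int))
        n_cached: dict = field(default_factory=lambda: defaultdict(int))

    def on_call_completion(conn_type, state, timeout_s, cache_threshold):
        """State-dependent cache/release decision at call completion (scheme 4).
        cache_threshold(conn_type, state) is a placeholder for whatever
        state-dependent rule is chosen; anything cached is torn down after
        timeout_s if it is not reused."""
        if state.n_cached[conn_type] < cache_threshold(conn_type, state):
            return ("cache", timeout_s)   # retain the connection: reserved but unused
        return "tear_down"                # release the resources immediately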


We are seeking an approach that is suitable for networks with distributed control and that adapts well to changing traffic loads; the first scheme has already been eliminated. We argue that timeouts are necessary to achieve our goals, and that the second candidate scheme (limit cache size) should also be removed from consideration: for example, if an O–D pair becomes inactive for a period of time, any cached connections for that O–D pair remain in place throughout that period, regardless of its duration. To address this, the per O–D pair limits would have to be frequently updated, and such a scheme would become very difficult to administer.

We will return to the timeout-only policy (third in the list) when we discuss simulation results in Section 5.

Section snippets

A single-link model

We now focus our study on the performance of caching on a single link. We abstract the network context of the previous discussions as follows: each O–D pair will be represented by a distinct stream of arriving call requests. Each of these arrival streams will be regarded, in our single-link model, as originating from a distinct user type. To clarify this abstraction, we return to the example network of Fig. 2. Link C–D “sees” arrival streams from four user types. This is because four O–D pairs

Markov decision process approach

We now turn to the framework of Markov decision processes (MDPs). Our goal is to identify structural features of optimal state-dependent policies, and to design simple, near-optimal heuristic policies using this information. Because of the difficulty of solving constrained MDPs, the explicit signaling rate constraint of the previous section is replaced by a penalty term in the objective function (details will be forthcoming when we discuss the reward structure of our MDP). In our MDP model, the
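
Although the snippet above is truncated, the stated substitution of a penalty term for the explicit signaling-rate constraint suggests a long-run average objective of roughly the following form (the notation here is ours, not the paper's):

    \max_{\pi} \; \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_{\pi}\!\left[\, r\,A(T) \;-\; c\,S(T) \,\right],

where A(T) is the number of calls accepted in [0, T], S(T) is the number of setup and teardown operations performed, r is the reward per accepted call, and c is the penalty weight that stands in for the signaling-rate constraint of the previous section.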

Network model

In the single-link case, we can solve for the exact optimal policy. However, in a network the optimal policy is hard to obtain. We therefore propose a natural extension of our single-link caching policy: consider each link on a connection’s route independently. Upon connection departure, each link decides whether it should cache the connection or not based on its own state information, i.e. makes a decision based on a single link
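
Operationally, the per-link extension described above might look like the sketch below. The unanimity rule (keep the end-to-end connection cached only if every link on the route votes to cache it) is an assumption we make purely for illustration; the truncated text does not specify how disagreements between links are resolved.

    def on_network_call_completion(od_pair, route, link_states, timeout_s, wants_to_cache):
        """Network extension of the single-link rule.  link_states[link] holds that
        link's own local state (e.g. its per-type active and cached counts), and
        wants_to_cache(od_pair, link_state) is the single-link decision rule.
        Assumption (ours, for illustration): the end-to-end connection stays cached
        only if every link on the route votes to cache it."""
        votes = [wants_to_cache(od_pair, link_states[link]) for link in route]
        return ("cache", timeout_s) if all(votes) else "tear_down"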

Simulation

In order to evaluate the performance of the caching policy developed above, we ran two sets of simulations. Throughout this section, timeouts are deterministic; whenever more than one cached connection is available to serve an incoming call request, we choose the “oldest” connection, i.e. the connection that is closest to timing out.
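
The "oldest connection first" rule is simple to implement; one way (our choice of data structure, not the paper's) is to keep each O–D pair's cached connections in a heap keyed by expiry time, as sketched below.

    import heapq

    class CacheBucket:
        """Cached connections for one O-D pair under deterministic timeouts.  On a
        cache hit we hand out the connection closest to timing out (the "oldest"
        one, as described above)."""

        def __init__(self):
            self._heap = []   # entries: (expiry_time, connection_id)

        def add(self, connection_id, now, timeout_s):
            heapq.heappush(self._heap, (now + timeout_s, connection_id))

        def expire(self, now):
            # Cached connections whose timeout has elapsed are torn down.
            while self._heap and self._heap[0][0] <= now:
                heapq.heappop(self._heap)

        def hit(self, now):
            # Reuse the cached connection closest to timing out, if any remain.
            self.expire(now)
            if self._heap:
                _expiry, conn = heapq.heappop(self._heap)
                return conn   # bound to the incoming call: no setup signaling needed
            return None       # cache miss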

The first set of simulations corresponds to the single-link case with various parameter setups. The simulations are intended to illustrate that the linear threshold

Conclusions

In this article, we motivate connection caching and discuss possible schemes for deciding when to cache an “outgoing” connection and when to tear down an unused cached connection. We argue that state-dependent caching is necessary for best performance and propose a linear threshold caching policy (LTCP) that appears to be near-optimal in the single-link case. Through simulations, we verify that LTCP performs well in the network context, and is relatively insensitive to errors in parameter

Acknowledgements

This work is supported by NSF CAREER Award NCR 96-24230, by SBC Technology Resources, Inc., by Cingular Wireless and by an Intel Technology for Education 2000 equipment grant.


References (12)

  • M. Peyravian, Network path caching: issues, algorithms, and a simulation study, Computer Communications (1997).
  • G. Apostolopoulos et al., On reducing the processing cost of on-demand QoS path computation, Journal of High Speed Networking (1998).
  • J.W. Eaton et al., GNU Octave. Available from...
  • J. Wang, A survey of web caching schemes for the internet, ACM Computer Communication Review (1999).
  • G. Barish et al., World Wide Web caching: trends and techniques, IEEE Communications Magazine (2000).
  • C. Aggarwal et al., Caching on the World Wide Web, IEEE Transactions on Knowledge and Data Engineering (1999).
There are more references available in the full text version of this article.


Matthew Stafford is a principal member of technical staff at Cingular Wireless. His current research interests are in data networking, particularly data services in 3rd generation wireless networks. He holds a Ph.D. in Mathematics from Northwestern University and a Ph.D. in Operations Research from the University of Texas at Austin.

Xiangying Yang received his B.S. in Electrical Engineering from Tsinghua University, China, in 1998 and his M.S. in electrical and computer engineering (ECE) from the University of Texas at Austin in 2000. He is now a Ph.D. candidate in the ECE department at the University of Texas at Austin. His research interests include web caching, peer-to-peer applications and wireless sensor networks. He is a recipient of the Texas Telecommunications Engineering Consortium (TxTEC) Fellowship in 2000 and 2002 and a member of Tau Beta Pi.

Gustavo de Veciana received his B.S., M.S., and Ph.D. in electrical engineering from the University of California at Berkeley in 1987, 1990, and 1993 respectively. In 1993, he joined the Department of Electrical and Computer Engineering at the University of Texas at Austin where he is currently an Associate Professor. His research focuses on issues in the analysis and design of telecommunication networks. Dr. de Veciana has been an editor for the IEEE/ACM Transactions on Networking. He is the recipient of a General Motors Foundation Centennial Fellowship in Electrical Engineering and a 1996 National Science Foundation CAREER Award, and co-recipient of the IEEE Bill McCalla Best ICCAD Paper Award for 2000.

An early version of this work was presented at the Fifth INFORMS Telecommunications Conference, Boca Raton, FL, March 7, 2000.

