On the necessary and sufficient requirement of a CIOQ switch to emulate an Output Queued switch
Introduction
For the past few decades, there has been a sustained effort to compare Input Queued and Output Queued switches and to find compromises between them. The groundbreaking result of Karol and Hluchyj [1], which analytically placed the throughput of input buffering at 58.6%, illustrated the Head-of-Line (HOL) blocking characteristic of input buffering. Since then, attempts to improve this performance or to remove the HOL blocking have continued. Simple schemes such as looking ahead in the queues [2], [3], [4], channel grouping [5], [6], [7], [8], using a simple speedup factor [9], [10], [11], [12], [13], or using Virtual Output Queues [14], [15] have been used to improve the throughput of Input Queued switches. More complicated schemes, such as non-FIFO buffers [4], priority queueing [16], [17], and parallel or cascaded planes [18], [19], have also been presented. Scheduling methods such as iSLIP [15], Maximal Matching, PIM, Round Robin [15], [17], [20], [21], LQF (Longest Queue First), OCF (Oldest Cell First), and their variations [22], [23] have been introduced to reach 100% throughput.
These schemes do achieve 100% throughput, but only after the system has run for a long time or when the queues are saturated. Because of their limited performance over short time windows, possible unfairness or starvation among ports, and the possibility of large delays, there has been much interest in emulating the behavior of Output Queued switches with Input Queued switches.
An early result was reported by Prabhakar and McKeown [24], who showed that a Combined Input and Output Queued (CIOQ) switch with a speedup factor of only 4 can emulate an OQ switch. The importance of this result lies in its speedup requirement, which should be compared with the speedup requirement of an Output Queued switch of size n×n, which is n. Later, Stoica and Zhang [25] and, independently, Chuang et al. [26], [27] introduced other scheduling schemes and showed that a speedup of 2 in conjunction with their schemes is sufficient for a CIOQ switch to behave identically to an Output Queued switch. Work on this subject is ongoing [17], [23], [28].
In their widely cited papers [26], [27], Chuang et al. have also shown that in the “Average Sense” a speedup of 2 - 1/n is both necessary and sufficient for a CIOQ switch to emulate Output Queued switch behavior. In the “Average Sense”, the speedup requirement is measured as an average over different cell times.
This paper reports that in the “Strict Sense” a speedup of 2 is both necessary and sufficient. By the “Strict Sense”, we mean that the speedup is the same in all cell times, and we compute the minimum speedup that is required in any cell time. Using the same assumptions as in [26], [27] and examples for 2×2 and 3×3 switches, we first show that in the “Strict Sense” a speedup of 2 - 1/n is not sufficient to emulate the behavior of an Output Queued switch.
Also, using a constructed traffic pattern and the same assumptions as in [26], we show that in the “Strict Sense” a speedup of 2 is necessary to emulate the behavior of an Output Queued switch for any switch size n.
Combining this result with the previous scheduling schemes of Stoica and Zhang [25] or Chuang et al. [26], [27], we show that in the “Strict Sense” a speedup of 2 is both necessary and sufficient to emulate the behavior of an Output Queued switch using a Combined Input Output Queued switch [29].
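The speedup figures discussed above can be compared numerically with a small sketch. It assumes that the average-sense bound of Chuang et al. [26], [27] is 2 - 1/n for an n×n switch and that a pure Output Queued switch needs a speedup of n; the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def oq_speedup(n: int) -> Fraction:
    """Speedup of a pure Output Queued n x n switch (assumed to be n)."""
    return Fraction(n)


def average_sense_speedup(n: int) -> Fraction:
    """Average-sense bound, assumed here to be 2 - 1/n (Chuang et al.)."""
    return 2 - Fraction(1, n)


# Strict-sense requirement reported in this paper for any switch size.
STRICT_SENSE_SPEEDUP = Fraction(2)


def speedup_table(sizes):
    """Return (n, OQ, average-sense, strict-sense) tuples for each size."""
    return [
        (n, oq_speedup(n), average_sense_speedup(n), STRICT_SENSE_SPEEDUP)
        for n in sizes
    ]


if __name__ == "__main__":
    for n, oq, avg, strict in speedup_table([2, 3, 4, 16]):
        print(f"n={n:>2}  OQ={oq}  average-sense={avg}  strict-sense={strict}")
```

For the 2×2 and 3×3 switches used in the examples, the assumed average-sense values are 3/2 and 5/3, both strictly below the strict-sense requirement of 2.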
Additionally, when the condition of sending each packet through the switch as a single unit is relaxed and the packet may be segmented into smaller units, it is shown that in the “Strict Sense” the speedup requirement to emulate the behavior of an Output Queued switch can be reduced. For this case, a lower bound of 3/2 and an upper bound of 2 are proved.
Finally, again in the “Strict Sense”, it is proved that as the switch size n approaches infinity, and even if the segment size becomes infinitesimally small, the speedup requirement is a non-decreasing function of n and converges to a value between 3/2 and 2.
The paper is organized as follows. In Section 2, the insufficiency of a speedup of 2 - 1/n in the “Strict Sense” is demonstrated using examples for 2×2 and 3×3 switches. In Section 3, the necessary condition for 2×2 switches is derived. Using this result, in Section 4 a worst-case traffic pattern is constructed to prove that in the “Strict Sense” a speedup of 2 is necessary for any switch size n. Section 5 demonstrates the improvement in speedup, and its limitations, when the segmentation of packets into smaller units is allowed. Finally, Section 6 concludes the paper.
Section snippets
Insufficiency argument: examples
In the “Strict Sense”, the following assumptions are used which are the same as those used in [26].
- (1) Packets are of the same size.
- (2) Packets arriving in a given timeslot cannot leave before the start of the next timeslot. In other words they need to be completely received before they can start to be transmitted.
- (3) Packet transmission time is the timeslot length divided by the internal speedup factor.
- (4) If speedup factor is less than 2, two complete packets cannot leave from the same input port or arrive
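Assumptions (3) and (4) can be illustrated with a minimal sketch. It assumes that the number of complete packets a port can transfer within one timeslot is the integer part of the speedup factor; the function names are hypothetical and not taken from the paper.

```python
import math
from fractions import Fraction


def transmission_time(timeslot, speedup):
    """Assumption (3): packet transmission time = timeslot / speedup."""
    return Fraction(timeslot) / Fraction(speedup)


def complete_packets_per_slot(speedup):
    """Whole packets that fit through one port in a single timeslot.

    Illustrative reading of assumption (4): with a speedup below 2,
    only one complete packet can be transferred per timeslot.
    """
    return math.floor(Fraction(speedup))


if __name__ == "__main__":
    for s in (Fraction(3, 2), Fraction(5, 3), Fraction(2)):
        print(
            f"speedup={s}: transmission time={transmission_time(1, s)} "
            f"timeslots, complete packets per slot={complete_packets_per_slot(s)}"
        )
```

Exact fractions are used instead of floating-point numbers so that values such as 5/3 are compared without rounding error.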
Main idea: necessary condition for 2×2 switches
The question now arises: what is the necessary speedup for a 2×2 switch in the “Strict Sense”? The following theorem answers this question.
Theorem 1. In the “Strict Sense”, a speedup of 2 is necessary for a 2×2 switch to emulate the OQ behavior.
Proof. The proof is based on the systematic traffic pattern depicted in Fig. 5(a), from which the following observations are made. Let us assume that the speedup factor is less than 2, shortly it will
Generalizing the necessary condition for any switch size
It is now possible to create a general traffic pattern for any switch size, as shown in Fig. 6(a). Using this traffic pattern, it is shown that in the “Strict Sense” the same speedup factor of 2 is necessary for any switch size. In fact, as shown in Fig. 6(a), for this traffic pattern it is enough to use only two of the input ports. (Please note that we can actually use the same traffic pattern of Fig. 5(a) for an n×n switch and assume that only 2 of the input ports and
Speedup requirements when packet segmentation is allowed
Until now it has been assumed that packets are switched intact, without segmentation (the 5th assumption of Section 2).
It will be shown that relaxing this condition, in other words segmenting each packet into smaller fragments and allowing each fragment to be switched independently, reduces the speedup requirement of the switch.
This raises a question:
- (a) “What is the speedup requirement for a given fragment size f and/or switch size n?”
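The effect of segmentation on what fits into a single timeslot can be sketched as follows. This is only an illustration, not the paper's analysis: it assumes that a fragment's transmission time scales linearly with its size, ignores any per-fragment overhead, and uses a hypothetical parameter `fragments_per_packet` rather than the paper's fragment size f.

```python
import math
from fractions import Fraction


def fragment_time(timeslot, speedup, fragments_per_packet):
    """Transmission time of one fragment, assuming linear scaling with size."""
    return Fraction(timeslot) / (Fraction(speedup) * fragments_per_packet)


def fragments_per_slot(speedup, fragments_per_packet):
    """Whole fragments that fit through one port in a single timeslot."""
    return math.floor(Fraction(speedup) * fragments_per_packet)


if __name__ == "__main__":
    s = Fraction(3, 2)
    for k in (1, 2, 4):
        print(
            f"speedup={s}, fragments per packet={k}: "
            f"fragment time={fragment_time(1, s, k)} timeslots, "
            f"fragments per slot={fragments_per_slot(s, k)}"
        )
```

Under these assumptions, a speedup of 3/2 moves only one intact packet per timeslot, but three half-packet fragments (one and a half packets' worth of data), which is the kind of gain that segmentation makes available.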
Conclusion
In this paper, by introducing a constructive method of creating a traffic pattern and using the same assumptions as in [26], it is shown that in the “Strict Sense”, for a switch of any size, a speedup factor of 2 is necessary to emulate the behavior of an Output Queued switch using a Combined Input Output Queued switch (assuming that packets are switched intact inside the switch).
Combining this result with the previous schemes of [25], [26], in the “Strict Sense”, makes the speedup
References (30)
- et al. Scheduling policies for CIOQ switches. Elsevier Journal of Algorithms (2006)
- et al. Input versus output queueing on a space-division packet switch. IEEE Transactions on Communications (1987)
- et al. Queueing in high-performance packet switching. IEEE Journal on Selected Areas in Communications (1988)
- et al. Performance analysis of a growable architecture for broad-band packet (ATM) switching. IEEE Transactions on Communications (1992)
- et al. Performance study of an input and output queueing ATM switch with a window scheme and a speed constraint. Springer Journal of Telecommunication Systems (1996)
- Multichannel bandwidth allocation in a broadband packet switch. IEEE Journal on Selected Areas in Communications (1988)
- M.J. Karol, K.Y. Eng, H. Obara, Improving the performance of input-queued ATM packet switches, IEEE, INFOCOM ’92....
- et al. A nonblocking architecture for broadband multichannel switching. IEEE/ACM Transactions on Networking (TON) (1995)
- et al. On the performance of an ATM switch with multi-channel transmission groups. IEEE Transactions on Communications (1993)
- Y. Oie, M. Murata, K. Kuota, H. Miyahara, Effect of speedup in nonblocking packet switch, IEEE International Conference...