1 Introduction

Many applications benefit from an accurate view of flow-level traffic statistics in a software-defined network (SDN). For example, many data centers schedule flow routing dynamically based on traffic statistics [1]. An accurate global view of individual flows’ traffic also helps to achieve better QoS [2], such as higher throughput and lower latency. Moreover, traffic statistics are important for network management tasks, such as accounting management and performance management [3,4,5].

There are three main approaches to traffic measurement: TCAM-based counting [6, 7], packet sampling [8, 9], and sketches [10, 11]. However, due to the limited size of TCAM and SRAM on a commodity SDN switch [12], both TCAM-based counting and packet sampling can only achieve coarse-grained traffic measurement [13]. Although measurement accuracy can be improved by increasing the TCAM size or the sampling rate, the memory/computing resource usage will increase dramatically and pose scalability issues, especially in high-speed networks [14]. In contrast, sketches can provide fine-grained traffic measurement for individual flows. Unlike the other two measurement solutions, sketches can summarize traffic statistics of all packets using compact data structures with fixed-size memory, while incurring only bounded estimation errors [14]. Many sketch-based solutions have been proposed in the literature to address different measurement requirements [10, 11].

Unfortunately, it is challenging to deploy sketches in practice for two reasons. 1) Sketches (e.g., Count-Min [15]) are only primitives and usually cannot provide the flow ID, which restricts their usage for network management. Instead, they should be supplemented with additional operations to fully support a measurement task. 2) Sketches are often task-specific, and a sketch only measures one simple object, such as heavy hitters or traffic changes. To implement various measurement requirements, the SDN controller usually requires different types of statistics from switches. For example, OpenSketch [13] deploys Count-Min, k-ary, and Bloom filter sketches on each switch to implement various applications, e.g., heavy hitter detection, traffic change detection, and traffic counting.

Besides, if a packet is measured/processed by all the sketches at a switch upon its arrival, the total measurement overhead per packet may be massive, not to mention that today’s networks require very high throughput (e.g., beyond 10 Gbps). Moreover, OpenFlow capable switches usually have only limited processing power [16], which needs to be shared by different operations, e.g., rule insertion/deletion/modification, statistics collection, and traffic measurement. The testing results in [12] have shown that the switch can complete only 275 flow setups per second even without any traffic load. If traffic measurement on sketches costs 50% CPU utilization, the switch can only complete 137 flow setups per second, let alone forward packets.

Therefore, it is important to perform efficient traffic measurement with limited per-switch computing overhead, so that basic rule operations on each switch are not interfered with, especially in high-speed networks. Though some works, e.g., [17], use the hash value of a packet’s 5-tuple to distribute traffic measurements among switches, this approach requires all packets to carry their ingress-egress pairs, which are not available in practical networks [18], and also incurs a massive memory cost for auxiliary information.

Inspired by the fact that a flow will pass through several switches from its source to destination, we propose to solve the problem by determining which sketches (not all sketches) will be turned on (or be enabled) at each switch such that a flow can be measured by all required sketches in a distributed manner without violating per-switch computing capacity constraint. Our objective is to derive the statistics information of more flows processed by each kind of sketch so as to draw a more accurate view of traffic statistics. Specifically, we consider the optimization of proportional fairness, so that each kind of sketch can measure many flows without unduly restricting the number of flows measured by other sketches. This optimization metric has been widely adopted in different applications, such as association control [19], rate control [20, 21], and resource allocation [22]. We should note that this paper does not design a new sketch for traffic measurement, but focuses on efficient sketch configuration among all switches. The main contributions of this paper are:

  1. We formulate the spatial sketch configuration problem with a per-switch computing capacity constraint and present a case study: the optimal sketch configuration with per-switch computing capacity constraint for proportional fairness (SCP) problem. The complexity of this problem is analyzed.

  2. We then present an efficient algorithm with approximation ratio 1/3 based on the greedy 0–1 knapsack method, and analyze the time complexity of the proposed algorithm.

  3. Extensive simulation results show that our proposed algorithm can measure traffic of 44%–91% more flows compared with the existing solution. Besides, our proposed algorithm achieves a good tradeoff between CPU resource usage and the number of measured flows.

The rest of this paper is organized as follows. Section 2 introduces the network/sketch models and formulates the SCP problem. We propose an efficient algorithm for SCP and give the detailed performance analysis in Sect. 3. The simulation results are reported in Sect. 4. We conclude the paper in Sect. 5.

2 Preliminaries

In this section, we first introduce some typical sketches and motivate the problem by presenting a microbenchmark on software implementations of these sketches. Then we illustrate our solution with an intuitive example. Finally, we formally define the spatial sketch configuration problem and present a case study of optimal sketch configuration for proportional fairness.

2.1 Testing Results for Sketches’ Overhead

This section tests each sketch’s computing overhead per packet on Open vSwitch (OVS) [23], using the average number of CPU cycles per packet (CPU cycles for short) as the metric. In our experiment, OVS runs on a VMware virtual machine with 1 GB of RAM and a CPU frequency of 3.7 GHz, and the sketches are integrated with OVS. Since packet forwarding through OVS also consumes CPU resources, we disable packet forwarding during the test so that it does not affect the measured sketch overhead. We test the per-packet measurement overhead of six sketches: Bloom Filter, Count-Min, k-ary, Cold Filter, CountSketch, and MRAC. Each of these sketches has 5 rows and 2000 columns, except for Bloom Filter, which has 1 row and 2000 columns. For each sketch, we run a one-minute test and count the number of packets measured by this sketch, denoted as g. Let G be the CPU frequency. Then the average number of CPU cycles per measured packet is \(\frac{60 \cdot G}{g }\). The number of CPU cycles for the six kinds of sketches is listed in Table 1. The measurement overhead of CountSketch (350) is much higher than that of Count-Min (78), because CountSketch needs additional heap operations besides updating.
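The per-packet cost formula above can be written as a small helper. This is a minimal sketch: the function name and the `duration_s` parameter are illustrative, and it assumes, as the setup with forwarding disabled implies, that the CPU spends the whole test interval on sketch updates.

```python
def cycles_per_packet(cpu_hz, packets_measured, duration_s=60):
    """Average CPU cycles spent per measured packet.

    cpu_hz           -- CPU frequency G in cycles per second
    packets_measured -- number g of packets measured during the test
    duration_s       -- test length in seconds (60 in the setup above)
    """
    if packets_measured <= 0:
        raise ValueError("packets_measured must be positive")
    return duration_s * cpu_hz / packets_measured


# Hypothetical reading: 2.22 billion packets measured in one minute at 3.7 GHz.
print(cycles_per_packet(3_700_000_000, 2_220_000_000))
```

The packet count in the usage line is made up for illustration; it is not a value from Table 1.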

Table 1. Number of CPU cycles for per packet’s measurement

2.2 A Motivation Example for Sketch Configuration

In this section, we give an example to motivate our solution. As shown in Fig. 1, there are three flows \(\{\gamma _1, \gamma _2, \gamma _3\}\) in the network. We assume that the arrival rate of each flow is 10 packets per second. The controller needs to deploy two sketches \(\{s_1,s_2\}\) for all three flows. For simplicity, the measurement overhead of each sketch per packet is taken as 1 unit. In the left plot of Fig. 1, both sketches are enabled on each switch. As a result, the measurement overhead on switches \(v_1, v_3, v_4\), and \(v_5 \) is 40 packets per second. Through proper sketch configuration, each switch enables at most one sketch, as illustrated in the right plot of Fig. 1. Therefore, only 4 enabled sketches are needed to measure flows \(\gamma _1\), \(\gamma _2\), and \(\gamma _3\), and the maximum measurement overhead among all switches is reduced. Specifically, the measurement overhead on switches \(v_1, v_3, v_4\), and \(v_5 \) is reduced to 20 packets per second, as described in Table 2.

Fig. 1. An example of sketch configuration. Left plot: without sketch configuration; right plot: with proper sketch configuration.

Motivated by this example, we should enable a subset of sketches on each switch for efficient traffic measurement. To this end, there are two different approaches. One is sketch placement [24]: the controller places only the proper (enabled) sketches on switches. The other is sketch configuration: all sketches are pre-deployed on a switch, and the state of each sketch (on or off) can be dynamically configured by the controller. The first approach is memory efficient, since each switch keeps only a subset of all sketches. However, network traffic is dynamic and ever-changing [18], and the sketch placement needs to be adjusted in an online manner based on the observed traffic statistics to meet the capacity constraints. Sketch placement, unfortunately, hinders such adjustment, as the sketch software needs to be recompiled before it can work again.

Table 2. Illustration of measurement overhead on switches

2.3 General Optimization of Sketch Configuration

An SDN typically consists of a logically-centralized controller and a set of switches, \(V=\{v_1, ..., v_n\} \), with \( n = |V| \). These switches comprise the data plane of an SDN. Thus, the network topology of the data plane can be modeled by G. Note that the logical controller may be a cluster of distributed controllers [25], which help to balance the control overhead among these controllers. Since we focus on the per-switch measurement overhead, the number of controllers will not significantly impact this metric. For simplicity, we assume that there is only one controller.

We now define the spatial sketch configuration problem. Under the general SDN framework, we denote the set of flows based on origin-destination (OD) pairs as \(\varGamma =\{\gamma _1,...,\gamma _m \}\), with \(m=|\varGamma |\). For simplicity, we assume that each flow is forwarded through only one path. With the advantage of centralized control in an SDN, the controller knows all the rules installed on switches, and thus knows the route path of each OD-pair/flow. Let \(P_{\gamma _k} \) denote the path of flow \(\gamma _k\). The set of flows (or pairs) passing through switch \(v_i\) is denoted as \(\varGamma _i\). In an SDN, each switch can count the number of forwarded packets through port statistics, and we can derive the total number of forwarded packets on a switch by adding them together. We denote the set of packets passing through switch \(v_i\) as \(\mathcal {P}_i\).

To fulfill different application requirements, we assume that each switch has deployed a set of q sketches, denoted as \(\mathcal {S} = \{s_1,...,s_q \}\). These sketches are able to measure traffic for different objects. For sketch \(s_j\), its measurement/computing overhead per packet is denoted as \(c(s_j)\), which can be measured by the number of CPU cycles. We will configure the status (on or off) of each sketch on a switch. If the status of a sketch is on, we call this sketch “enabled”. To avoid additional control on switches, we assume that each arriving packet is measured by all enabled sketches on this switch. The resulting total measurement overhead on each switch \(v_i \) is denoted as \(c(v_i) \). Due to the limited computing power of each switch, we reserve a given fraction (e.g., 30%) of its computing capacity for traffic measurement. Thus, the computing overhead for traffic measurement on switch \(v_i\) should not exceed the threshold \(C_i \).
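The overhead model in this paragraph can be sketched as follows. The function and parameter names are illustrative. The usage lines reuse the numbers of Sect. 2.2, under the assumption, inferred from the quoted overheads, that 20 packets per second (two flows of 10 packets per second) reach the switch.

```python
def switch_overhead(enabled, cost_per_packet, packet_rate):
    """Total measurement overhead c(v_i) of one switch.

    Every arriving packet is processed by every enabled sketch, so the
    overhead is the packet rate times the summed per-packet sketch costs.
    """
    return packet_rate * sum(cost_per_packet[s] for s in enabled)


def fits_budget(enabled, cost_per_packet, packet_rate, threshold):
    """True if the overhead stays within the reserved capacity C_i."""
    return switch_overhead(enabled, cost_per_packet, packet_rate) <= threshold


unit_cost = {"s1": 1, "s2": 1}                         # 1 unit per packet, as in Sect. 2.2
print(switch_overhead({"s1", "s2"}, unit_cost, 20))    # both sketches enabled
print(switch_overhead({"s1"}, unit_cost, 20))          # one sketch enabled
```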

In many applications, such as flow spread estimation and traffic change detection, it is desirable to measure as many flows as possible. In the following, if all packets of a flow are measured by sketch \(s_j\), we say that this flow is covered by sketch \(s_j\). \(f(\gamma _k)\) denotes the value of an attribute (e.g., traffic size) of flow \(\gamma _k\), and \(H_j\) denotes the lower bound on the total attribute value of the flows measured by sketch \(s_j\). In Sect. 2.4, we study a special case of optimal sketch configuration for proportional fairness for better network performance.

Accordingly we formalize the spatial sketch configuration problem as follows:

$$\begin{aligned} \max \ G(H_1,...,H_q) \end{aligned}$$
(1)

where \(x^j_i \) denotes the result of sketch configuration by the controller. That is, if \(x^j_i =1\), the sketch \(s_j\) is enabled on switch \(v_i\). Otherwise, its status is off. The first set of inequalities denotes whether flow \(\gamma _k\) is covered (\(z_k^j=1\)) by sketch \(s_j\) or not. The second set of inequalities ensures that the attribute of measured flows (e.g., the number of measured flows or the traffic amount of measured flows) by sketch \(s_j\) is not less than \(H_j \), where \(f(\gamma _k)\) represents the attributes of measured flows, e.g., traffic information. For example, if we let \(f(\gamma _k)\) be 1, this set of inequalities means that sketch \(s_j \) can measure at least \(H_j\) flows. The third set of inequalities means the cost on each switch \(v_i\) should not exceed its computation threshold \(C_i\). The objective is to maximize the function \(G(H_1,...,H_q)\), which refers to the attributes of measured flows.
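The three constraint sets that the paragraph above refers to do not appear in this text. One way to write them that is consistent with the definitions of this section is sketched below. The exact form is an assumption rather than a quotation, in particular the sum over the switches on path \(P_{\gamma _k}\) and the expression for \(c(v_i)\), which follows the per-switch cost \(c(s_j) \cdot |\mathcal {P}_i|\) used in Sect. 3.1.

$$\begin{aligned} \text {s.t.} \quad&z_k^j \le \sum \nolimits _{v_i \in P_{\gamma _k}} x_i^j,&\forall \gamma _k \in \varGamma ,\ s_j \in \mathcal {S} \\&\sum \nolimits _{\gamma _k \in \varGamma } z_k^j \cdot f(\gamma _k) \ge H_j,&\forall s_j \in \mathcal {S} \\&c(v_i) = \sum \nolimits _{s_j \in \mathcal {S}} x_i^j \cdot c(s_j) \cdot |\mathcal {P}_i| \le C_i,&\forall v_i \in V \\&x_i^j,\ z_k^j \in \{0,1\},&\forall v_i, s_j, \gamma _k \end{aligned}$$

Under this reading, a flow counts as covered by \(s_j\) as soon as one switch on its path enables \(s_j\), since every packet arriving at that switch is measured by all enabled sketches.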

2.4 A Case Study of Optimal Sketch Configuration for Proportional Fairness

In this section, we consider a special case (aiming to make an optimal sketch configuration for proportional fairness) of the general sketch configuration.

Based on the basic description in Sect. 2.3, after sketch configuration, the total number of flows covered by sketch \(s_j\) is denoted as \(\beta _j \). We set \(f(\gamma _k)\) to 1 and \(H_j\) to \(\beta _j\). Due to the CPU resource constraint on commodity SDN switches, we may not be able to cover all flows. One natural objective is to maximize the minimum number of covered flows among all sketches, i.e., max-min fairness. This fairness criterion is simple, but may reduce the total number of flows covered by all these sketches. Thus, our goal is an optimal sketch configuration in a proportionally fair manner [19, 26]: the configuration allows each sketch to measure enough flows without unduly restricting the number of flows measured by other sketches, i.e., \(\max \sum \nolimits _{j=1}^q \log \ {\beta _j } \).

Following Eq. (1), we formalize the sketch configuration problem with limited CPU capacity for proportional fairness (SCP). The objective function of proportional fairness follows the definition in [26].

$$\begin{aligned} \max \ \ \sum \nolimits _{j=1}^q \log \ {\beta _j } \end{aligned}$$
(2)

The definition of \(x^j_i \) is the same as that in Eq. (1). The first set of inequalities denotes whether flow \(\gamma _k\) is covered (\(z_k^j=1\)) by sketch \(s_j\) or not. The second set of inequalities ensures that at least \(\beta _j\) flows will be covered by sketch \(s_j\). The third set of inequalities means that the cost on each switch \(v_i\) should not exceed \(C_i\), where \(C_i\) is the computing capacity reserved for traffic measurement on switch \(v_i\). The objective is to optimize the proportional fairness among all sketches.
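To make the objective concrete, the following sketch evaluates \(\beta _j\) and \(\sum _j \log \beta _j\) for a given configuration. The data layout (a dict of paths and a dict of enabled sketch sets) and the function names are illustrative choices. It assumes single-path routing as in Sect. 2.3, so a flow is covered by a sketch if any switch on its path enables that sketch.

```python
import math


def covered_counts(paths, enabled):
    """Number of flows covered by each sketch (beta_j).

    paths   -- {flow: [switches on its single path]}
    enabled -- {switch: set of sketches turned on at that switch}
    """
    sketches = set().union(*enabled.values())
    beta = {s: 0 for s in sketches}
    for path in paths.values():
        for s in sketches:
            if any(s in enabled.get(v, set()) for v in path):
                beta[s] += 1
    return beta


def proportional_fairness(beta):
    """Objective of Eq. (2); every beta_j must be at least 1."""
    return sum(math.log(b) for b in beta.values())


paths = {"f1": ["v1", "v2"], "f2": ["v2", "v3"]}
enabled = {"v1": {"s2"}, "v2": {"s1"}}
beta = covered_counts(paths, enabled)
print(beta, proportional_fairness(beta))
```

Because of the logarithm, a configuration that leaves some sketch with very few covered flows is penalized heavily, which is what discourages starving one sketch in favor of another.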

Theorem 1

The SCP problem defined in Eq. (2) is NP-hard.

Proof

Assume that there is only one switch in the network. Then only some of the sketches can be selected to cover the flows passing through the switch under the CPU constraint. That is, the problem becomes a 0–1 knapsack problem [27], which is NP-hard. Since the single-switch case is a special case of our problem, the SCP problem is NP-hard too.

3 Algorithm Design of Proportional Fairness

Since the SCP problem is NP-hard, it cannot be solved optimally in polynomial time unless P = NP. In this section, we propose an efficient approximation algorithm and analyze its approximation factor.

figure a

3.1 Algorithm Design

In this section, we present a sketch configuration (SCK) algorithm based on 0–1 knapsack [27] for proportional fairness. The detailed description of the SCK algorithm is given in Algorithm 1. Initially, the profit of each sketch on switch \(v_i\) is \(\log |\varGamma _i|\) (Line 4), where \(\varGamma _i\) denotes the set of flows passing through switch \(v_i\). The cost of each sketch \(s_j\) on switch \(v_i\) is denoted as \(c(s^i_j)=c(s_j) \cdot |\mathcal {P}_i| \), where \(c(s_j)\) is the computing overhead per packet and \(\mathcal {P}_i\) denotes the set of packets passing through switch \(v_i\). The algorithm proceeds in iterations, each of which is divided into two steps. In the first step, for each switch, we compute the maximum profit using the greedy 0–1 knapsack method [27] under the computing cost constraint (Line 10). Then, we choose the switch with the maximum profit, denoted as \(v_i\), determine the set of enabled sketches on switch \(v_i\), and update the set of flows covered by these sketches. In the second step, we update the profit of each sketch \(s_j\) on switch \(v_i\) as \(p(s^i_j)=\log |\varGamma _i \cup \overline{\varPi }_j | - \log |\overline{\varPi }_j | \) (Line 19), where \(\overline{\varPi }_j\) denotes the current set of flows covered by sketch \(s_j\).
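The iteration described above can be sketched in Python as follows. Algorithm 1 itself is not reproduced in this text, so the loop structure is an assumption: each switch is processed exactly once, ties are broken by switch name, and a sketch is offered to a switch only if it would cover at least one new flow there. All function and variable names are illustrative.

```python
import math


def gain(flows, covered):
    """Profit log|flows ∪ covered| - log|covered| (log|flows| if nothing is covered yet)."""
    base = math.log(len(covered)) if covered else 0.0
    return math.log(len(flows | covered)) - base


def greedy_select(profits, costs, capacity):
    """Pick sketches in decreasing profit/cost order while they fit the capacity."""
    order = sorted(profits, key=lambda j: profits[j] / costs[j], reverse=True)
    chosen, used = [], 0.0
    for j in order:
        if used + costs[j] <= capacity:
            chosen.append(j)
            used += costs[j]
    return chosen


def sck(flows_at, packets_at, cost_per_packet, capacity):
    """Return ({switch: enabled sketches}, {sketch: covered flows})."""
    covered = {j: set() for j in cost_per_packet}
    config, remaining = {}, set(flows_at)
    while remaining:
        best = None
        for i in sorted(remaining):
            # Step 1: per-switch knapsack over sketches that add new flows.
            cand = [j for j in cost_per_packet if flows_at[i] - covered[j]]
            profits = {j: gain(flows_at[i], covered[j]) for j in cand}
            costs = {j: cost_per_packet[j] * packets_at[i] for j in cand}
            chosen = greedy_select(profits, costs, capacity[i])
            total = sum(profits[j] for j in chosen)
            if best is None or total > best[0]:
                best = (total, i, chosen)
        _, i, chosen = best
        # Step 2: fix the best switch and update the covered sets; the
        # profits of the other switches are recomputed in the next round.
        config[i] = chosen
        for j in chosen:
            covered[j] |= flows_at[i]
        remaining.remove(i)
    return config, covered


flows_at = {"v1": {"f1"}, "v2": {"f1", "f2"}, "v3": {"f2"}}
packets_at = {"v1": 10, "v2": 20, "v3": 10}
config, covered = sck(flows_at, packets_at, {"s1": 1.0, "s2": 1.0},
                      {"v1": 20, "v2": 20, "v3": 20})
print(config, covered)
```

In this toy instance, the middle switch can afford only one sketch, so the second sketch is enabled on the two edge switches, and both sketches end up covering both flows.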

3.2 Greedy Method for the 0–1 Knapsack Problem [28]

As described above, the 0–1 knapsack algorithm is a core module of the SCK algorithm. For each switch \(v_i \), we regard each sketch \(s_j\) as an item, whose profit and cost are denoted as \(p(s^i_j)\) and \(c(s^i_j) \), respectively. Our objective is to maximize the total profit of the selected items under a total cost constraint \(C_i \). The knapsack algorithm first computes the profit-cost ratio of each sketch \(s_j\) as \(\delta (s_j^i) = \frac{p(s_j^i)}{c(s_j^i) }\). Then, we sort all the sketches in decreasing order of their profit-cost ratios. Finally, we check the sketches in this order and select each one that still fits within the cost constraint. The formal algorithm is described in Algorithm 2.

figure b
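A minimal sketch of the ratio-based greedy method follows. Algorithm 2 is not reproduced in this text, so the details are assumptions. In particular, the final comparison against the best single item is the standard safeguard that gives the greedy method its 1/2 guarantee for 0–1 knapsack; whether Algorithm 2 includes this step is not stated here.

```python
def greedy_knapsack(items, capacity):
    """Greedy 0-1 knapsack.

    items    -- {name: (profit, cost)} with cost > 0
    capacity -- total cost budget (C_i for a switch)
    Returns the list of selected item names.
    """
    # Sort by decreasing profit-cost ratio.
    order = sorted(items, key=lambda k: items[k][0] / items[k][1], reverse=True)
    chosen, used, total = [], 0.0, 0.0
    for name in order:
        profit, cost = items[name]
        if used + cost <= capacity:
            chosen.append(name)
            used += cost
            total += profit
    # Safeguard (assumption): fall back to the most profitable single item
    # that fits, if it alone beats the ratio-ordered selection.
    fitting = [k for k in items if items[k][1] <= capacity]
    if fitting:
        best = max(fitting, key=lambda k: items[k][0])
        if items[best][0] > total:
            return [best]
    return chosen


print(greedy_knapsack({"a": (6, 2), "b": (5, 5), "c": (4, 4)}, 6))
print(greedy_knapsack({"x": (2, 1), "y": (10, 10)}, 10))
```

The second call shows why the safeguard matters: ratio order alone would pick the cheap item and then have no room left for the far more profitable one.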

3.3 Performance Analysis

We first prove a simple result, which will be used in the performance analysis of the SCK algorithm. Assume that Y and Z are arbitrary non-empty sets.

Lemma 1

\(\log |Y_1 \cup Z| + \log |Y_2 \cup Z| \ge \log |Y \cup Z| + \log |Z|\), where \(Y_1 \cup Y_2 = Y \), and \( Y_1 \cap Y_2 = \phi \).

Proof

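Let \(|Z| = z\), and let \(y_1\) and \(y_2\) be the numbers of elements of \(Y_1\) and \(Y_2\), respectively, that are not in Z, so that \(|Y_1 \cup Z| = z + y_1\) and \(|Y_2 \cup Z| = z + y_2 \). Since \(Y_1\) and \(Y_2\) are disjoint, we have \(|Y \cup Z| = z + y_1 + y_2 \). Expanding the product gives \((z + y_1) \cdot (z + y_2) = z \cdot (z + y_1 + y_2) + y_1 \cdot y_2 \ge z \cdot (z + y_1 + y_2)\). Taking logarithms of both sides, we obtain \(\log |Y_1 \cup Z| + \log |Y_2 \cup Z| \ge \log |Y \cup Z| + \log |Z|\).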

Now, we give the following lemma. Given a set Y, we consider a partition \(\{Y_1,...,Y_n \}\) of set Y. That is, \(Y_1 \cup ... \cup Y_n = Y \), and \( Y_i \cap Y_j = \phi \), \(\forall i \ne j\).

Lemma 2

\(\sum \nolimits _{i} (\log |Y_i \cup Z| - \log |Z| ) \ge \log |Y \cup Z| - \log |Z| \).

Proof

We prove this lemma by induction on n. When \(n=1 \), the lemma holds trivially, and when \(n=2 \), it follows from Lemma 1. We assume that the lemma holds for any \(n \le k\). Now, we consider the case \(n = k + 1 \).

$$\begin{aligned} \sum \nolimits _{i=1}^{k+1} (\log |Y_i \cup Z| - \log |Z| )&= \sum \nolimits _{i=1}^{k-1} (\log |Y_i \cup Z| - \log |Z|) + (\log |Y_k \cup Z| - \log |Z|) + (\log |Y_{k+1} \cup Z| - \log |Z|) \nonumber \\&\ge \sum \nolimits _{i=1}^{k-1} (\log |Y_i \cup Z| - \log |Z| ) + (\log |(Y_k \cup Y_{k+1}) \cup Z| - \log |Z|) \nonumber \\&\ge \log |( Y_1 \cup ... \cup Y_{k+1}) \cup Z| - \log |Z|\\&= \log |Y \cup Z| - \log |Z| \nonumber \end{aligned}$$
(3)

Note that the first and second inequalities follow from the induction hypothesis with \(n=2\) and \(n=k\), respectively.

Let \(Q_i\) and \(Q'_i\) be two vectors of flow sets passing through switch \(v_i\) as follows: \(Q_i=[Q_{i,1},...,Q_{i,q}]\) and \(Q'_i=[Q'_{i,1},...,Q'_{i,q}]\), where \(Q_{i,j}\) and \(Q'_{i,j}\) are both sets of flows covered by sketch \(s_j\) on switch \(v_i\). We define the profit of \(Q_i \) as \(\omega (Q_i )= \sum \nolimits _{j=1}^{q} \log | Q_{i,j}| \). For simplicity, we also define a vector operation as follows:

$$\begin{aligned} \omega (Q_i \uplus Q'_i) = \sum \nolimits _{j=1}^{q} \log |Q_{i,j} \cup Q'_{i,j} | \end{aligned}$$
(4)

The incremental profit from vectors \(Q'_i\) to \(Q_i\) is:

$$\begin{aligned} \omega (Q_i \backslash Q'_i)&= \omega (Q_i \uplus Q'_i) - \omega (Q'_i) =\sum \nolimits _{j=1}^{q} (\log |Q_{i,j} \cup Q'_{i,j} | - \log |Q'_{i,j} | ) \end{aligned}$$
(5)

In the following, \(Q_G\) denotes a vector of flow sets covered by all sketches after the SCK algorithm. In the \(l^{th}\) iteration of SCK, \(G'_l\) denotes the vector of flow sets covered by all sketches, and the incremental profit is denoted as \(X'_l\). Obviously \(X'_l =\omega (G'_l \backslash \biguplus ^{l-1}_{i=1}G'_i)\). For simplicity, the optimal solution for SCP is denoted as OPT.

Lemma 3

The SCK algorithm can achieve the approximation ratio 1/3 for the SCP problem.

Proof

Let \(\alpha \) be the approximation ratio of the greedy algorithm for 0–1 knapsack. Consider the moment when the SCK algorithm has executed l-1 iterations. In the \(l^{th}\) iteration, the algorithm chooses the switch \(v_{l'}\). Assume that the optimal solution selects a vector of covered flow sets, denoted as \(O_l\), from switch \(v_{l'}\). If we chose \(O_l\) instead of \(G'_l\) in this iteration, the incremental profit would be \(\omega (O_l \backslash \biguplus \nolimits ^{l-1}_{i=1}G'_i)\), denoted as \({X''}_l\). We have \(X'_l \ge \alpha \cdot X''_l = \alpha \cdot \omega (O_l \backslash \biguplus \nolimits ^{l-1}_{i=1}G'_i) \ge \alpha \cdot \omega (O_l \backslash Q_G)\). It follows that

$$\begin{aligned} \omega (Q_G)&=\sum \nolimits _{l=1}^{n} X'_l\ge \sum \nolimits _{l=1}^{n} \alpha \cdot \omega (O_l \backslash Q_G) = \alpha \cdot \sum \nolimits _{l=1}^{n} \omega (O_l \backslash Q_G) \\&= \alpha \cdot \sum \nolimits _{l=1}^{n} [\omega (O_l \uplus Q_G) - \omega (Q_G)] \ge \alpha \cdot [\omega (\biguplus \nolimits ^{n}_{l=1}O_l \uplus Q_G) - \omega (Q_G)] \\&= \alpha \cdot [\omega (OPT \uplus Q_G) - \omega (Q_G)] \ge \alpha \cdot [ \omega (OPT)-\omega (Q_G)] \end{aligned}$$

Thus, we have

$$\begin{aligned} \omega (Q_G) \ge \frac{\alpha }{1 + \alpha } \cdot \omega (OPT) \end{aligned}$$
(6)

Since the greedy method achieves the approximation ratio 1/2 for 0–1 knapsack [27], by Eq. (6), the SCK algorithm can achieve the approximation ratio 1/3 for SCP.
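For completeness, the rearrangement that leads from the last inequality of the chain to Eq. (6), and the resulting constant for \(\alpha = 1/2\), can be spelled out as follows.

$$\begin{aligned} \omega (Q_G) \ge \alpha \cdot [\omega (OPT) - \omega (Q_G)]&\ \Longrightarrow \ (1 + \alpha ) \cdot \omega (Q_G) \ge \alpha \cdot \omega (OPT) \ \Longrightarrow \ \omega (Q_G) \ge \frac{\alpha }{1 + \alpha } \cdot \omega (OPT), \\ \alpha = \frac{1}{2}&\ \Longrightarrow \ \frac{\alpha }{1 + \alpha } = \frac{1/2}{3/2} = \frac{1}{3}. \end{aligned}$$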

4 Performance Evaluation

In this section, we first introduce the metrics and benchmarks for performance comparison (Sect. 4.1). We then evaluate our proposed algorithm by comparing with the random method through simulations (Sect. 4.2).

4.1 Performance Metrics and Benchmarks

In this paper, we aim to measure more flows with proportional fairness, which benefits different applications, e.g., traffic engineering, under the constraint of CPU processing capacity through efficient sketch configuration. Thus, we adopt the following metric in our numerical evaluations.

  1. Flow cover ratio (FCR). The controller computes the number of flows covered by each sketch. The flow cover ratio is defined as the number of covered flows divided by the number of all flows in the network. We denote the flow cover ratio of each sketch as \(\theta _j = \beta _j / m\), where \(\beta _j\) is the number of flows covered by sketch \(s_j\) and m is the number of flows in the network. In this paper, we report three statistics of the FCR over all sketches: the average, the maximum, and the minimum. For example, the average flow cover ratio is \(\overline{\theta } = \sum \nolimits _{j=1}^{q} \theta _j / q \).
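The metric can be computed directly from the per-sketch coverage counts. The sketch below uses illustrative names and made-up counts.

```python
def flow_cover_ratios(beta, m):
    """theta_j = beta_j / m for each sketch j."""
    return {s: b / m for s, b in beta.items()}


def summarize(theta):
    """Average, maximum, and minimum FCR over all sketches."""
    values = list(theta.values())
    return {
        "avg": sum(values) / len(values),
        "max": max(values),
        "min": min(values),
    }


theta = flow_cover_ratios({"s1": 3, "s2": 1}, m=4)   # hypothetical counts
print(theta, summarize(theta))
```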

The work in [29] considers the memory allocation of multiple sketches for accurate traffic measurement. We apply it to the allocation of CPU resources among multiple sketches instead of memory. Specifically, the controller chooses the sketches configured on each switch so as to cover as many flows as possible under the CPU processing capacity constraint. We refer to this benchmark as SCREAM; its details can be found in [29, 30]. Besides, we compare the proposed SCK algorithm with the random solution, denoted as RND, through both simulations and prototype experiments. Specifically, the controller randomly chooses the sketches configured on each switch under the CPU processing capacity constraint.

4.2 Simulation Evaluation

Simulation Setting. In the simulations, as running examples, we select two practical and typical topologies: one for campus networks and the other for data center networks. The first topology, denoted as (a), contains 100 switches, 200 servers and 397 links from [31]. The second one is the fat-tree topology [32], denoted as (b), which has been widely used in many data center networks. The fat-tree topology has in total 80 switches (including 16 core switches, 32 aggregation switches, and 32 edge switches) and 192 servers. To observe the impact of different traffic traces on the measurement performance, we adopt two types of traffic traces. One is the 2–8 distribution. Specifically, the authors of [12] have shown that less than 20% of the top-ranked flows may be responsible for more than 80% of the total traffic. The other is that the traffic size of each flow follows the Gaussian distribution. We adopt six kinds of sketches, Count-Min, CountSketch, Bloom Filter, Cold Filter, MRAC and k-ary, respectively, in our simulation. The sketch computing overhead per packet is listed in Table 1. By observing the configuration of some commodity SDN switches, e.g., H3C and Pica8, the switch’s CPU capacity is set as 3 GHz, and we assume that at most 50% CPU capacity will be allocated for traffic measurement. We execute each simulation 100 times, and take the average of the numerical results.

Simulation Results. We run two groups of simulations to check the effectiveness of the SCK algorithm. The first group of four simulations observes the FCR while the CPU capacity constraint changes from 0.5 GHz to 1.5 GHz. We generate 30K flows by default in the network. Figures 2 and 3 show that the average flow cover ratio increases as the CPU capacity constraint is relaxed, for all algorithms. Our SCK algorithm significantly improves the flow cover ratio compared with SCREAM and RND. Specifically, given a CPU capacity constraint of 1 GHz, the average FCR of our SCK algorithm is 0.89 in Fig. 2, while SCREAM and RND achieve average FCRs of only about 0.67 and 0.48, respectively. That is, our proposed algorithm improves the average flow cover ratio by 33% and 85% compared with SCREAM and RND, respectively. Moreover, the average FCR of SCK stays close to the optimal result as the CPU capacity increases. For example, the average FCR of SCK is about 96% of that of the optimal result under the CPU capacity constraint of 1 GHz in Fig. 2.
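The relative improvements quoted above follow from the average FCR values reported for Fig. 2 at 1 GHz:

$$\begin{aligned} \frac{0.89 - 0.67}{0.67} \approx 0.33, \qquad \frac{0.89 - 0.48}{0.48} \approx 0.85 \end{aligned}$$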

Fig. 2. Flow cover ratio vs. CPU capacity constraint under 2–8 distribution with topology (a)

Fig. 3. Flow cover ratio vs. CPU capacity constraint under 2–8 distribution with topology (b)

Fig. 4. Flow cover ratio vs. CPU capacity constraint under Gaussian distribution with topology (a)

Fig. 5. Flow cover ratio vs. CPU capacity constraint under Gaussian distribution with topology (b)

Fig. 6. Flow cover ratio vs. number of flows under 2–8 distribution with topology (a)

Fig. 7. Flow cover ratio vs. number of flows under 2–8 distribution with topology (b)

Fig. 8. Flow cover ratio vs. number of flows under Gaussian distribution with topology (a)

Fig. 9. Flow cover ratio vs. number of flows under Gaussian distribution with topology (b)

Figures 4 and 5 plot the FCR under the Gaussian traffic distribution. The trend of these curves is similar to that of the curves in Figs. 2 and 3. Under the CPU capacity constraint of 1 GHz, our SCK algorithm improves the average FCR by 30% compared with SCREAM and by 51%–89% compared with RND in Figs. 4 and 5. Moreover, the average FCR of SCK stays close to the optimal result as the CPU capacity increases. For example, the average FCR of SCK is about 90% of that of the optimal result under the CPU capacity constraint of 1 GHz.

In the second group of simulations, we observe the FCR while the number of flows changes from 10K to 50K. By default, we set the CPU capacity constraint to 0.8 GHz. Figures 6, 7, 8 and 9 show that the average flow cover ratio decreases as the number of flows in the network grows. Specifically, when there are 30K flows in the network, the average FCR of SCK is 0.72 by the left plot and 0.85 by the right plot in Fig. 7. Moreover, SCK improves the average FCR by 30%–78% compared with SCREAM and RND in Figs. 6 and 7. Besides, the average FCR of SCK is at least 79% of that of the optimal result as the number of flows increases in Figs. 6 and 7. That is, SCK achieves performance similar to the optimal result.

From the simulation results in Figs. 2, 3, 4, 5, 6, 7, 8 and 9, we can draw the following conclusions. First, our algorithm achieves measurement performance similar to the optimal result. For example, the average FCR of SCK is about 96% of that of the optimal result under the CPU capacity constraint of 1 GHz in Figs. 2 and 3. Second, our SCK algorithm achieves a better flow cover ratio than SCREAM and RND. For example, our SCK algorithm improves the average flow cover ratio by 30%–91% compared with SCREAM and RND in Fig. 2. In SCREAM [29, 30], the flows covered by the applied sketches should meet the requirements of one kind of sketch, in a certain order, before the next kind is chosen, which leads to a large gap in the number of flows covered by the different types of sketches. Thus, our SCK algorithm performs better than SCREAM and RND on proportional fairness.

5 Conclusion

In this paper, we studied how to perform optimal sketch configuration on switches for proportional fairness. We proposed the SCP problem, and designed an efficient algorithm with approximation ratio 1/3 for this problem. We implemented the proposed algorithm on our SDN platform, and the simulation results showed high efficiency of our proposed algorithm. In the future, we will study the trade-off between sketch’s memory cost, measurement accuracy and computing cost for more practical designs.