1 Introduction

Nowadays, we are facing a huge increase in connected intelligent Internet of Things (IoT) devices spread across wide geographical areas, along with a growing demand for efficient, cutting-edge pervasive services and applications (according to the Gartner prediction [1], there will be 26 billion IoT devices, excluding PCs, tablets, and smartphones, by 2020). However, despite the rapid advances in IoT-related technologies, the major bottleneck for IoT applications is the limited computing resources available in each IoT device, including CPU, storage, etc. The situation is further exacerbated by the requirements of many IoT applications, which demand intensive use of resources, data analysis, and application processing in real time.

A common way to overcome the limited computing resources of IoT devices in various applications is to offload some tasks to systems with sufficient computing resources, i.e., Cloud servers. The offloading approach improves the performance of IoT applications and reduces their energy consumption, but it often introduces additional overhead, which can increase the latency of IoT applications and network congestion, especially for applications involving resource-intensive operations.

These disadvantages are the main factors motivating the recent development of Edge Computing. Edge Computing exploits devices with larger computational power, such as cloudlets or computing-enabled switches, placed at the edge of the network of connected IoT devices, in order to support IoT applications by offloading some tasks, thus decreasing latency and increasing the available bandwidth. In this panorama, a promising new paradigm to dynamically extend Edge/IoT resources towards the Cloud (and vice versa) is Osmotic Computing [2, 3]. Borrowing the main osmotic concepts from chemistry, Osmotic Computing assumes that each application can be split into many dynamic MicroELements (MELs) [4] that can be executed across different computing systems. Such MELs (in particular, microservices and microdata) are connected in complex graphs according to the specific requirements of the application to deploy, and run on IoT devices, Edge micro-datacenters and Cloud datacenters. Since MELs related to the same application need to be connected, Osmotic Computing puts significant stress on network management, driving the need for new networking paradigms. The need for dynamic networking resources, together with the trend of building networks without vendor dependence, led to the creation of software solutions for network management, such as Software Defined Networks (SDNs), which allow rapid and automatic configuration of network traffic and routing by abstracting network resources. The ability to dynamically define the behavior of a network via software gives the Osmotic Computing orchestrator the flexibility to move MELs across IoT/Edge/Cloud nodes according to different types of requirements at the application layer or at the lower layers.

Fig. 1 SDN that interconnects MELs deployed over heterogeneous computing infrastructures in an Osmotic scenario

In this paper, we propose a new solution, based on SDNs, to implement networking functionalities for Osmotic Computing and interconnect MELs deployed over heterogeneous computing infrastructures in the Osmotic Computing scenario (see Fig. 1). In particular, we investigate how to optimize the estimation of routing paths in a dynamic SDN, where links can change frequently. To select paths, we use the well-known Dijkstra's algorithm, considering different QoS metrics for path evaluation. Indeed, IoT applications are characterized by different possible constraints (e.g., delay, jitter, energy consumption), just as IoT/Edge/Cloud nodes are characterized by different capacities and properties. Our modified version of Dijkstra's algorithm is able to configure routing paths according to all these requirements.

Due to the high complexity of the reference scenario, in terms of both scalability and heterogeneity of network components, it is hard to provide quick management of routing information, especially in a dynamic osmotic ecosystem. As the number of connected devices increases, so does the amount of data to be processed; the use of a Big-Data processing framework therefore becomes necessary. The best-known Big-Data analysis frameworks are Hadoop [5] and Spark [6]. Several works in the literature, such as [7,8,9], have compared Hadoop and Spark performance. One of the major differences between Hadoop and Spark is the way they store and process data: by design, Spark stores and processes data in memory, and therefore needs more RAM than Hadoop, which stores data on the Hadoop Distributed File System (HDFS). Given the limited RAM capacity of real systems, the latter is more suitable for processing batch data, as in the proposed scenario. In this paper, we exploit the computation capacity of Hadoop to overcome the limitations of IoT and Edge devices in the estimation of paths for complex graphs of MELs. In particular, our main contribution is the implementation of a distributed version of Dijkstra's algorithm that uses Hadoop MapReduce tasks to calculate the best paths over an osmotic system, and to update them when the graphs change. The MapReduce model enables parallel computation of paths at the cost of task-management overhead.

In the paper, we provide an analysis of the actual advantages introduced by our solution. In particular, we perform a scalability analysis based on the number of interconnected MELs, evaluating the average execution time of the proposed solution against a traditional execution of Dijkstra's algorithm. Results are promising and show that the proposed approach performs faster than the traditional one when the number of MELs is high. The rest of the paper is organized as follows. Section 2 describes related work and how our paper advances the state of the art. Section 3 provides the motivation at the basis of our work. Section 4 summarises the basic technologies adopted in our work. The adopted model and its implementation are presented in Sects. 5 and 6, respectively. In Sect. 7, the performance of the proposed approach is evaluated. Finally, Sect. 8 concludes the work, providing highlights for the future.

2 Related Work

2.1 Mapping of Virtual Networks on Physical Networks

In recent years, several approaches have been proposed for the design and deployment of distributed resources in Cloud and Edge computing.

One of the major challenges regards the mapping of virtual networks onto physical network infrastructures, referred to as the virtual network embedding (VNE) problem. In this context, a virtual network mapping problem with survivability was formulated in [10] and solved by means of a heuristic global resource capacity (GRC) based survivable VNE (GRC-SVNE) algorithm built on Dijkstra's algorithm. Moreover, an alternative GRC-M algorithm, in combination with a multicommodity flow (MCF) algorithm, is discussed in [11].

A similar problem, mapping multiple virtual networks (VNs) with geographic location constraints onto a heterogeneous physical network, is also studied in [12], where the design is centered on the survivability and reliability of each VN request in edge-of-things based data centers. The proposed model simultaneously considers resource sharing and the mapping of VN links and nodes in edge-of-things computing.

The integration of software defined networking (SDN) with IoT, Edge and Cloud Computing for the dynamic distribution of IoT analytics and the efficient use of network resources is proposed in [13]. In particular, the authors present an experimental IoT-aware multilayer transport SDN and edge/cloud orchestration architecture that deploys IoT-traffic control and congestion avoidance mechanisms for the dynamic distribution of IoT processing to the edge of the network based on the actual network resource state. This contribution is complemented and extended by the work proposed in this paper: Osmotic Computing dynamically deploys microservice components on Edge/IoT/Cloud datacenters according to the actual network resource state, while SDNs interconnect them. Since, during the osmotic process, microservices are re-allocated over time according to specific (application/context) parameters, such as QoS, SDNs must be redefined accordingly.

2.2 Optimization of Resource Allocation on SDNs

In this work we propose a model that takes into account the relationship between the type of application (e.g., real-time vs. file transfer data) and the network parameters (e.g., delay, number of hops, energy consumption) to properly allocate resources according to the required QoS. The problem of QoS-aware resource allocation on SDNs has several variations according to the application domain. In the Industrial IoT environment, [14] reports an SDN-based Edge/Cloud interplay for big data streaming, handled by a multi-objective evolutionary algorithm using Tchebycheff decomposition for flow scheduling and routing. The telecommunications sector is particularly prolific regarding the dynamic allocation of resources. For instance, collaborative computation offloading for multi-access Edge computing over fiber-wireless networks is presented in [15], where an approximation collaborative computation offloading scheme and a game-theoretic collaborative computation offloading scheme are described. Mobile edge computing (MEC) is an emerging paradigm towards 5G communications, driven by the increasing demand for intensive computing services and the resource limitations of mobile devices at the edge of mobile networks. To cooperatively process MEC services, radio resources can be virtualized along with the computation and storage resources, as described in [16], where SDN is related to a cellular network and radio.

In the smart city domain, an integrated framework that enables dynamic orchestration of networking, caching, and computing resources to improve the performance of applications is discussed in [17]. In the smart mobility scenario, vehicle-to-vehicle (V2V) data offloading for cellular networks based on SDN inside a mobile Edge computing architecture is presented in [18], whilst a novel collaborative vehicular Edge computing framework is presented in [19].

Although resource allocation approaches are based on different optimization criteria, according to the analyzed problem, the current algorithms for resource allocation on SDNs are mainly based on Dijkstra's algorithm.

2.3 Applying Dijkstra’s Algorithm

The application of Dijkstra's algorithm in SDNs raises several challenges regarding reliability, capacity control and scalability, which have already been investigated in a number of papers. The limits of traditional hierarchical architecture design principles based on Dijkstra's algorithm, from the perspective of emerging Cloud/Edge computing systems, are highlighted in [20]. This is the reason why, in this work, we propose a distributed approach for processing Dijkstra's algorithm in the case of very dense networks.

The application of network virtualization in Fiber-Wireless (FiWi) networks, with the purpose of alleviating bandwidth tension when a physical link serving different virtual networks fails, is discussed in [21]. In particular, a shared protection mechanism is embedded within the Dijkstra routing algorithm in order to improve its reliability upon physical link failures. A reliable security-oriented SDN routing mechanism, named RouteGuardian, which considers the capabilities of SDN switch nodes combined with a Network Security Virtualization framework, is proposed in [22]. In particular, it overcomes the limits of traditional SDN routing mechanisms based on the Dijkstra shortest path in terms of capacity control, in order to prevent network congestion. A self-adjusting architecture based on pairing heaps to scale SDN networks, overcoming the scalability issues of a centralized control plane, is proposed in [23]. Using network function virtualization (NFV), the whole network is viewed as a huge heap that is repeatedly divided into several sub-heaps, until the basic units of physical switches in the network are obtained. In this context, Dijkstra's algorithm is applied and optimized based on the pairing heap, outperforming the original one when the network is dense. Dijkstra's algorithm has recently been used in many emerging SDN-based applications. In [24], an autonomous-agent-based shortest path load balancing using Dijkstra's algorithm is proposed to find the shortest paths to virtual machines when a Cloud service saturates its processing capabilities. A framework to process 3D shapes in the Web browser, considering Web3D technology areas in the era of "Internet plus", is discussed in [11]. This framework is based on mesh segmentation and a new Dijkstra-based mesh segmentation approach.

The application of SDN/OpenFlow to an Internet Protocol Television (IPTV) multicasting implementation is proposed in [25]. In this context, an important function of IPTV multicasting is the Join/Leave request of a client in a multicast group. In order to obtain efficient IPTV service routing, Dijkstra's and Prim's algorithms were used to comparatively calculate the minimum total edge weight. The development platform is based on the Mininet environment to emulate the system, and it consists of Open vSwitch and a POX controller. Experiments compare the transmission time of the first join/receive packet to a client when using Dijkstra's and Prim's algorithms. In [26], service function chaining (SFC) is modeled as a Shortest Path Tour Problem in order to find the minimum transmission cost path by exploiting a constructed multistage graph. In particular, the minimum transmission cost paths for multiple SFC classes are derived using Dijkstra's shortest path algorithm with resource constraints in a flexible way.

To the best of our knowledge, no scientific contribution has addressed the use of Dijkstra's algorithm to allocate resources (specifically microservices) on IoT/Edge micro-datacenters/Cloud datacenters in ultra-dense, dynamically changing networks, as in the case of Osmotic Computing. In this scenario, where the network topology is very dynamic, Dijkstra's algorithm can be applied to maintain an updated shortest path tree for each network (graph) node. This task is computationally intensive; therefore, a MapReduce approach is required in order to process a graph with a huge number of virtual network nodes when assessing the best path. We present a novel approach to route packets over an SDN, which uses Hadoop to compute the latency-based shortest path using Dijkstra's algorithm. Our objective is to minimize the shortest-routing-path computation time, as well as to ensure a global view and a high level of dynamism of our network topology.

3 Motivations

Osmotic Computing [2] is a new computing paradigm that automates the deployment and runtime management of MELs across IoT devices, Edge and Cloud nodes. It takes into account issues at different layers (e.g., application, network, virtual infrastructure, physical node) in order to optimize the performance of IoT applications deployed in the system as a composition of MELs. This new self-configuring approach to application management implies several challenges that need to be addressed [3].

MELs cooperate to provide high-level IoT application features. Considering the very limited size of MELs compared to the potential complexity of IoT applications, and that such MELs could be deployed on resource-constrained nodes, such as IoT devices, the number of interconnected MELs for each application scenario can grow considerably. For example, let us consider the well-known IoT application of video processing [27, 28], where the data to be processed can be split into several microdata and, at the same time, different processing tasks can be performed exploiting several microservices. In [29], for example, the camera produces 60 frames per second, each video frame of \(1920\times 1080\) pixels is split into 9 images of \(28\times 28\) pixels, and then video transcoding is performed. On each such microdata component, image-processing microservices are executed. In a more complex scenario (e.g., video surveillance), several smart cameras can be deployed, with a significant increase in the amount of microdata and running microservices. Thus, the number of networked nodes becomes very high, and the management of network issues becomes a key feature of Osmotic Computing.

Even if the MEL graph depends on the application design and may persist over time, the deployment of MELs in the osmotic system, and hence the communication links among MELs, can change over time according to the availability of resources at IoT, Edge and Cloud nodes, and to application requirements. MELs can be migrated across different infrastructures according to different osmosis migration strategies, as already explained. Defining new networking solutions for Osmotic Computing from scratch is very hard, due to the complexity of the hardware and software components involved in the system. Thus, we decided to design a new solution exploiting consolidated technologies and algorithms, making our approach modular and robust against possible operation faults. In the design of our solution, we address the following requirements:

  • Osmotic Computing dynamically reorganizes MELs across the system. Thus, it is necessary to set up a software-oriented approach to interconnect MELs, in order to increase feasibility and flexibility; to satisfy this requirement, SDNs are used in our solution.

  • Each SDN interconnects MELs related to the same application, and, for different applications, different and isolated SDNs are set up. Nevertheless, as discussed in the above example on video surveillance, each SDN can be very large, composed of thousands of nodes. This means that SDNs must implement scalable strategies for network management. Cloud computing is the current dominant technology for scalable computing tasks, and it is natively included in our osmotic scenario. Moreover, Cloud technologies also support the scalable networking functionalities of SDNs.

  • SDNs connect MELs hosted in networked nodes. These nodes have the same role in the system and can be linked as a dynamic and distributed mesh network, where the topology changes as an effect of the osmotic reconfiguration of resources. Communication among nodes can be managed according to routing algorithms. In this paper, we adopt routing strategies based on the well-known Dijkstra's algorithm, because it works on weighted graphs and therefore allows us to implement QoS networking strategies (as described in Sect. 6.1).

  • As the number of connected devices increases, so does the amount of data to be processed, making a Big-Data processing framework necessary. As discussed in Sect. 1, we adopt Hadoop [5] rather than Spark [6] because its HDFS-based storage makes it more suitable than Spark's in-memory model for processing batch data on systems with limited RAM [7,8,9]. In particular, our contribution in this paper is the implementation of a distributed version of Dijkstra's algorithm that uses Hadoop MapReduce tasks to calculate the best paths over such a complex system (Fig. 2).

Fig. 2 Osmotic computing ecosystem

4 Key Technologies

In this section we provide some background on the basic technologies adopted in our solution.

4.1 Software Defined Network

Software defined networking (SDN) was conceived at UC Berkeley and Stanford University in 2008. SDN is an emerging network architecture in which the network control is decoupled from forwarding and is directly programmable. Migrating the control logic, traditionally tied to individual network devices, to accessible IT devices makes it possible to abstract the underlying infrastructure layer from applications and services, and to ensure a global vision of the entire network. Network intelligence is centralized in a purely software-based SDN controller that maintains a global view of the network. As a result, the network appears to applications and policy mechanisms as a single logical switch. Therefore, network operators can write high-level control programs to specify the behavior of an entire network. Moreover, centralized control allows the definition of more specific and complex tasks that may involve many network functionalities, e.g., security, resource management and control, within a single framework. A mechanism that allows the control plane to communicate with the data plane is OpenFlow [30]. OpenFlow, developed at Stanford University around 2008, is one of the first open protocols providing a communication plane between the control plane device (the controller) and the data plane devices (switches and routers), so that traffic can be monitored and managed according to specific necessities. In particular, an OpenFlow switch consists of one or more flow tables and/or group tables. An OpenFlow controller can update, add or delete flow entries in the flow tables in a reactive or proactive way. Thus, the OpenFlow protocol makes the deployment of innovative routing and switching protocols easy. Furthermore, it can be used for applications such as virtual machine mobility, high-security networks and next-generation IP-based mobile networks.

4.2 Mininet

The prototyping environments currently available have their pros and cons. Simulators, like ns-2 [31] or Opnet [32], are attractive because they can run on laptops, but they are far from reality: the code developed in the simulator differs from the one that will be used in real networks, and simulators are not interactive. At first glance, a network of virtual machines seems the most interesting solution: a real topology can be abstracted by using a virtual machine (VM) for each host, a VM for each switch, and virtual interfaces. However, virtual machines are heavy, and the memory overhead of each VM limits the scalability in terms of switches and hosts. Another desirable characteristic, mentioned above, is the possibility of moving a prototype onto a real infrastructure once the experiment is complete. Thanks to Mininet [33], we can implement a new network functionality or a new architecture, test it on a fairly extensive topology (applying traffic), and then use the same code on a real network. Mininet is an open-source network emulator that creates scalable SDN networks using lightweight virtualization mechanisms. It is one of the most popular tools in the SDN research community, allowing the rapid creation and management of networks of virtual hosts, switches, controllers and links on a single Linux kernel. This emulated environment can be executed in a VM (e.g., VirtualBox or VMware) or directly on a native Linux distribution. Moreover, Mininet allows the creation of many custom topologies, and can emulate link parameters of a real Ethernet interface, e.g., link speed, packet loss, and delay. Thus, by means of Mininet and lightweight virtualization, a single system can act as a complex infrastructure. It is possible to create an emulated network that reproduces a hardware network, or a hardware network that resembles a Mininet network, and run the same binary code and applications on either platform. Although other platforms use the same technique, Mininet differs from them in its orientation towards SDN solutions. A host generated with Mininet, through process-based virtualization, behaves like a real machine: it can be accessed remotely through the SSH protocol, and any type of application can be launched on it. This gives the impression of observing packets sent through real Ethernet interfaces, characterized by a certain transmission speed and a certain delay, processed by real network devices, be they Ethernet switches, routers or middleboxes. Finally, custom network topologies can be created using the extensible Python Mininet API, in order to create, interact with, customize and share an SDN network that will work on real hardware without major changes. Mininet can also work with several different SDN controllers, e.g., Floodlight [34].
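As an illustration of this workflow, the following minimal Python sketch (not the topology used in our experiments; host/switch names, link parameters and the controller address are purely illustrative) builds a small topology with shaped links via the Mininet API and attaches it to an external controller such as Floodlight:

```python
# A minimal sketch of a custom Mininet topology with emulated link
# parameters (bandwidth, delay, loss); all names and values are examples.
from mininet.topo import Topo
from mininet.net import Mininet
from mininet.link import TCLink
from mininet.node import RemoteController

class TwoSwitchTopo(Topo):
    def build(self):
        h1, h2 = self.addHost('h1'), self.addHost('h2')
        s1, s2 = self.addSwitch('s1'), self.addSwitch('s2')
        # TCLink lets us shape each link like a real Ethernet interface
        self.addLink(h1, s1, bw=10, delay='5ms', loss=1)
        self.addLink(s1, s2, bw=100, delay='1ms')
        self.addLink(s2, h2, bw=10, delay='5ms')

if __name__ == '__main__':
    # Attach the emulated network to an external (e.g., Floodlight) controller
    net = Mininet(topo=TwoSwitchTopo(), link=TCLink,
                  controller=lambda name: RemoteController(
                      name, ip='127.0.0.1', port=6653))
    net.start()
    net.pingAll()   # quick connectivity check
    net.stop()
```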

4.3 Floodlight

In the SDN world, the presence of the controller, the "brain" of the entire ecosystem, is fundamental. Its task is to listen to the requests coming from the network switches, execute its logic and applications, and respond to the switches with the instructions they must perform, i.e., updating the flow tables used for routing packets. Many SDN controllers have been developed since the emergence of SDN, e.g., Floodlight [34], OpenDaylight [35] or FlowVisor [36]. One of the most widespread OpenFlow controllers is Floodlight, a Java-based open-source software built on the Beacon controller implementation developed at Stanford University, which works with physical and virtual OpenFlow switches. The Floodlight controller realizes a set of common functionalities to control and query an OpenFlow network, allows the rapid development of SDN network control software with minimal dependencies, and supports a broad range of virtual and physical OpenFlow switches. By extending the Floodlight APIs, we are able to implement our OpenFlow controller and test it with the prototyped network. In the literature, some publications have analyzed Floodlight overload [37,38,39]. In this paper, we do not evaluate the controller's performance; we focus instead on the adoption of a Floodlight controller and on how it allows us to reduce the computation time of Dijkstra's algorithm on a highly dense network represented by a graph. In fact, we performed a scalability analysis in which we evaluated and compared the computation time of both the sequential and the MapReduce implementations.
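For completeness, the sketch below shows one common way to interact with Floodlight from outside the controller, namely pushing a proactive flow entry through its static entry pusher REST interface; the endpoint and field names follow the Floodlight 1.2 documentation, while the DPID, ports and match fields are merely examples:

```python
# A hedged sketch: installing a flow entry through Floodlight's static
# entry pusher REST interface (Floodlight 1.2; values are illustrative).
import json
import requests

CONTROLLER = 'http://127.0.0.1:8080'

flow = {
    'switch': '00:00:00:00:00:00:00:01',  # DPID of the target switch
    'name': 'qos-path-hop-1',
    'priority': '32768',
    'in_port': '1',
    'eth_type': '0x0800',                  # match IPv4 traffic
    'active': 'true',
    'actions': 'output=2'                  # forward along the computed path
}

resp = requests.post(CONTROLLER + '/wm/staticflowpusher/json',
                     data=json.dumps(flow))
print(resp.status_code, resp.text)
```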

4.4 Hadoop MapReduce

Hadoop [5] is a big-data application platform developed by the Apache Software Foundation and inspired by Google's MapReduce system and the Google File System (GFS). It is written in Java and allows the execution of applications that manage and process intensive distributed data. Architecturally, in a MapReduce system there are two types of nodes that control job execution: a JobTracker and several TaskTrackers. The master node is called the JobTracker, while the worker nodes (Mappers and Reducers) are TaskTrackers. The JobTracker coordinates all jobs by scheduling tasks to its TaskTrackers. The TaskTrackers perform the tasks and periodically send status reports to the JobTracker. If a task fails, the JobTracker reschedules it on another TaskTracker. When a MapReduce job is invoked by a user, the JobTracker divides the job into a set of tasks and assigns them to TaskTrackers so as to compute the job in parallel.

4.5 Ganglia

Ganglia [40] is an open-source, distributed and scalable monitoring system used for large clusters. It operates by gathering, aggregating and providing various metrics such as CPU usage, memory usage, storage, network usage, and so on. Ganglia has three main components:

  1. Gmond (Ganglia monitoring daemon): runs on each node of the cluster to be monitored and collects local monitoring metrics.

  2. Gmetad (Ganglia meta daemon): polls and collects metrics from the gmond daemons, storing them using storage engines such as RRD.

  3. Ganglia PHP web frontend: collects all the metrics from the gmetad daemon and displays them on dynamic HTML pages as real-time graphs.

5 The QoS-Aware Model

The purpose of this section is to describe in depth our proposed QoS model for SDN networks. The QoS-aware model is the core of our proposed system: it assigns a flow to a specific path according to application requirements. In particular, the model takes into account the relationship between the type of application (e.g., real-time vs. file transfer data) and the network parameters (e.g., delay, number of hops, energy consumption).

To reach this goal, it is very important to continuously supervise the network conditions, make the right decisions, and manage the devices accordingly. The model is also able to find the best path (or more than one best path, in the case of multiple flows) that satisfies the requirements of the application flows.

5.1 Taxonomy

We now define the mathematical notation for our model. From an SDN topology we can derive a directed, connected and weighted graph \(G = (V,E)\), representing the network infrastructure under observation, where V is the set of nodes and E is the set of arcs between pairs of nodes.

Figure 3 shows an example of a real/virtual network represented as a weighted, directed and connected graph. Vertexes represent network devices, edges represent network links, and edge weights represent different network parameters. The arcs in E are bi-directional, and each has an associated weight expressed in terms of latency (\({\textit{lat}}_{ij}\)) and energy consumption (\(e_{ij}\)). The flow is routed on the graph from the source node s to the destination node t. For each flow, three parameters are given:

  1. the source node s;

  2. the destination node t;

  3. the application flow requirements: \(\alpha\) and \(\beta\).

The flow equation is summarized in the following:

$$\begin{aligned} f_{ij} = \alpha {{\textit{lat}}_{ij}} +\beta {e_{ij}},\quad \forall (i, j)\in E \end{aligned}$$

where \(\alpha\) and \(\beta\) are scale factors. This computation weights the cost according to the importance of delay and energy consumption for a particular flow. Thus, we can manage these parameters according to the requirements of the application type. For example, multimedia applications have end-to-end delay requirements, so we can set \(\alpha = 1\) and \(\beta = 0\) in order to optimize only the delay. A latency-oriented application requires a virtual path with the lowest latency between each source microservice and each destination microservice deployed in the Cloud, Edge or IoT layers. Allocated virtual paths will be updated periodically as the underlying physical network changes, in order to guarantee a given latency requirement. Considering the many factors influencing connection properties, network changes are increasingly varied and unpredictable, because Cloud/Edge and IoT networking scenarios are very complex. This complexity is due to the fact that virtual paths that directly connect two nodes actually pass through tunnels and/or overlay networks, built on different physical network devices placed in the Cloud, Edge and IoT layers, which can change frequently.
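To make the use of this weighting concrete, the following Python sketch (an illustration, not our actual module) collapses each arc's (latency, energy) pair into the single cost \(f_{ij}\) and runs Dijkstra's algorithm on a toy graph for a latency-sensitive flow (\(\alpha = 1\), \(\beta = 0\)):

```python
# A minimal sketch of the QoS-weighted path selection: each directed arc
# carries (latency, energy), collapsed into f_ij = alpha*lat + beta*e
# before running Dijkstra's algorithm. Graph and numbers are illustrative.
import heapq

def dijkstra(adj, src, dst, alpha, beta):
    """adj: {node: [(neighbour, latency, energy), ...]} -> (path, cost)."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float('inf')):
            continue                                  # stale queue entry
        for v, lat, energy in adj.get(u, []):
            cost = alpha * lat + beta * energy        # f_ij
            if d + cost < dist.get(v, float('inf')):
                dist[v] = d + cost
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    path, node = [dst], dst
    while node in prev:
        node = prev[node]
        path.append(node)
    return path[::-1], dist.get(dst)

# latency-sensitive flow (alpha = 1, beta = 0) on a toy 4-node graph
adj = {1: [(2, 2.0, 5.0), (3, 4.0, 1.0)],
       2: [(4, 2.0, 5.0)],
       3: [(4, 1.0, 1.0)],
       4: []}
print(dijkstra(adj, 1, 4, alpha=1.0, beta=0.0))   # ([1, 2, 4], 4.0)
```

Setting \(\alpha = 0\), \(\beta = 1\) on the same graph would instead select the energy-cheapest path through node 3.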

Notice that we assume that there exists no pair of flows with the same origin and destination.

First of all, a correspondence is established between the nodes of the graph and the OpenFlow switches within the network. In other words, the generated "dummy" graph will contain as many nodes as there are switches in the SDN network. Subsequently, each link between two switches of the real (or virtual) topology is represented by one or two arcs joining the corresponding nodes of the graph.

Table 1 Configuration taxonomy

Referring to Table 1, a Full Duplex connection between port a of switch \(s_{1}\) and port b of switch \(s_{2}\), characterized by certain network parameters, can be considered as a pair of Simplex connections. Thus, a one-way link corresponds to a single arc, oriented from the source node to the destination node, while a bidirectional link is mapped into a pair of arcs, one for each direction in which information can flow. We recall that the networks emulated with Mininet have exclusively Full Duplex links. The QoS-aware model uses the shortest path calculated with Dijkstra's algorithm, fulfilling a set of constraints.
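The following toy snippet illustrates the convention of Table 1 (the link values are made up): each Full Duplex link is expanded into a pair of Simplex arcs before the graph is handed to Dijkstra's algorithm:

```python
# Expanding Full Duplex links into pairs of directed Simplex arcs,
# one per direction, as per Table 1 (illustrative values).
def expand_links(full_duplex_links):
    """full_duplex_links: [(s1, s2, latency, energy), ...] -> directed arcs."""
    arcs = {}
    for u, v, lat, energy in full_duplex_links:
        arcs[(u, v)] = (lat, energy)   # u -> v
        arcs[(v, u)] = (lat, energy)   # v -> u
    return arcs

print(expand_links([('s1', 's2', 2.0, 5.0)]))
# {('s1', 's2'): (2.0, 5.0), ('s2', 's1'): (2.0, 5.0)}
```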

Fig. 3 Representation of an SDN topology through a weighted, directed and connected graph

6 Design Overview

Our proposed framework aims to ensure QoS in SDN environments deployed over heterogeneous infrastructures (e.g., Cloud or Edge), as shown in Fig. 4. It continuously monitors the system and gathers information on nodes (e.g., information on microservices) and network parameters to improve awareness of the network status. Then, an algorithm calculates the best path among nodes, allowing microservices (and hence IoT applications) to exchange data in an efficient way. This is achieved by adopting Dijkstra's algorithm.

Fig. 4 SDN architecture of the proposed system

Our proposed framework is architecturally structured in the following logical modules:

  • Network Map Module: collects information about the real SDN network topology (network links and switches) and maps it into a graph with the corresponding nodes and links.

  • Network Analyzer: continuously collects information about any changes in the network structure, e.g., new nodes joining the network, or latency and energy consumption variations. This data is updated in the Topology module.

  • QoS Path Compute Module: implements the mathematical model for finding the best path according to both flow requirements and network status. It periodically calls the \({\textit{isQoSEnabled}}\) module to verify whether the QoS policy is enabled in the Floodlight controller. If it is not enabled, Dijkstra's algorithm processes the resulting graph with unit weights, minimizing the number of hops. Otherwise, Dijkstra's algorithm runs according to the application requirements in terms of latency/energy consumption minimization.

  • Flow Statistics Collection Module: gathers information on all incoming flows.

  • Flow Classifier Module: identifies the QoS requirements of each application flow, e.g., latency or energy consumption. According to the application requirements, the \(\alpha\) and \(\beta\) parameters are initialized: for latency minimization \(\alpha = 1\) and \(\beta = 0\), while for energy consumption minimization \(\alpha = 0\) and \(\beta = 1\) (a minimal sketch of this mapping follows the list).
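A toy sketch of the classifier mapping is reported below (our illustration, not the actual module code; the application class names are examples):

```python
# A toy illustration of the Flow Classifier: each recognized application
# class is mapped onto the (alpha, beta) scale factors consumed by the
# QoS Path Compute module. Class names and defaults are assumptions.
QOS_PROFILES = {
    'multimedia':    (1.0, 0.0),   # latency minimization
    'bulk-transfer': (0.0, 1.0),   # energy-consumption minimization
}

def classify(app_class, qos_enabled):
    if not qos_enabled:
        return None                 # unit weights: plain hop-count Dijkstra
    return QOS_PROFILES.get(app_class, (1.0, 0.0))  # default to latency

print(classify('multimedia', qos_enabled=True))     # (1.0, 0.0)
print(classify('bulk-transfer', qos_enabled=True))  # (0.0, 1.0)
```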

6.1 QoS Support with Dijkstra

This section evaluates the effectiveness of our control application, which enables the enforcement of differentiated routing policies. In particular, the validation step analyzes the behavior of the QoS-aware control framework in a typical Osmotic Computing ecosystem with heterogeneous applications. According to the scenario to which an application belongs, different QoS policies are required, e.g., latency-sensitive or energy-consumption-sensitive.

6.1.1 Scenario 1: Low Latency

This scenario is characterized by a network topology of 11 OpenFlow-enabled switches distributed across Osmotic Computing nodes. The network topology is deployed on each node through Mininet and consists of 11 switches and 7 hosts; the overall network is structured as shown in Fig. 5. Considering node 1 as source and node 11 as destination, the optimal path minimizing the end-to-end latency (highlighted in green) goes through nodes 2, 4 and 10, as can be seen in Fig. 5.

Fig. 5 Network topology showing the shortest route under single-objective Dijkstra's algorithm (latency optimization)

6.1.2 Scenario 2: Low Energy

In this scenario, the goal is to minimize the energy consumption of packet transmission. The overall Mininet network topology is composed of 11 switches and 7 hosts; Fig. 6 shows a high-level representation. Again considering node 1 as source and node 11 as destination, the optimal path minimizing the energy consumption goes through nodes 3, 4 and 10, as shown in Fig. 6.

Fig. 6 Network topology showing the shortest route under single-objective Dijkstra's algorithm (energy consumption optimization)

Table 2 summarizes the routing tables of the scenarios presented above.

Table 2 Paths resulting from the experimentation scenarios, applying Dijkstra's algorithm to optimize delay or energy consumption, and related values

6.2 MapReduce Dijkstra’s Algorithm Approach

Dijkstra's algorithm is very useful for deriving the best routing path from a specific source node s to a destination node t in an SDN environment, fulfilling a set of constraints (e.g., hops, latency and energy consumption) specified by the application requirements in Cloud/Edge and IoT scenarios. These scenarios involve highly dynamic and complex network topologies; hence, the resulting graph is very large.

Therefore, a single computational server may not be able to process such a graph, and it becomes necessary to adopt the MapReduce programming model. MapReduce allows the processing of large datasets: when an input dataset is processed by a MapReduce job, depending on its size it is split into many independent smaller datasets, which are committed to the map tasks. After the map phase is completed, the framework sorts the outputs of the map tasks and commits them to the reducers. The MapReduce process can run several times instead of only once. The advantage of MapReduce is that its tasks can run in a completely parallel manner.

Hadoop, described in Sect. 4.4, is the open-source, Java-based framework we use to support this processing model: the JobTracker divides each submitted job into a set of tasks and assigns them to the TaskTrackers, which process them in parallel and report their status back, with failed tasks being rescheduled on other TaskTrackers.

The key element of the sequential algorithm is the priority queue Q, which keeps a globally-sorted list of vertexes by current distance. Maintaining such a global structure is not possible in MapReduce, as the programming model does not provide a mechanism for exchanging global data. Therefore, we adopted a brute-force approach known as parallel breadth-first search. First of all, as a simplification, we assumed that all edges have unit weight. This assumption makes the algorithm easier to understand. The basic idea of the MapReduce Dijkstra algorithm is iterative: the distance of all vertexes connected directly to the source vertex is one; the distance of all vertexes directly connected to those is two; and so on.

Let us suppose that we want to compute the shortest path to a vertex n, and let M be the set of vertexes with an outgoing edge to n. The shortest path to n must go through one of the vertexes in M; therefore, we need to examine all \(m \in M\) to find the vertex \(m_{s}\) with the shortest distance. The shortest distance to n is then the distance to \(m_{s}\) plus one.

The pseudo-code of the parallel breadth-first search algorithm is provided in Listing 1 and Listing 2. As with the sequential Dijkstra algorithm, we consider a connected, directed graph represented as adjacency lists. The distance to each vertex is stored directly alongside the adjacency list of that vertex, and is initialized to \(\infty\) for every vertex \(v \in V\), except for the source vertex, whose distance to itself is zero. In the pseudo-code, n denotes the node id (i.e., an integer) and N denotes the node's corresponding adjacency list. Substantially, the algorithm works by mapping over all vertexes and emitting a key-value pair for each neighbour in the vertex's adjacency list: the key contains the \({\textit{nodeid}}\) of the neighbor, and the value is the current distance plus one.

To implement Dijkstra's algorithm using the MapReduce programming model, it has been necessary to implement the \({\textit{Map}}()\) and \({\textit{Reduce}}()\) functions as follows.

Map() is invoked in the Mapper task for each vertex of the graph. The Mapper emits different key-value pairs: a pair having the vertex itself as key and its adjacency list as value (to preserve the graph structure), and, for each adjacent vertex, a pair where the key is the adjacent vertex and the value is the updated distance (the current distance plus one).

Reduce(): for each key vertex, all the received distances are gathered together and the minimum among them is chosen. The gathering of distances is performed by the Hadoop framework, while the choice of the minimum distance is implemented by the user. The Reducer produces another key-value pair, where the key is the selected vertex and the value is its minimum distance.
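Since Listings 1 and 2 are given in pseudo-code, the following self-contained Python sketch (a reconstruction under the unit-weight assumption, not the code used in our experiments) shows the same Map() and Reduce() logic, together with an in-process simulation of one MapReduce iteration over a toy four-vertex graph:

```python
# One parallel-BFS iteration expressed as map and reduce functions over a
# unit-weight adjacency-list graph; the shuffle phase is simulated locally.
from collections import defaultdict

INF = float('inf')

def map_vertex(node, dist, adj):
    # re-emit the graph structure so the reducer can rebuild it
    yield node, ('STRUCT', dist, adj)
    if dist != INF:
        for n in adj:
            yield n, ('DIST', dist + 1, None)   # unit edge weight

def reduce_vertex(values):
    # keep the minimum distance seen and recover the adjacency list
    best, adj = INF, []
    for kind, d, a in values:
        best = min(best, d)
        if kind == 'STRUCT':
            adj = a
    return best, adj

def bfs_iteration(graph):
    """graph: {node: (distance, [neighbours])} -> state after one MR job."""
    shuffled = defaultdict(list)
    for node, (dist, adj) in graph.items():          # map phase
        for key, value in map_vertex(node, dist, adj):
            shuffled[key].append(value)              # sort/shuffle phase
    return {node: reduce_vertex(values)              # reduce phase
            for node, values in shuffled.items()}

# source vertex 1 at distance 0, all the others at infinity
g = {1: (0, [2, 3]), 2: (INF, [4]), 3: (INF, [4]), 4: (INF, [])}
g = bfs_iteration(g)
print(g)   # {1: (0, [2, 3]), 2: (1, [4]), 3: (1, [4]), 4: (inf, [])}
```

A second call to bfs_iteration would assign distance 2 to vertex 4, expanding the frontier by one more hop, as described next.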

Parallel breadth-first search is an iterative algorithm, in which each iteration corresponds to a MapReduce job. In the first iteration, the algorithm discovers all vertexes that are connected to the source. In the second iteration, all vertexes connected to those are discovered, and so on: with each iteration, the algorithm expands the search frontier by one hop.

A crucial aspect of the algorithm is the determination of the number of iterations it needs to complete its computation. An intuition comes from the "six degrees of separation" idea, which suggests that everyone on the planet is connected to everyone else by at most six steps (the people a person knows are one step away, the people they know are two steps away, etc.). In practical terms, we iterate the algorithm until no vertex distances remain \(\infty\).

The execution of an iterative MapReduce algorithm requires a non-MapReduce "driver" program, which submits a MapReduce job for each iteration, checks whether a termination condition has been met, and, if not, repeats. The iterative approach is realized using the Hadoop API to construct "counters", which can be used to count events that occur during execution, e.g., the number of corrupt records, the number of times a certain condition is met, or anything else the programmer desires. A counter can be defined to count the number of nodes whose distance is \(\infty\): at the end of the job, the final counter value is checked to see whether another iteration is necessary. The counter values of each worker are periodically propagated to the master, which aggregates them at the completion of the map and reduce operations and subsequently returns them to the user. Through the Reporter, the Mapper and Reducer can also communicate their progress (Fig. 7).
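A minimal sketch of such a driver is shown below (our reconstruction, not the actual driver code): the job submission is stubbed out with fake counter values so the control flow can run stand-alone. In a Python/Hadoop Streaming setting a reducer can increment a counter by writing a reporter:counter line to stderr, whereas the Java API uses the Reporter object:

```python
# Driver loop for the iterative parallel BFS: resubmit the MapReduce job
# until the UNREACHED counter (vertices still at infinite distance) is 0.
import itertools

def submit_bfs_job(iteration):
    # Stand-in for a real submission, e.g. invoking "hadoop jar ..." and
    # reading back a counter the reducers bump via
    # sys.stderr.write('reporter:counter:BFS,UNREACHED,1\n')
    return [3, 1, 0][min(iteration, 2)]   # fake counter values

def drive(max_hops=6):
    for it in itertools.count():
        unreached = submit_bfs_job(it)
        print('iteration %d: %d vertices unreached' % (it + 1, unreached))
        if unreached == 0:
            return it + 1                  # converged: no infinite distances
        if it + 1 >= max_hops:
            raise RuntimeError('no convergence within %d hops' % max_hops)

print('converged after %d iterations' % drive())
```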

Listing 1: Map function of the parallel breadth-first search (pseudo-code)
Listing 2: Reduce function of the parallel breadth-first search (pseudo-code)

Fig. 7 SDN QoS-aware framework modular structure

7 Performance Assessment

In this section, we evaluate the QoS-aware framework presented in Sect. 5, which is implemented as an external module on top of the Floodlight controller. In order to validate the system, we conducted a micro-benchmark comparing the performance of the sequential implementation of Dijkstra's algorithm with the distributed one. The distributed implementation uses a Hadoop cluster composed of three virtual machines (VMs), one acting as master and two as slaves. Each VM has the following hardware and software characteristics: 4 vCPUs @2.9 GHz, 8 GB of RAM and Ubuntu Server 16.04 LTS.

To evaluate the sequential implementation of Dijkstra's algorithm, we ran it on a VM with the same software and hardware configuration. The QoS-aware framework was also deployed on a VM with the same hardware and software configuration.

In our cluster we installed Apache Hadoop version 2.6.1 and JDK version 1.8. For cluster monitoring we used Ganglia version 3.7.1; the gmetad daemon was configured on the master node, while gmond daemons were configured on both slaves.

Fig. 8 Average execution time (seconds) of the distributed and sequential approaches, logarithmic scale

To evaluate the performance of the proposed system in SDN environments, we also used Mininet version 2.2.2 to generate the network topology, and deployed Floodlight controller version 1.2 with OpenFlow 1.3 on a VM with the same hardware and software characteristics specified above. Specifically, for each scenario, we derived the resulting graph from the network topology created with Mininet and calculated the shortest path between a source and a destination.

We performed a scalability analysis based on the size of the input dataset, evaluating the average execution time of both implemented approaches in different scenarios. Specifically, we generated our input datasets according to the network topologies arising in the Osmotic Computing scenario, with different sizes: (i) 10 nodes, (ii) 100 nodes, (iii) 1000 nodes and (iv) 10,000 nodes. We remark that in each scenario the specified number of nodes is randomly connected in order to create a weighted, directed and connected graph. To obtain reliable results, we performed 30 subsequent runs of the algorithm for both the distributed and the sequential approaches, and calculated the mean execution times with 95% confidence intervals.
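For reproducibility, a possible way to build such inputs is sketched below (an assumption about the generation procedure, not necessarily the exact one we used): a random spanning tree guarantees that every vertex is reachable from the root, after which extra random arcs and weights densify the graph:

```python
# A sketch of random weighted, directed, connected graph generation for
# the 10/100/1000/10000-node benchmark scenarios (parameters are examples).
import random

def random_connected_digraph(n, extra_arcs_per_node=2, seed=0):
    rng = random.Random(seed)
    arcs = {}
    nodes = list(range(n))
    rng.shuffle(nodes)
    for i in range(1, n):                      # spanning tree: every vertex
        u = nodes[rng.randrange(i)]            # reachable from nodes[0]
        v = nodes[i]
        arcs[(u, v)] = rng.uniform(1.0, 10.0)  # latency-like weight
    for _ in range(extra_arcs_per_node * n):   # densify with random arcs
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            arcs[(u, v)] = rng.uniform(1.0, 10.0)
    return arcs

g = random_connected_digraph(100)
print(len(g), 'arcs')
```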

Table 3 Summary of execution time

Figure 8 shows the average execution time of both the distributed and the sequential approaches. As the reader can see, in Fig. 8a-c the execution time of the distributed approach is higher than that of the sequential one. This behavior is due to the overhead introduced by intra-cluster node communications. A different behavior can be seen in Fig. 8d: the distributed implementation of Dijkstra's algorithm is faster than the sequential one, because the overhead introduced by internal communications is no longer dominant compared to the time required to process the data.

The execution time of the sequential implementation of Dijkstra's algorithm on a graph with 10,000 nodes is computationally onerous; in fact, the collected times are very high. Even adding more RAM does not yield any significant benefit in execution time. Table 3 summarizes the execution time for each scenario.

Figure 9 shows the time needed for the map and reduce tasks of the job, without considering any communication overhead. As we can see, these values are almost insignificant compared to those shown in Fig. 8.

Fig. 9 Total time spent by all Map/Reduce tasks, expressed in seconds

Fig. 10 Average In/Out throughput expressed in Bytes/s

A very interesting aspect of a MapReduce application is the volume of I/O bytes exchanged among the cluster nodes. This aspect cannot be analyzed for the sequential approach, because it runs on a single node and therefore no I/O bytes are exchanged. Figure 10a, b illustrates the I/O throughput within the cluster, as registered on \({\textit{Slave}}1\) and \({\textit{Slave}}2\) respectively.

At first sight, we notice that the I/O throughputs of the two slaves are not quite similar; this depends on the network links that interconnect the cluster VMs. Considering Slave1, except for the case with 100 nodes, the In throughput is always greater than the Out throughput. A different behaviour can be seen for Slave2: for the cases with 10 and 100 nodes the In throughput is greater than the Out throughput, while for the other cases the opposite behavior is shown. This may be because Slave1 finishes its tasks faster and therefore commits and generates more data. Another cause may lie in the communication: if Slave1 is slower to commit and respond due to communication latency, then Slave2, being faster to respond, executes more tasks.

These trends are explained by the two phases of a MapReduce job: shuffling and reducing. In particular, during the shuffling phase, each reducer contacts every node of the cluster to collect intermediate files, while during the reduce phase the final results of the whole job are written to HDFS. Therefore, as the number of input records increases, the shuffling and reducing phases deal with more data.

A drastic increase is registered using the input dataset with 10,000 nodes, where the amount of bytes in input and output increases by about 48% with respect to the previous cases. As expected, the greater the input dataset size, the greater the quantity of exchanged bytes.

Another performance parameter of our micro-benchmark is the CPU usage. The trend of the CPU usage percentage collected during the execution of the sequential approach is illustrated in Fig. 11. As the reader can see, its value is almost constant regardless of the input dataset size.

Fig. 11 CPU usage (%) of the sequential approach

Figures 12 and 13 show the percentage of CPU usage on the MapReduce cluster nodes (Slave1 and Slave2), obtained through Ganglia.

Fig. 12 Hadoop cluster: Slave1 CPU usage metrics

In these figures, the x axis represents time, and the y axis the percentage of CPU utilisation.

The CPU utilization plots of the cluster nodes, presented in Figs. 12 and 13, show that the Map phase of the MapReduce job expands when the input dataset size increases, since the number of InputSplits, and the corresponding number of issued Map tasks, rises.

The intermediate \(\langle {\textit{key}},{\textit{value}}\rangle\) pairs are evenly distributed among the reducers by the HashPartitioner; as a consequence, a slight increase in the duration of the Reduce phase is also observed when the size of the input file increases.

The idle period at the beginning of the execution corresponds to the job initialization phase performed by the JobTracker. The idle period between the Map and Reduce phases indicates that the Map task of the specific TaskTracker has finished, but the JobTracker is waiting for the completion of Map tasks running on other TaskTrackers before the Reduce phase of the job can start. The idle period after the Reduce phase indicates that the Reduce task of the specific TaskTracker has completed its execution, but the JobTracker is waiting for the Reduce tasks running on other TaskTrackers to finish as well.

As the input dataset size increases, there is a longer period of time characterized by a high percentage of CPU cycles, due to outstanding I/O and increased outgoing network traffic. This behavior is more evident with the input dataset of 10,000 nodes, where the period characterized by a high percentage of CPU usage lasts between 2 and 2.5 min. By contrast, the execution of the sequential approach with an input dataset of 10,000 nodes is characterized by a CPU usage of about 50% for a much longer period of about 2450 s.

Fig. 13 Hadoop cluster: Slave2 CPU usage metrics

8 Conclusions and Future Works

This paper investigated how the adoption of Hadoop MapReduce can speed up the execution of Dijkstra's algorithm in the implementation of SDNs. In order to evaluate this technique, we compared the sequential approach with the parallelised one. Experimental results show a performance improvement when the number of nodes increases, which is an interesting result for Osmotic Computing, covering several heterogeneous and complex systems. The numerical results obtained during the experimentation depend on some limitations of Hadoop, such as the overhead introduced by intra-cluster node communications. However, the impact of the Hadoop implementation on the whole Osmotic scenario is hard to investigate, and we will deal with this kind of analysis in future work. Moreover, we plan to design and implement a scheduler able to run either the sequential or the parallelised version of Dijkstra's algorithm based on the findings of this paper.