Feature selection for optical network design via a new mutual information estimator

doi:10.1016/j.eswa.2018.04.018

Expert Systems with Applications

Volume 107, 1 October 2018, Pages 72-88

Highlights

• Feature selection of topology parameters to support optical network design systems.
• New mutual information via entropy estimation to filter the most important parameters.
• Real-world and random-generated optical networks are under study.
• Some parameters related to congestion, connectivity, distance, and degree stand out.
• Wavelengths use, an NP-hard metric, explained by polynomial computation time features.

Abstract

An efficient design of optical networks is a complex challenge that requires knowledge of the desired performance trends. Such knowledge could benefit an expert system built for this purpose; for instance, it would help identify reliable topological parameters to characterize the desired behavior of the network. Feature selection from information theory is widely explored in many areas of expert and intelligent systems, and it is a suitable technique for choosing such parameters. In optical networks, many signals are carried along the same fiber, each one with its own wavelength. One possible performance goal is the minimal usage of different wavelengths, which can be influenced by many topological parameters established in the network design. However, it is difficult to determine the dependence between topological parameters and the number of wavelengths, because computing the latter is an NP-hard problem. We perform a comprehensive literature review to find topological metrics that are easier to compute and apply feature selection using a new mutual information estimator. Based on coincidence detection, this estimator is lightweight and easy to use, and it allows measuring the relevance between discrete and continuous features without discretization or estimation of probability density functions. For this purpose, tests are performed using 315 topological parameters from graph theory and complex networks, in 15 real-world optical networks and 2.2 million random topologies that mimic real-world ones. The topological parameters are ranked based on their mutual information values, yielding a set of the most influential ones for explaining the wavelength requirements. Among these parameters, the method highlights those derived from the edge betweenness. Moreover, some parameters proposed in the literature do not perform as expected. The results of this study can serve as a basis for new expert systems for the design and expansion of optical networks, driven by the most relevant topological parameters.

Introduction

Optical networks are currently the foundation of the extensive global digital communication network due to many factors such as their large traffic capacity, speed, and reach. In these networks, many independent channels can share the same optical fiber, increasing the data rate within the same infrastructure, i.e., many signals can be transported using the same optical fiber. Each signal uses a wavelength, and this wavelength should be different for each signal transported. The total number of these different wavelengths passing through the fibers is commonly a variable of interest in the literature. Conventional optical networks, such as Optical Transport Networks (OTN), use the Wavelength Division Multiplexing (WDM) technology, which allows the implementation of Wavelength Routed Optical Networks (WRON) (Banerjee & Mukherjee, 2000). A new generation of optical networks, called Elastic Optical Networks (EON), is based on Optical Orthogonal Frequency Division Multiplexing (OOFDM). This technology allows more flexible usage of the optical spectrum, with channels of different sizes, achieving higher spectral efficiency (Christodoulopoulos, Tomkos, & Varvarigos, 2011; Tessinari, Puype, Colle, & Garcia, 2016; Zhang, De Leenheer, Morea, & Mukherjee, 2013).

WRON networks involve the Routing and Wavelength Assignment (RWA) problem, which addresses the demand routing and wavelength allocation to the optical channels. For relatively small networks, the RWA can be solved to optimality using integer programming models, for instance as in Jaumard, Meyer, and Thiongane (2007) and Cousineau, Perron, Caporossi, Paiva, and Segatto (2015). There are many approaches with different objectives, but the prevalent one separates the RWA into two clear subproblems: demand routing, followed by wavelength allocation to the optical channels, with the objective of minimizing the number of wavelengths needed to meet the demand for the assigned routing. This objective function is often chosen in optical network design because the wavelength requirement is related to the cost and the capacity of networks.

The RWA problem includes the wavelength continuity constraint, which makes it an NP-hard problem (Zang, Jue, & Mukherjee, 2000). Under this constraint, each channel should use the same wavelength from the start to the end of the route. As a consequence, the constraint can fragment the available spectrum: wavelengths may be available on many links, but without continuity between consecutive links, which hampers the creation of routes with more than one link.

The so-called minimum number of wavelengths is the smallest number of wavelengths required to meet a given traffic demand. Hereafter, the minimum number of wavelengths is simply called the number of wavelengths and is denoted by λ. In the present study, λ is calculated exactly using the integer programming model proposed by Cousineau et al. (2015), as described in Section 2.1.

In EON networks, there is the corresponding Routing, Modulation, and Spectrum Assignment (RMSA) problem, which adds the non-overlapping constraint (Patel, Ji, Jue, & Wang, 2012) to the RWA. The optical spectrum is subdivided into slots, and channels of different sizes are created from combinations of the slots to meet the demands of different rates and requirements. The continuity constraint also applies to the channels, but the combination of channels with different sizes causes another type of fragmentation, given that only contiguous slots can form channels. A common strategy for addressing this problem is the subdivision of the spectrum into partitions, allocating to each one only channels of the same size. Hence, within each partition, the problem is reduced to the classical RWA (Wang & Mukherjee, 2014). In any case, because the RMSA is a more complex problem, the network should ideally have a low requirement for the number of wavelengths to meet the continuity constraint (Talebi et al., 2014).

Regardless of network type, an expert system for designing it often involves conflicting aspects, which leads to optimizing a set of features simultaneously, as in Yang, Wu, Chen, and Dai (2010), or different types of network flow, as in Przewoźniczek, Goścień, Walkowiak, and Klinkowski (2015). These problems have been extensively studied using all sorts of artificial intelligence methods, neural networks, and genetic algorithms (Hanay, Arakawa, & Murata, 2015). For this design process, linear programming models are also frequently used, as in Antunes, Craveirinha, and Climaco (1993) and Yoon, Baek, and Tcha (1998).

The decision of which variables to include in the design model is as important as, or more important than, the model itself. The choice of the features to be optimized thus defines the profile of the optical networks obtained. For instance, survivability, traffic capacity, and resource requirements such as the number of wavelengths depend on the variables chosen to be optimized in the design process.

In this work, we are particularly interested in investigating which topological parameters better explain wavelength requirements. A suitable technique to obtain these parameters is feature selection, a topic widely explored in many areas of expert and intelligent systems (Bennasar, Hicks, & Setchi, 2015). We perform a comprehensive literature review of graph topological parameters that are easier to compute than the number of wavelengths. Then, we apply a filter feature selection based on a new mutual information estimator to rank and select those parameters that lead to low wavelength usage. Bennasar et al. (2015) also work with a filter method based on mutual information, but in our case the mutual information estimator can be applied to discrete or continuous data alike, without discretization or any a priori knowledge concerning the source distribution. Filter methods have advantages such as computational efficiency and scalability in terms of the dataset dimensionality. The expected drawbacks of filter methods are the lack of information about the interaction between the parameters (the features) and the classifier, and the selection of redundant parameters (Bennasar et al., 2015).

A novel entropy estimator based on coincidence detection from Montalvão, Attux, and Silva (2014) is applied to define the new mutual information estimator used in this study. This estimator is lightweight and suitable for high-dimensional spaces, similar to the Neighborhood Mutual Information from Hu et al. (2011). Our new approach uses the so-called “Method of Coincidence”, a notion borrowed from Statistical Mechanics (Ma, 1985). In Section 2.3 this entropy estimator is presented, first for discrete variables, then extended to continuous ones. The resulting mutual information estimation allows measuring the relevance between discrete and continuous features, without discretization or any a priori knowledge concerning the source distribution. This technique is easy to use and suitable for small datasets, or for cases where the data is difficult to reproduce. Hu et al. (2011) also estimate entropy for discrete and continuous features without discretization. However, our approach based on the method of coincidence is more intuitive and does not need to estimate probability density functions.
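To make the idea of coincidence counting concrete, the following is a minimal sketch for discrete data only. It is not the estimator of Montalvão et al. (2014), whose definition is given in Section 2.3 and is not reproduced here. It assumes the textbook form of the coincidence argument: among all sample pairs, the fraction that coincide estimates the collision probability, and its negative logarithm gives a collision-type entropy estimate. The function names `coincidence_entropy` and `mutual_information` are hypothetical, and the combination I(X;Y) = H(X) + H(Y) − H(X,Y) is used only as an illustration.

```python
from collections import Counter
from math import log2


def coincidence_entropy(samples):
    """Collision-type entropy estimate (in bits) from pairwise coincidences.

    Assumption: entropy ~ log2(total pairs / coinciding pairs). This is an
    illustrative simplification, not the estimator used in the paper.
    """
    n = len(samples)
    total_pairs = n * (n - 1) // 2
    counts = Counter(samples)
    # A value observed c times contributes c*(c-1)/2 coinciding pairs.
    coincidences = sum(c * (c - 1) // 2 for c in counts.values())
    if coincidences == 0:
        raise ValueError("no coincidences observed; sample is too small")
    return log2(total_pairs / coincidences)


def mutual_information(xs, ys):
    """Illustrative estimate I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete data."""
    joint = list(zip(xs, ys))
    return (coincidence_entropy(xs)
            + coincidence_entropy(ys)
            - coincidence_entropy(joint))


# A feature identical to the target carries all of its (estimated) entropy,
# while a constant feature carries none.
x = [0, 1, 0, 1]
print(mutual_information(x, x))
print(mutual_information(x, [5, 5, 5, 5]))
```

With so few samples the estimates are crude. The sketch only shows that no histogram binning or density model is needed, which is the property the text emphasizes.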

The topological parameters are ranked based on their mutual information values, yielding a set of the most influential ones for explaining the wavelength requirements. Because mutual information does not assume linearity between the variables (Bennasar et al., 2015), relevant parameters are selected regardless of the form of their relationship with the wavelength requirement. The results of this study serve as a basis for new expert systems for the design and expansion of optical networks.

Some authors have studied topological parameters in WDM, as can be seen in Section 1.3, but less comprehensively than the present work and without producing any ranking of the parameters. To the best of our knowledge, no studies relating network topological features to EON parameters can be found in the literature. In the present study, optical networks are modeled as graphs; this modeling and the required graph theory concepts are presented in Section 1.1. In Section 1.2, we illustrate the problem with a concrete instance.

An intuitive way to represent an optical network is through graphs. A graph is denoted by G(V, E), or just G, where V is a set of vertices and E is a set of edges (Diestel, 2016). The vertices correspond to the network nodes, and the edges correspond to the links, so the edges connect the vertices in the same way the links connect the nodes. The number of vertices is the order of the graph, denoted as n = |V|. The number of edges is the size of the graph and is denoted as m = |E|.

From the graphs, it is possible to compute the so-called graph invariants, or topological invariants, which are numeric parameters that do not change when labels of vertices or edges change. Invariants are important because they represent topological parameters of the graph and, consequently, of the optical network modeled by the graph. Hence, graph invariants are computed in this study to understand how the network topological parameters influence the number of wavelengths (λ).

Graphs can have weights on their vertices or edges. In this case, the graph is called weighted; otherwise, it is called unweighted. A graph G(V,E) is connected if, for each pair of vertices u, v ∈ V, there is at least one path interconnecting u and v. The shortest paths in terms of number of edges are called geodesics. In unweighted graphs, the distance between two vertices u, v ∈ V is defined as the length of a geodesic interconnecting u and v. The mean distance of a graph is the mean of the distances over all pairs of vertices.
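The definitions of distance and mean distance can be illustrated with a short sketch. It assumes an unweighted, connected graph stored as an adjacency dictionary; the function names `bfs_distances` and `mean_distance` are hypothetical. Breadth-first search returns geodesic lengths because every edge counts as one hop.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic length (number of edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_distance(adj):
    """Mean of the distances over all unordered vertex pairs.

    Assumes the graph is connected, so every pair has a finite distance.
    """
    nodes = sorted(adj)
    total, pairs = 0, 0
    for i, u in enumerate(nodes):
        dist = bfs_distances(adj, u)
        for v in nodes[i + 1:]:
            total += dist[v]
            pairs += 1
    return total / pairs


# Path graph 0 - 1 - 2: distances are 1, 1 and 2, so the mean is 4/3.
path = {0: [1], 1: [0, 2], 2: [1]}
print(mean_distance(path))
```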

In addition, depending on the application, edges may or may not have a direction. When edges of a graph have a direction, the graph is said to be directed. In the context of optical networks, each link allows flows in both directions. Thus, only non-directed graphs are discussed in this work.

A simple graph is the one that (i) has at most one edge for each pair of vertices, i.e., without parallel edges, (ii) has no loops, i.e., without edges that start and end in the same vertex, and (iii) is non-directed.

Let G be a simple graph on n vertices and m edges. The maximum number of edges, m_max, in any simple graph on n vertices is a combination of n taken 2 at a time. Then, the edge density (α) of G is the ratio between m and m_max. This topological invariant can be used to estimate an amount of resources in the network.

Another property in graph theory is the vertex degree. For each vertex i, the number of edges connected to this vertex is called the vertex degree (or just degree) of this vertex (Diestel, 2016). The mean degree is the mean of the degrees of all vertices of a graph, and the variance degree is the variance of those same degrees.
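As a small worked example of the invariants just defined, the sketch below computes the edge density α = m / m_max, with m_max = n(n − 1)/2, together with the mean degree and the variance degree. The function names are hypothetical, and the use of the population variance is an assumption, since the text does not say which variance convention is adopted.

```python
from statistics import mean, pvariance


def edge_density(n, m):
    """Edge density: ratio between m and the maximum number of edges
    of a simple graph on n vertices, m_max = n(n-1)/2."""
    m_max = n * (n - 1) // 2
    return m / m_max


def degree_stats(n, edges):
    """Mean degree and variance degree of a simple graph given as an edge list.

    Assumption: population variance (division by n) is used.
    """
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return mean(degree), pvariance(degree)


# With n = 10 nodes and m = 15 links, m_max = 45 and the density is 1/3.
print(edge_density(10, 15))

# A 4-cycle is regular: every vertex has degree 2, so the variance is 0.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(degree_stats(4, cycle4))
```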

A graph is called 2-connected when, for each pair of vertices, there are at least two vertex-disjoint paths interconnecting the pair. This property is convenient for optical networks because, in case of a single vertex or link failure, the graph remains connected.
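The robustness reading of 2-connectivity suggests a direct, if inefficient, check: a simple graph with at least three vertices is 2-connected exactly when it is connected and stays connected after any single vertex is removed. The sketch below follows that brute-force route; the function names are hypothetical, and linear-time algorithms exist but are not shown here.

```python
def is_connected(adj, removed=None):
    """Check connectivity by depth-first search, optionally ignoring one vertex."""
    nodes = [v for v in adj if v != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def is_two_connected(adj):
    """Brute-force test: connected, and no single vertex removal disconnects it."""
    if len(adj) < 3 or not is_connected(adj):
        return False
    return all(is_connected(adj, removed=v) for v in adj)


# A ring survives the loss of any node; a path does not.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(is_two_connected(ring), is_two_connected(path))
```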

In the next section, we present a concrete instance of the problem treated in the present study via graph modeling, which motivates us to explore the optical network topologies to understand the wavelength requirements.

In optical network design, there is a huge number of possible topologies that can be generated from fixed numbers of nodes (n) and links (m), which are the most elementary topological parameters. For instance, there are 11,716,571 possible 2-connected topologies with 10 nodes. So it is not easy to find topologies that meet a set of desired parameters. To illustrate how wavelength requirements depend on topological parameters, Fig. 1 shows topologies with the same number of nodes (n = 10) and links (m = 15), but with different wavelength requirements under uniform traffic demand, which are calculated by the method of Cousineau et al. (2015), as established in Section 1.

Efficient optical network design is a complex challenge. Among the multiple ways to connect nodes using links, the control of wavelength requirements is not easy. To reduce this complexity, instead of directly using wavelength requirement values it is suitable to explore the topological parameters of optical networks and identify how these parameter values can influence the wavelength requirements.

Thus, this study aims to find topological parameters, which can be obtained with low computational cost and that can be optimized to lead to a low wavelength requirement. Mutual information is used to select the most influential ones. The knowledge of such parameters can better guide the network topology design to expand a current optical network or create new networks.

As we model optical networks via graphs, the topological parameters correspond to the invariants from graph theory. A complete list of the invariants analyzed in this paper can be found in Appendix A.

During the design phase of optical networks, a set of invariants can be used to characterize network topologies and identify the ones that optimize certain parameters of interest, such as λ. Over time, some studies in the literature have made an effort in this direction, as can be seen in the following section.

Baroni and Bayvel (1997) are the first to analyze the wavelength requirements of real-world networks and a large number of randomly connected WRON networks for wide-area backbone applications. The networks considered in the study are simple, unweighted, 2-connected, and satisfy 0.1 < α < 0.4. The wavelength continuity constraint is assumed. Uniform traffic demand is also assumed, i.e., for each pair of nodes {i, j} in the network, there is a single demand from i to j, where each demand uses a geodesic and a single wavelength from the start to the end of the transmission. Through a heuristic, the number of wavelengths of the networks under study is estimated and then analyzed with respect to some invariants, such as the number of vertices and the edge density. Notice that in that paper the edge density is called physical connectivity. According to the observations, the average λ requirement is practically independent of the network order but strongly decreases as edge density increases.
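The uniform-traffic assumption can be sketched as follows: one demand per unordered node pair is routed on a geodesic, and the number of demands crossing each link is counted. For a fixed routing, every demand on a link needs its own wavelength, so the largest link load is a lower bound on the wavelengths needed by that particular routing. This is only an illustration and not the heuristic of Baroni and Bayvel (1997): the function names are hypothetical, the graph is assumed connected, and the geodesic chosen for each pair is simply the one found first by breadth-first search.

```python
from collections import Counter, deque


def geodesic_tree(adj, source):
    """Breadth-first parents: following them from any vertex gives a geodesic."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):  # sorted for a deterministic choice
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def link_loads(adj):
    """Number of demands on each link, with one demand per unordered node pair."""
    load = Counter()
    for s in sorted(adj):
        parent = geodesic_tree(adj, s)
        for t in sorted(adj):
            if t <= s:
                continue
            v = t
            while parent[v] is not None:  # walk back from t to s
                u = parent[v]
                load[(min(u, v), max(u, v))] += 1
                v = u
    return load


ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
loads = link_loads(ring)
print(dict(loads))
print(max(loads.values()))  # lower bound on wavelengths for this routing
```

A different choice among equally short routes can balance the loads better, which is one reason why routing and wavelength assignment are optimized jointly in exact models.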

In Fenger, Limal, Gliese, and Mahon (2000), wavelength requirements in relation to the topological parameters of WDM optical networks are also analyzed. For this purpose, a few million random 2-connected networks with a certain number of nodes and links are generated, and the analyzed invariants of these networks are mean degree, variance degree, and number of spanning trees, all of which are described in Appendix A. General results are obtained with the average of each topological invariant studied. In this study, the authors also assume simple and unweighted graphs, uniform traffic, routing using geodesics, and wavelength continuity. It is also considered that there is no limit on the number of wavelengths that can pass through each link. The results obtained for networks with 30 vertices and 45 edges show that the average of λ increases with variance degree. The authors note that more regular networks, where the vertices tend to have closer degree values, require fewer wavelengths. In turn, observations in networks with the number of vertices and edges equal to 10 and 20, 20 and 30, 20 and 45, and 30 and 45, respectively, show an inversely proportional relation between the number of spanning trees and the average of λ. This relation is also observed for the mean degree, with tests performed in networks with 10, 20 or 30 vertices. A considerable decrease in wavelength requirements is found as the mean degree increases to somewhere between 4 and 5, without significant advantages for networks with higher mean degrees. Thus, using the analyzed sample, the study concludes that networks with a mean degree between 4 and 5 and with low variance degree are the best possible in terms of wavelength requirements. It is also concluded that the number of spanning trees is a very accurate invariant to measure the network quality in terms of the traffic accommodation efficiency, given that the observations show a power-law relation between the number of spanning trees and the average of λ.
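The number of spanning trees is one of the invariants that can be obtained in polynomial time. A minimal sketch, assuming Kirchhoff's matrix-tree theorem (the count equals any cofactor of the graph Laplacian) and exact rational arithmetic to avoid rounding, is given below. The function name `count_spanning_trees` is hypothetical, and this is not necessarily how Fenger et al. (2000) or the present study compute it.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of a simple graph on vertices 0..n-1.

    Uses the matrix-tree theorem: delete the last row and column of the
    Laplacian and take the determinant by Gaussian elimination.
    """
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    size = n - 1
    mat = [row[:size] for row in lap[:size]]
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if mat[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= mat[col][col]
        for r in range(col + 1, size):
            factor = mat[r][col] / mat[col][col]
            for c in range(col, size):
                mat[r][c] -= factor * mat[col][c]
    return int(det)


# A 4-cycle has 4 spanning trees (drop any one edge); the complete graph K4 has 16.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_spanning_trees(4, cycle4), count_spanning_trees(4, k4))
```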

In the study by Châtelain, Bélanger, Tremblay, Gagnon, and Plant (2009), a set of 18 real networks with the number of nodes ranging from 11 to 53 is analyzed, as is a set of 1.5 × 10⁶ random networks with the number of nodes ranging from 10 to 50, in steps of 10. This study only analyzes random networks with the number of edges equal to 1.5, 2 and 2.5 times the number of nodes. For the set of real networks, it is found that 0.05 < α < 0.45, where the lowest values are associated with the largest networks. The number of wavelengths used in the study is estimated by a lower bound. For random networks, a power-law relation is established between the average of λ and the algebraic connectivity (described in Appendix A). From this relation, a concise equation is derived to predict wavelength requirements, which is tested in real long-distance networks. The authors state that the estimation of wavelength requirements based on algebraic connectivity is more accurate than the estimations performed using the variance degree, the number of spanning trees, and the mean distance.

Yuan and Xu (2010) also study wavelength requirements, but for optical networks with small-world and scale-free physical topologies. Two characteristics are analyzed: mean distance and 1-shell structure (Bickle, 2013). Wavelength requirements are also studied in the case of network evolution. About a hundred networks with 100 nodes and 200 links are analyzed. The traffic demand used for the RWA is not necessarily all-to-all, the links allow bidirectional flow, and the routing does not force the use of geodesics. The observations show that the mean distance traveled by the signal is directly proportional to the mean distance. Also according to the observations, networks with shorter mean distances tend to require fewer wavelengths. For these networks, the growth rate of the wavelength requirements is lower when traffic volume and network order increase. Additionally, the presence of 1-shell structures can increase the wavelength requirements: the higher the number of 1-shells, the more wavelengths are required by the network.

We emphasize that, in almost all works discussed above, (i) the average of the number of wavelengths is considered, which can hide extremal cases of excessive or reduced wavelength requirements; and (ii) the number of wavelengths is computed using heuristics or estimated by lower bounds, which can lead to imprecise relations between topological invariants and wavelength requirements. In contrast, our work considers extremal wavelength requirements and computes them as optimal solutions of an ILP model (Cousineau et al., 2015). These are two important aspects of our methodology, which is described in the next section.

Methodology

The methodology consists, in short, of generating a random sample of optical networks, which are represented by graphs, and verifying in this sample which topological invariants have the greatest influence on the number of wavelengths. This inspection is performed by applying the new mutual information estimation presented in Section 2.3. This analysis is also performed for real networks, and the results are compared.

Literature presents many methods to compute mutual information with good

All networks of all orders together

The mutual information, I(λ;k), is estimated for all k = 1,..., 315 invariants, with all random networks described in Section 2.1, using the methodology proposed in Section 2.3. To make this computation feasible for such a large set of data as the one in this study, the entire data matrix with 315 columns (number of invariants) by $2.2 \times 10^{6} - 52$ rows (number of networks) is divided into 100 equal parts (S₁, S₂, ..., S₁₀₀) through systematic separation. The lines are numbered from 1 to $2.2 \times 10^{6} - 52,$
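The excerpt ends before the assignment rule is stated, so the following is only one common reading of systematic separation, given here as an assumption: rows are numbered from 1, and row i goes to part ((i − 1) mod 100) + 1, so that each part takes every hundredth row. The function name `systematic_split` is hypothetical.

```python
def systematic_split(num_rows, num_parts):
    """Assign 1-based row numbers to parts in round-robin (systematic) order.

    Assumption: row i goes to part ((i - 1) mod num_parts) + 1. The source
    text is truncated before it states the actual rule.
    """
    parts = [[] for _ in range(num_parts)]
    for row in range(1, num_rows + 1):
        parts[(row - 1) % num_parts].append(row)
    return parts


# Small illustration with 10 rows and 3 parts.
for part in systematic_split(10, 3):
    print(part)
```

Under this rule the part sizes differ by at most one row whenever the number of rows is not a multiple of the number of parts.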

Conclusion

Expert systems for optical network design currently deal with conflicting aspects such as computational and network resources (Yang et al., 2010) or different types of network flows (Przewoźniczek et al., 2015). These approaches are usually hard not merely because of their computational complexity but also because of the immense scale of their solution space. These problems have been extensively studied using all sorts of artificial intelligence methods, neural networks, and genetic algorithms (

Acknowledgment

This work is partially supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq, with projects number 462477/2014-2, 304564/2016-8, and 304853/2015-1.

References (34)

C.H. Antunes et al., A multiple criteria model for new telecommunication service planning, European Journal of Operational Research (1993)
M. Bennasar et al., Feature selection using joint mutual information maximisation, Expert Systems with Applications (2015)
M. Cousineau et al., RWA problem with geodesics in realistic OTN topologies, Optical Switching and Networking (2015)
D.R. De Araújo et al., An evolutionary approach with surrogate models and network science concepts to design optical networks, Engineering Applications of Artificial Intelligence (2015)
Y.S. Hanay et al., Network topology selection with multistate neural memories, Expert Systems with Applications (2015)
Q. Hu et al., Measuring relevance between discrete and continuous features based on neighborhood mutual information, Expert Systems with Applications (2011)
B. Jaumard et al., Comparison of ILP formulations for the RWA problem, Optical Switching and Networking (2007)
A.N. Patel et al., Routing, wavelength assignment, and spectrum allocation algorithms in transparent flexible optical WDM networks, Optical Switching and Networking (2012)
M. Przewoźniczek et al., Towards solving practical problems of large solution space using a novel pattern searching hybrid evolutionary algorithm–an elastic optical network optimization case study, Expert Systems with Applications (2015)
R.S. Tessinari et al., ElasticO++: An elastic optical network simulation framework for OMNeT++, Optical Switching and Networking (2016)
R. Wang et al., Spectrum management in heterogeneous bandwidth optical networks, Optical Switching and Networking (2014)
Y. Yang et al., Multi-objective optimization based on ant colony optimization in grid over optical burst switching networks, Expert Systems with Applications (2010)
M. Yoon et al., Design of a distributed fiber transport network with hubbing topology, European Journal of Operational Research (1998)
D. Banerjee et al., Wavelength-routed optical networks: Linear formulation, resource budgeting tradeoffs, and a reconfiguration study, IEEE/ACM Transactions on Networking (2000)
S. Baroni et al., Wavelength requirements in arbitrarily connected wavelength-routed optical networks, Journal of Lightwave Technology (1997)
A. Bickle, Cores and shells of graphs, Mathematica Bohemica (2013)
B. Châtelain et al., Topological wavelength usage estimation in transparent wide area networks, Journal of Optical Communications and Networking (2009)
