Optimal network flow: A predictive analytics perspective on the fixed-charge network flow problem

doi:10.1016/j.cie.2016.07.030

Computers & Industrial Engineering

Volume 99, September 2016, Pages 260-268

https://doi.org/10.1016/j.cie.2016.07.030

Highlights

• A predictive model is investigated to determine whether arcs are selected in an optimal solution of an FCNF problem.
• The accuracy of the predictive model is very high.
• The model has useful explanatory power regarding the predictors defined.
• A component importance measure is developed to rank the arcs in the network.

Abstract

The fixed charge network flow (FCNF) problem is a classical NP-hard combinatorial problem with widespread applications. To the best of our knowledge, this is the first paper that employs a statistical learning technique to analyze and quantify the effect of various network characteristics on the optimal solution of the FCNF problem. In particular, we create a probabilistic classifier based on 18 network related variables to produce a quantitative measure of the likelihood that an arc in the network will have a non-zero flow in an optimal solution. The predictive model achieves 85% cross-validated accuracy. An application employing the predictive model is presented from the perspective of identifying critical network components based on the likelihood of an arc being used in an optimal solution.

Introduction

The fixed charge network flow (FCNF) problem can be described as follows. For a given network, each node may have a supply or demand commodity requirement, and each arc has variable and/or fixed costs associated with commodity flow. The aim of the FCNF is to select arcs and assign feasible flows to them so that commodities are transferred from supply nodes to demand nodes at minimal total cost. The transportation problem (Balinski, 1961, El-Sherbiny and Alhamali, 2013), lot sizing problem (Steinberg & Napier, 1980), facility location problem (Aikens, 1985, Daskin, 1995), network design problem (Costa, 2005, Ghamlouche et al., 2003, Lederer and Nambimadom, 1998) and others (Armacost et al., 2002, Jarvis et al., 1978) can be modeled as an FCNF problem.

The FCNF problem is known to be NP-hard (Guisewite & Pardalos, 1990). A significant amount of effort has been invested in studying and developing efficient approaches to the FCNF. Many techniques utilize branch and bound to search for an exact solution to the FCNF (Barr et al., 1981, Cabot and Erenguc, 1984, Driebeek, 1966, Hewitt et al., 2010, Kennington and Unger, 1976, Ortega and Wolsey, 2003, Palekar et al., 1990). Branch and bound, however, may be inefficient because the linear relaxation step often lacks tight bounds. Heuristic approaches for finding near-optimal solutions of the FCNF have generated considerable research interest (Adlakha and Kowalski, 2010, Antony Arokia Durai Raj, 2012, Balinski, 1961, Kim and Pardalos, 1999, Molla-Alizadeh-Zavardehi et al., 2011, Monteiro et al., 2011, Sun et al., 1998). State-of-the-art MIP solvers combine a variety of cutting plane techniques, heuristics and the branch and bound algorithm to find a globally optimal solution. Modern MIP solvers use preprocessing methods to reduce the search space by taking information from the original formulations, which significantly accelerates the solution process (Bixby, Fenelon, Gu, Rothberg, & Wunderling, 2000). In this paper, we take a decidedly different approach to leveraging information from the problem formulation and FCNF instances. That is, we are interested in gaining information about how the various topological and component characteristics relate to the selection of arcs used to transmit the optimal flow. At this time, we have found no literature that approaches a study of the FCNF problem from the perspective of statistical learning.

FCNF formulations are useful in many practical problems. Modern societies are heavily dependent on distributed systems, e.g. communication networks (Cohen, Erez, Ben-Avraham, & Havlin, 2000), electric power transmission networks (Dobson, Carreras, Lynch, & Newman, 2007), and transportation networks (Zheng, Gao, & Zhao, 2007). Designing and maintaining such systems is an important research area in network science. In particular, developing resilient network infrastructures (i.e., resilient with respect to natural disasters or intentional attacks) is of utmost importance and the ability to identify critical components in complex networks has reached a level of national urgency (Birchmeier, 2007). The destruction or damage of one or more critical components in a networked system could have significant consequences in terms of overall system performance (Bell, 2000, Smith et al., 2003). The definition of component criticality is often associated with an overall network performance metric. A component whose hypothetical failure most impacts the network performance level is identified as critical. A substantial body of work using a variety of methods has focused on identifying critical components within networks, e.g. topological approach (Bompard et al., 2009, Crucitti et al., 2005), simulation (Eusgeld, Kröger, Sansavini, Schläpfer, & Zio, 2009), optimization (Bier et al., 2007, Shen et al., 2012, Zio et al., 2012), service measure (Dheenadayalu et al., 2004, Scott et al., 2006) and graph theory (Demšar, Špatenková, & Virrantaus, 2008). In this study we consider an application of our statistical model with respect to identifying critical components wherein the minimum total commodity routing cost, inclusive of fixed costs, is the overall network performance metric.

To the best of our knowledge, no existing work has developed models to help characterize predictive network features of optimal solutions to the FCNF. More broadly, little work has been published so far on the application of statistical learning to traditional optimization or network problems. Rocco and Muselli (2004, 2005) developed a decision tree and a Hamming clustering model to predict network connectivity reliability in graphs. Hamming clustering is applicable only if both the predicted value and all predictors are binary (Muselli & Liberati, 2002). The binary predictions relating to connectivity were made based on a single type of predictor: the status of each arc in the graph as either failed or operating. Based on this information they attempted to evaluate the reliability of origin-destination connectedness. Empirically, they created one network instance (11 nodes, 21 edges) and randomly sampled from the possible state space of edge failures. Among the $2^{21}$ possible states, 2000 were assigned to a training set and 1000 to a test set. The models were developed on the 2000 training observations, and highly accurate predictions were observed on the test set. While the predictive models developed were highly accurate, they are inherently linked to the single network instance considered.

In this study we employ a statistical learning technique to analyze the data associated with optimal FCNF solutions and we develop a relatively generalizable model based on several salient network features to predict which arcs will be used in an optimal solution. By solving thousands of generated FCNF instances we collect over 60,000 observations and develop a logistic regression model based on the dataset. This model allows us to quantify the influence of several important network characteristics. The resulting model has several potential applications. In this study, we demonstrate an application for providing an alternative approach to identifying critical network components. The remainder of this paper is organized as follows. Section 2 introduces the background of the FCNF and the logistic regression model. The process for developing the predictive model is discussed in Section 3. The identification of critical components using the model is presented in Section 4. Section 5 summarizes the results and introduces planned future work.
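As a rough illustration of the modeling idea described above, the following sketch fits a logistic regression by batch gradient descent and estimates its accuracy with k-fold cross-validation. It is not the authors' model: the two predictors (`lp_flow`, `fixed_ratio`), the synthetic data, and all function names are assumptions made for illustration, and the paper's 18 predictors are not reproduced here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    # w[0] is the intercept; w[1:] are the feature coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def cross_validated_accuracy(X, y, k=5):
    """Pooled held-out accuracy over k contiguous folds."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold * n // k, (fold + 1) * n // k))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        w = fit_logistic(X_tr, y_tr)
        correct += sum(
            (predict_proba(w, X[i]) >= 0.5) == (y[i] == 1) for i in test_idx
        )
    return correct / n


# Synthetic toy data (NOT from the paper): one row per arc, label 1 if the
# arc is "used". Both predictors are hypothetical stand-ins.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    lp_flow = rng.random()       # assumed: scaled flow in an LP relaxation
    fixed_ratio = rng.random()   # assumed: scaled fixed cost of the arc
    used = 1 if lp_flow - fixed_ratio + rng.gauss(0, 0.15) > 0 else 0
    X.append([lp_flow, fixed_ratio])
    y.append(used)

weights = fit_logistic(X, y)
accuracy = cross_validated_accuracy(X, y)
print(f"intercept and coefficients: {[round(v, 2) for v in weights]}")
print(f"cross-validated accuracy on toy data: {accuracy:.2f}")
```

The sign and size of each fitted coefficient is what gives a logistic model its explanatory value: a positive coefficient raises the predicted odds that an arc carries flow, a negative one lowers them.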

Section snippets

Fixed charge network flow problem

The fixed charge network flow (FCNF) problem is described on a network $G = (N, A)$, where $N$ and $A$ are the sets of nodes and arcs, respectively. Let $c_{ij}$ and $f_{ij}$ denote the variable and fixed cost of arc $(i, j) \in A$, respectively. Each node $i \in N$ has a commodity requirement $r_{i}$ associated with it (if it is a supply node, $r_{i} > 0$; if a demand node, $r_{i} < 0$; if a transshipment node, $r_{i} = 0$). An arc parameter $M_{ij}$ is used in the problem formulation to ensure that the fixed cost $f_{ij}$ is incurred whenever there is a
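For orientation, a common textbook way to write the FCNF as a mixed-integer program is sketched below. It uses the symbols defined in this snippet; the flow variable $x_{ij}$ and the binary arc-selection variable $y_{ij}$ are assumed names, and the paper's own formulation may differ in detail.

```latex
\begin{align}
\min \quad & \sum_{(i,j) \in A} \left( c_{ij} x_{ij} + f_{ij} y_{ij} \right) \\
\text{s.t.} \quad & \sum_{j:(i,j) \in A} x_{ij} - \sum_{j:(j,i) \in A} x_{ji} = r_{i}, && \forall i \in N, \\
& 0 \le x_{ij} \le M_{ij} y_{ij}, && \forall (i,j) \in A, \\
& y_{ij} \in \{0, 1\}, && \forall (i,j) \in A.
\end{align}
```

Under this sketch, the sign convention matches the definition of $r_{i}$: outflow minus inflow equals $r_{i}$, which is positive at supply nodes and negative at demand nodes. The coupling constraint $x_{ij} \le M_{ij} y_{ij}$ forces $y_{ij} = 1$, and therefore the charge $f_{ij}$, on any arc with positive flow.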

Feature engineering

Feature engineering is a term from machine learning used to denote the process of determining and/or deriving the predictor variables used in a model. Based on initial testing we derive four types of predictors for the classification model: overall network level characteristics, arc specific attributes, linear relaxation based variables, and lastly, variables related to the nodes incident to an arc. These predictors are developed with a basic guiding principle of being easily understandable and
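To make the predictor categories concrete, the sketch below derives a few illustrative features for a single arc. Every feature name is hypothetical and chosen only to show one example per category; the paper's actual 18 predictors are not listed in this snippet. Linear relaxation based variables are omitted because they would require an LP solver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    tail: str
    head: str
    var_cost: float    # c_ij
    fixed_cost: float  # f_ij


def arc_features(arc, arcs, requirements):
    """Return illustrative (hypothetical) predictors for one arc.

    arcs:         list of all Arc objects in the network
    requirements: dict mapping node -> r_i (supply > 0, demand < 0)
    """
    n_nodes = len(requirements)
    n_arcs = len(arcs)
    mean_fixed = sum(a.fixed_cost for a in arcs) / n_arcs
    return {
        # Network level characteristics
        "n_nodes": n_nodes,
        "density": n_arcs / (n_nodes * (n_nodes - 1)),
        # Arc specific attributes
        "var_cost": arc.var_cost,
        "fixed_cost_vs_mean": arc.fixed_cost / mean_fixed,
        # Variables related to the nodes incident to the arc
        "tail_requirement": requirements[arc.tail],
        "head_requirement": requirements[arc.head],
        "tail_out_degree": sum(1 for a in arcs if a.tail == arc.tail),
        "head_in_degree": sum(1 for a in arcs if a.head == arc.head),
    }


# Tiny made-up network: supply node s, transshipment node m, demand node t.
requirements = {"s": 10, "m": 0, "t": -10}
arcs = [
    Arc("s", "t", var_cost=4.0, fixed_cost=20.0),
    Arc("s", "m", var_cost=1.0, fixed_cost=5.0),
    Arc("m", "t", var_cost=1.0, fixed_cost=5.0),
]
features = arc_features(arcs[0], arcs, requirements)
print(features)
```

Scaling an arc's fixed cost by the network mean, as in `fixed_cost_vs_mean`, is one way to keep a predictor comparable across instances of different cost magnitudes; whether the authors normalize this way is not stated here.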

Critical components identification

The logistic regression model successfully discriminates between “optimal” and “non-optimal” arcs in FCNF solutions. In this section we develop and demonstrate an application of such information for critical network component identification. A component importance measure (CIM) is often computed to rank nodes or arcs in terms of their potential impact on a network performance measure. The performance measure we use is the FCNF optimal objective value. Since the FCNF problem is NP-hard,
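One simple reading of the application above is to score each arc by its predicted probability of carrying flow in an optimal solution and rank arcs by that score. The sketch below does only that; the coefficients, intercept, and feature names are hypothetical, and the paper's actual component importance measure is not visible in this snippet and may be defined differently.

```python
import math


def usage_probability(features, coefficients, intercept):
    """Logistic model: P(arc is used) = 1 / (1 + exp(-(b0 + sum_k b_k * x_k)))."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_arcs(features_by_arc, coefficients, intercept):
    """Return (arc, probability) pairs sorted from most to least likely used."""
    scored = [
        (arc, usage_probability(feats, coefficients, intercept))
        for arc, feats in features_by_arc.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fitted coefficients and per-arc predictors (illustration only).
coefficients = {"lp_flow": 4.0, "fixed_ratio": -3.0}
intercept = -0.5
features_by_arc = {
    ("s", "t"): {"lp_flow": 0.9, "fixed_ratio": 0.2},
    ("s", "m"): {"lp_flow": 0.1, "fixed_ratio": 0.8},
    ("m", "t"): {"lp_flow": 0.5, "fixed_ratio": 0.5},
}

ranking = rank_arcs(features_by_arc, coefficients, intercept)
for arc, prob in ranking:
    print(arc, round(prob, 3))
```

A ranking of this kind needs only one model evaluation per arc, whereas a measure based on removing each arc and re-solving the FCNF would require one NP-hard solve per arc.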

Conclusions

In this investigation we develop a predictive model to determine whether arcs are selected for flow in an optimal solution of an FCNF problem. To do so, we generate and solve over 1000 FCNF instances. The final model, based on 18 derived network related features, allows for high quality discrimination of “optimal” and “non-optimal” arcs. Application to larger FCNF instances retains the predictive performance.

Since we employ a logistic regression technique, the model also has useful

References (57)

C. Aikens, Facility location models for distribution planning, European Journal of Operational Research (1985)
K. Antony Arokia Durai Raj et al., A genetic algorithm for solving the fixed-charge transportation model: Two-stage problem, Computers & Operations Research (2012)
M. Bell, A game theory approach to measuring the performance reliability of transport networks, Transportation Research Part B: Methodological (2000)
V. Bier et al., Methodology for identifying near-optimal interdiction strategies for a power transmission system, Reliability Engineering & System Safety (2007)
E. Bompard et al., Analysis of structural vulnerabilities in power transmission grids, International Journal of Critical Infrastructure Protection (2009)
J. Burez et al., Handling class imbalance in customer churn prediction, Expert Systems with Applications (2009)
H. Camdeviren et al., Comparison of logistic regression model and classification tree: An application to postpartum depression data, Expert Systems with Applications (2007)
A. Costa, A survey on Benders decomposition applied to fixed-charge network design problems, Computers & Operations Research (2005)
M. El-Sherbiny et al., A hybrid particle swarm algorithm with artificial immune learning for solving the fixed charge transportation problem, Computers & Industrial Engineering (2013)
I. Eusgeld et al., The role of network theory and object-oriented modeling within a framework for the vulnerability analysis of critical infrastructures, Reliability Engineering & System Safety (2009)
D. Kim et al., A solution approach to the fixed charge network flow problem using a dynamic slope scaling procedure, Operations Research Letters (1999)
S. Molla-Alizadeh-Zavardehi et al., Solving a capacitated fixed-charge transportation problem by artificial immune and genetic algorithms with a Prüfer number representation, Expert Systems with Applications (2011)
D. Scott et al., Network robustness index: A new method for identifying critical links and evaluating the performance of transportation networks, Journal of Transport Geography (2006)
S. Shen et al., Exact interdiction models and algorithms for disconnecting networks via node deletions, Discrete Optimization (2012)
M. Sun et al., A tabu search heuristic procedure for the fixed charge transportation problem, European Journal of Operational Research (1998)
E. Zio et al., Identifying groups of critical edges in a realistic electrical network by multi-objective genetic algorithms, Reliability Engineering & System Safety (2012)
V. Adlakha et al., A heuristic algorithm for the fixed charge problem, Opsearch (2010)
H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control (1974)
C. Armacost et al., Composite variable formulations for express shipment service network design, Transportation Science (2002)
M. Balinski, Fixed-cost transportation problems, Naval Research Logistics Quarterly (1961)
R. Barr et al., A new optimization method for large scale fixed charge transportation problems, Operations Research (1981)
J. Birchmeier, Systematic assessment of the degree of criticality of infrastructures
E. Bixby et al., MIP: Theory and practice closing the gap
A. Cabot et al., Some branch-and-bound procedures for fixed-cost transportation problems, Naval Research Logistics Quarterly (1984)
N. Chawla et al., Editorial: Special issue on learning from imbalanced data sets, ACM SIGKDD Explorations Newsletter (2004)
R. Cohen et al., Resilience of the internet to random breakdowns, Physical Review Letters (2000)
P. Crucitti et al., Locating critical lines in high-voltage electrical power grids, Fluctuation and Noise Letters (2005)
E. Danna et al., Exploring relaxation induced neighborhoods to improve MIP solutions, Mathematical Programming (2005)

