A parallel hub-and-spoke system for large-scale scenario-based optimization under uncertainty

Knueven, Bernard; Mildebrath, David; Muir, Christopher; Siirola, John D.; Watson, Jean-Paul; Woodruff, David L.

doi:10.1007/s12532-023-00247-3

Full Length Paper
Published: 14 August 2023

Volume 15, pages 591–619, (2023)

Mathematical Programming Computation

Bernard Knueven¹,
David Mildebrath²,
Christopher Muir³,
John D. Siirola⁴,
Jean-Paul Watson⁵ &
…
David L. Woodruff⁶

Abstract

Practical solution of stochastic programming problems generally requires the use of parallel computing resources. Here, we describe the open source package mpi-sppy, in which efficient and scalable parallelization is a central feature. We report computational experiments that demonstrate the ability to solve very large stochastic programming problems, including mixed-integer variants, in minutes of wall clock time, efficiently leveraging significant parallel computing resources. We report results for the largest publicly available instances of stochastic mixed-integer unit commitment problems, solving to provably tight optimality gaps. In addition, we introduce a novel software architecture that allows methods for accelerating convergence to be combined in a plug-and-play manner. The mpi-sppy package is written in Python, leverages the widely used Pyomo (http://www.pyomo.org) library for modeling mathematical programs, builds on existing MPI implementations to ensure efficiency and scalability, and is available via http://github.com/Pyomo/mpi-sppy.

Availability of data and material

The data are available at https://github.com/Pyomo/mpi-sppy.

References

1. Benders, J.F.: Partitioning procedures for solving mixed-variables programming problems. Numer. Math. 4(1), 238–252 (1962)
2. Biel, M., Johansson, M.: Efficient stochastic programming in Julia. INFORMS J. Comput. to appear (2021)
3. Birge, J.R., Louveaux, F.: Introduction to Stochastic Programming. Springer, Cham (1997)
4. Boland, N., Christiansen, J., Dandurand, B., Eberhard, A., Linderoth, J., Luedtke, J., Oliveira, F.: Combining progressive hedging with a Frank-Wolfe method to compute Lagrangian dual bounds in stochastic mixed-integer programming. SIAM J. Optim. 28(2), 1312–1336 (2018)
5. Cheung, K.W., Gade, D., Silva-Monroy, C., Ryan, S.M., Watson, J.P., Wets, R.J., Woodruff, D.L.: Scalable stochastic unit commitment, part 2: solver performance. Energy Syst. 6(3), 417–438 (2015)
6. Chiralaksanakul, A., Morton, D.P.: Assessing Policy Quality in Multi-stage Stochastic Programming. Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II, Institut für Mathematik (2004)
7. Crainic, T.G., Hewitt, M., Rei, W.: Scenario grouping in a progressive hedging-based meta-heuristics for stochastic network design. Comput. Oper. Res. 43, 90–99 (2014)
8. Dalcin, L., Fang, Y.L.L.: mpi4py: Status update after 12 years of development. Comput. Sci. Eng. 23(4), 47–54 (2021)
9. Dowson, O., Kapelevich, L.: SDDP.jl: a Julia package for stochastic dual dynamic programming. INFORMS J. Comput. 33(1), 27–33 (2021)
10. Fair Isaac Corporation: Xpress optimizer reference manual (2020). https://www.fico.com/en/products/fico-xpress-solver
11. Fischetti, M., Salvagnin, D., Zanette, A.: A note on the selection of Benders’ cuts. Math. Program. 124(1), 175–182 (2010)
12. Gade, D., Hackebeil, G., Ryan, S., Watson, J.P., Wets, R., Woodruff, D.: Obtaining lower bounds from the progressive hedging algorithm for stochastic mixed-integer programming. Math. Program. 157(1), 47–67 (2016)
13. Goujard, G., Watson, J.P., Woodruff, D.L.: Mape_maker: a scenario creator. Energy Syst. to appear (2020)
14. Gurobi Optimization, LLC: Gurobi optimizer reference manual (2020). http://www.gurobi.com
15. Hart, W., Watson, J., Woodruff, D.: Python optimization modeling objects (Pyomo). Math. Program. Comput. 3 (2011)
16. Infanger, G., Morton, D.P.: Cut sharing for multistage stochastic linear programs with interstage dependency. Math. Program. 75, 241–256 (1996)
17. Jorjani, S., Scott, C., Woodruff, D.: Selection of an optimal subset of sizes. Int. J. Prod. Res. 37(16), 3697–3710 (1999)
18. Kim, K.: DSPopt.jl. https://github.com/kibaekkim/DSPopt.jl (2020)
19. King, A.J., Wallace, S.W.: Modeling with Stochastic Programming. Springer, Cham (2012)
20. Klingman, D., Napier, A., Stutz, J.: NETGEN: a program for generating large scale capacitated assignment, transportation, and minimum cost flow network problems. Manage. Sci. 20(5), 814–821 (1974)
21. Knueven, B., Ostrowski, J., Watson, J.P.: On mixed-integer programming formulations for the unit commitment problem. INFORMS J. Comput. 32(4), 857–876 (2020)
22. Ding, L., Ahmed, S., Shapiro, A.: A Python package for multi-stage stochastic programming. Tech. rep., Optimization Online (2019)
23. Märkert, A., Gollmer, R.: User’s guide to ddsip-ac package for the dual decomposition of two-stage stochastic programs with mixed-integer recourse. Department of Mathematics, University of Duisburg-Essen, Tech. rep. (2008)
24. Mitra, S., Garcia-Herreros, P., Grossmann, I.E.: A cross-decomposition scheme with integrated primal-dual multi-cuts for two-stage stochastic programming investment planning problems. Math. Program. 157, 95–119 (2016)
25. Palani, A.M., Wu, H., Morcos, M.M.: A Frank-Wolfe progressive hedging algorithm for improved lower bounds in stochastic SCUC. IEEE Access 7, 99398–99406 (2019)
26. Rockafellar, R.T., Wets, R.J.B.: Scenarios and policy aggregation in optimization under uncertainty. Math. Oper. Res. 16(1), 119–147 (1991)
27. Schultz, R., Tiedemann, S.: Conditional value-at-risk in stochastic programs with mixed-integer recourse. Math. Program. 105(2–3), 365–386 (2005)
28. Van Slyke, R.M., Wets, R.: L-shaped linear programs with applications to optimal control and stochastic programming. SIAM J. Appl. Math. 17(4), 638–663 (1969)
29. Watson, J., Woodruff, D.: Progressive hedging innovations for a class of stochastic mixed-integer resource allocation problems. CMS 8, 355–370 (2011)
30. Watson, J.P., Woodruff, D., Hart, W.: PySP: modeling and solving stochastic programs in Python. Math. Program. Comput. 3, 219–260 (2011)
31. Woodruff, D.L., Knight, B.C., Chen, X., Cazaux, S.: aircond: An example for optimization under uncertainty. https://github.com/DLWoodruff/aircond (2022)


Acknowledgements

The authors would like to thank Dr. Mehdi Hemmati for providing the network design instances used in the paper and codebase. This work was authored in part by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided in part by the U.S. Department of Energy Advanced Research Projects Agency - Energy. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes. A portion of this research was performed using computational resources sponsored by the Department of Energy’s Office of Energy Efficiency and Renewable Energy and located at the National Renewable Energy Laboratory. This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. 
The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes. This work was performed in part under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government.

Funding

J.P. Watson was supported by the US Department of Energy’s Advanced Grid Modeling (AGM) program, part of the Office of Electricity. D. Mildebrath was supported by the United States Department of Defense through the National Defense Science and Engineering Graduate Fellowship (NDSEG) Program. C. Muir’s work was supported by a U.S. NSF Graduate Research Fellowship.

Author information

Authors and Affiliations

Computational Science Center, National Renewable Energy Laboratory, Golden, CO, 80401, USA
Bernard Knueven
Department of Computational and Applied Mathematics, Rice University, Houston, TX, 77005, USA
David Mildebrath
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
Christopher Muir
Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, 87185, USA
John D. Siirola
Center for Applied Scientific Computing and Global Security Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
Jean-Paul Watson
Graduate School of Management, UC Davis, Davis, CA, 95616, USA
David L. Woodruff

Corresponding author

Correspondence to David L. Woodruff.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest.

Code availability

The software is available at https://github.com/Pyomo/mpi-sppy.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Details on the NetDes instances

We use the following network design problem (termed “NetDes”) for computational testing. Given a directed graph $G=(V,E)$, the decision maker selects a subset of the arcs $E$ to build. Once the arcs have been built, two sets of uncertain parameters are realized: a demand value for each node (which may be positive or negative, i.e., a demand or a supply), and a capacity for each arc. The decision maker then selects the amount of flow to send along each arc built in the first stage so as to satisfy the realized demand at minimum expected cost while respecting the capacity limit on each arc. This problem can be formulated as the following mixed-integer program:

$$\begin{aligned}
\min \;& \sum_{e\in E} c_e x_e + \sum_{\xi\in\varXi} \Pr(\xi) \sum_{e\in E} f_e(\xi)\, y_e(\xi) && && \text{(11a)}\\
\mathrm{s.t.}\;& \sum_{e\in N^+_v} y_e(\xi) - \sum_{e\in N^-_v} y_e(\xi) = d_v(\xi) && \forall\; v\in V,\ \xi\in\varXi && \text{(11b)}\\
& y_e(\xi) \le u_e(\xi)\, x_e && \forall\; e\in E,\ \xi\in\varXi && \text{(11c)}\\
& x_e \in \{0,1\} && \forall\; e\in E && \text{(11d)}\\
& y_e(\xi) \ge 0 && \forall\; e\in E,\ \xi\in\varXi && \text{(11e)}
\end{aligned}$$

Here, $x_e=1$ if and only if arc $e\in {E}$ is built in the first stage, and $y_e(\xi )$ represents the amount of flow sent on arc $e\in {E}$ under scenario $\xi \in \varXi $, which occurs with probability $\Pr (\xi )$. The parameter $c_e$ is the cost to build arc $e\in {E}$, and $f_e(\xi )$ is the cost to send one unit of flow along arc $e\in {E}$ under scenario $\xi \in \varXi $. The set $N^+_v$ (resp. $N^-_v$) is the set of arcs entering (resp. exiting) node $v\in {V}$. The constraints (11b) enforce conservation of flow for each node $v\in {V}$ and scenario $\xi \in \varXi $, where $d_v(\xi )$ is the “demand” (or, possibly, supply) of node $v$ under scenario $\xi \in \varXi $. The constraints (11c) ensure that no flow is sent on an arc that was not built, and that the flow on each built arc does not exceed the capacity $u_e(\xi )$.
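To make the roles of (11a)–(11e) concrete, the following minimal sketch evaluates the objective and checks the constraints for a candidate solution on a three-node toy instance. The instance data, the function names, and the dictionary layout are all hypothetical and chosen only for illustration; this is not the model code used in mpi-sppy, and it checks a given solution rather than optimizing.

```python
# Hypothetical toy instance: three nodes and three candidate arcs (tail, head).
NODES = ["a", "b", "c"]
ARCS = [("a", "b"), ("b", "c"), ("a", "c")]
PROB = {"s1": 0.5, "s2": 0.5}                                   # Pr(xi)
BUILD_COST = {("a", "b"): 3.0, ("b", "c"): 3.0, ("a", "c"): 5.0}  # c_e
FLOW_COST = {s: {e: 1.0 for e in ARCS} for s in PROB}           # f_e(xi)
CAPACITY = {s: {e: 3.0 for e in ARCS} for s in PROB}            # u_e(xi)
DEMAND = {                                                      # d_v(xi)
    "s1": {"a": -2.0, "b": 0.0, "c": 2.0},
    "s2": {"a": -1.0, "b": 0.0, "c": 1.0},
}


def netdes_cost(build_cost, flow_cost, prob, x, y):
    """Objective (11a): build cost plus expected second-stage flow cost."""
    first_stage = sum(build_cost[e] * x[e] for e in x)
    second_stage = sum(
        prob[s] * sum(flow_cost[s][e] * y[s][e] for e in x) for s in prob
    )
    return first_stage + second_stage


def netdes_feasible(nodes, demand, capacity, x, y, tol=1e-9):
    """Check constraints (11b)-(11e) in every scenario."""
    if any(x[e] not in (0, 1) for e in x):              # (11d)
        return False
    for s in y:
        for e in x:
            if y[s][e] < -tol:                          # (11e)
                return False
            if y[s][e] > capacity[s][e] * x[e] + tol:   # (11c)
                return False
        for v in nodes:                                 # (11b)
            inflow = sum(y[s][e] for e in x if e[1] == v)   # arcs in N^+_v
            outflow = sum(y[s][e] for e in x if e[0] == v)  # arcs in N^-_v
            if abs(inflow - outflow - demand[s][v]) > tol:
                return False
    return True


# Candidate solution: build only the direct arc a -> c and route all flow on it.
X = {("a", "b"): 0, ("b", "c"): 0, ("a", "c"): 1}
Y = {
    "s1": {("a", "b"): 0.0, ("b", "c"): 0.0, ("a", "c"): 2.0},
    "s2": {("a", "b"): 0.0, ("b", "c"): 0.0, ("a", "c"): 1.0},
}
```

Calling `netdes_feasible(NODES, DEMAND, CAPACITY, X, Y)` and `netdes_cost(BUILD_COST, FLOW_COST, PROB, X, Y)` then checks and prices the candidate; setting every entry of `X` to zero while keeping `Y` violates (11c).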

The network design instances used in this paper were generated using the NETGEN algorithm [20]. All instances are feasible under all scenarios; that is, there always exists a subset of arcs which may be selected in the first stage that allows for a flow which satisfies the demand in all scenarios $\xi \in \varXi $.

B Scalability and overhead for small examples

To study the use of the library on shared-memory computers, we use the farmer instance from [3], modified using two instance creation parameters, cropsmult and numscens. The original problem has three crops and three scenarios. The scalable instances have cropsmult sets of the original three crops, with the same non-stochastic characteristics as in the original problem. Scenarios are also created in groups of three, with a uniformly distributed pseudo-random number added to the values for the original three scenarios. The problems have no practical interest, but they are easy to describe, scalable in both the number of scenarios and the size of the scenario sub-problems, and non-trivial to solve.
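The scaling scheme can be sketched as follows. The base yields are the three scenarios of the textbook farmer problem in [3]; everything else is an assumption made for illustration: the function name, the crop naming, the `jitter` width, the choice to leave the first group of three unperturbed, and the use of a single offset per scenario. The actual generator in mpi-sppy may differ on each of these points.

```python
import random

# Yields (tons per acre) of the three original farmer scenarios from [3].
BASE_YIELDS = {
    "below": {"WHEAT": 2.0, "CORN": 2.4, "SUGAR_BEETS": 16.0},
    "average": {"WHEAT": 2.5, "CORN": 3.0, "SUGAR_BEETS": 20.0},
    "above": {"WHEAT": 3.0, "CORN": 3.6, "SUGAR_BEETS": 24.0},
}
BASE_NAMES = ("below", "average", "above")


def scalable_farmer_yields(numscens, cropsmult, jitter=1.0, seed=0):
    """Return one yield dictionary per scenario for a scaled farmer instance.

    jitter and seed are hypothetical parameters, not taken from the paper.
    """
    if numscens % 3 != 0:
        raise ValueError("scenarios are created in groups of three")
    scenarios = []
    for snum in range(numscens):
        group, base = divmod(snum, 3)
        rng = random.Random(seed + snum)  # reproducible stream per scenario
        # Assumption: the first group reproduces the original scenarios and
        # later groups add one uniformly distributed offset per scenario.
        offset = 0.0 if group == 0 else rng.uniform(0.0, jitter)
        yields = {}
        for copy in range(cropsmult):  # cropsmult copies of the three crops
            for crop, value in BASE_YIELDS[BASE_NAMES[base]].items():
                yields[f"{crop}{copy}"] = value + offset
        scenarios.append(yields)
    return scenarios
```

Each scenario then carries 3 × cropsmult yield values, so the sub-problem size grows with cropsmult while the number of sub-problems grows with numscens.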

Table 5 gives results for $cropsmult=100$ and $numscens=2048$. Rather than solving individual scenarios as sub-problems, the hub PH algorithm solves bundles of 32 scenarios (see, e.g., [7, 29] for more discussion of bundling with PH). The spokes are Lagrangian and xhat-shuffle-looper. We limited the Gurobi solver to one thread on each rank because we are not interested in solving the problem, just in looking at scalability. The PH algorithm is allowed to run for 10 and 100 iterations. We conducted our experiments on a workstation with 48 dual-threaded Intel Xeon CPUs operating at 2.1 GHz and 1 TB of RAM. The column "Hub Ranks" gives the number of ranks assigned to the hub, so the total number of ranks is three times that because there are two spokes.
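The bookkeeping in this setup is simple arithmetic, sketched below. The function name and the returned keys are hypothetical; the only facts used are the ones stated above, namely that scenarios are grouped into fixed-size bundles and that each spoke uses as many ranks as the hub.

```python
import math


def bundle_layout(numscens, scens_per_bundle, hub_ranks, num_spokes=2):
    """Count bundles and ranks for a hub with num_spokes equally sized spokes."""
    if numscens % scens_per_bundle != 0:
        raise ValueError("numscens must be a multiple of the bundle size")
    num_bundles = numscens // scens_per_bundle
    return {
        "bundles": num_bundles,
        # Largest number of bundle sub-problems any single hub rank must solve.
        "max_bundles_per_rank": math.ceil(num_bundles / hub_ranks),
        # Hub plus spokes, each cylinder using hub_ranks ranks.
        "total_ranks": hub_ranks * (1 + num_spokes),
    }
```

For $numscens=2048$ and bundles of 32 scenarios, this gives 64 bundle sub-problems regardless of how many hub ranks are used.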

Table 5 Fixed number of sub-problems, using bundling and varying the number of ranks for the scalable farmer instance with $cropsmult=100$ and $numscens=2048$

We now look at results of runs on a compute node with 24 dual-threaded Intel CPUs operating at 2.3 GHz. Table 6 shows experiments with $cropsmult=1000$ and the number of processors set to three times the number of scenarios, so that there is one scenario per rank in each of the three cylinders: hub, lower bound, and upper bound. Our interest is not in solving the problem but in checking overhead and scalability on small systems.

Table 6 One sub-problem per stratum (i.e., one hub rank per sub-problem) for the scalable farmer instance with $cropsmult=1000$

The results in Tables 5 and 6 give information about overhead and scalability with respect to the number of sub-problems. There is also a practical question related to scalability with respect to the number of spokes. The effort to exchange data with spokes increases linearly with the number of spokes, so for a large enough number of spokes, the hub would spend most of its time writing to and reading from spoke buffers and only some of its time working. In such cases, it would be necessary to implement a helper for the hub whose job would be to take data from the hub and distribute it to spokes.
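A toy model illustrates the linear-growth argument. It assumes, purely for illustration, that each hub iteration costs a fixed `work_time` plus a fixed `exchange_time_per_spoke` for every spoke; both parameters and both function names are hypothetical and are not measured quantities from the experiments.

```python
def hub_comm_fraction(num_spokes, work_time, exchange_time_per_spoke):
    """Fraction of each hub iteration spent exchanging data with spokes."""
    comm = num_spokes * exchange_time_per_spoke  # linear in the spoke count
    return comm / (work_time + comm)


def spokes_until_comm_dominates(work_time, exchange_time_per_spoke,
                                threshold=0.5):
    """Smallest spoke count whose communication fraction exceeds threshold."""
    if work_time <= 0 or exchange_time_per_spoke <= 0:
        raise ValueError("times must be positive")
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    k = 1
    while hub_comm_fraction(k, work_time, exchange_time_per_spoke) <= threshold:
        k += 1
    return k
```

Under this model the communication fraction approaches one as the number of spokes grows, which is the regime in which a helper for the hub would pay off.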

The current implementation does not have such a helper, and Table 7 illustrates that it is probably not needed for modest numbers of spokes. These experiments were done on a compute node with 24 dual-threaded Intel CPUs operating at 2.3 GHz using scalable farmer with $cropsmult=1000$ and the number of processors set to 1 plus the number of spokes, so there are multiple scenarios per rank. For every experiment we have a PH hub cylinder and a Lagrangian lower bound spoke. We vary the number of xhat-shuffle-looper upper bound spokes. Our interest is not in solving the problem but in checking overhead and scalability on small systems. All experiments ran for 100 PH iterations. We see that increasing the number of spokes from 2 to 8 increases the runtime by less than 10%.

Table 7 Time in seconds for 100 iterations for the scalable farmer problem with $cropsmult=1000$

We note the coefficient of variation in run times for replicated experiments is less than 2%, so there is not much value in replicating these experiments.
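For reference, the coefficient of variation is the sample standard deviation divided by the sample mean. A minimal helper, with a hypothetical function name and using only the Python standard library, could look like this:

```python
import statistics


def coefficient_of_variation(run_times):
    """Sample standard deviation of the run times divided by their mean."""
    if len(run_times) < 2:
        raise ValueError("need at least two replicated run times")
    mean = statistics.mean(run_times)
    if mean <= 0:
        raise ValueError("run times must have a positive mean")
    return statistics.stdev(run_times) / mean
```

A value below 0.02 for a set of replicated wall-clock times corresponds to the "less than 2%" threshold mentioned above.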

C Details concerning aircond experiments

The aircond model is a multi-stage production planning problem with inventory, backorders (with linear and quadratic penalties), and production costs, used for testing and experimentation. Full details are provided at https://github.com/DLWoodruff/aircond. This example makes use of a recently added feature of mpi-sppy referred to as proper bundles. These bundles are created before the cylinder system runs, pickled (i.e., serialized), and stored for use during the execution by the cylinder system. Bundle pickling is parallelized and can use all ranks (e.g., 600 ranks for the 1M-scenario experiments) and takes about 40 s for the experiments reported in this paper.
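The create, pickle, and reload pattern can be sketched with the Python standard library alone. This is not the mpi-sppy implementation: the function names, the file naming, the round-robin assignment of bundles to ranks, and the `build_bundle` callback are all assumptions, and the rank index is passed in explicitly instead of being obtained from MPI.

```python
import os
import pickle


def bundles_for_rank(num_bundles, rank, n_ranks):
    """Bundle indices handled by one rank (assumed round-robin policy)."""
    return list(range(rank, num_bundles, n_ranks))


def pickle_bundles(build_bundle, num_bundles, rank, n_ranks, out_dir):
    """Build and serialize this rank's share of the bundles; return the paths."""
    paths = []
    for b in bundles_for_rank(num_bundles, rank, n_ranks):
        path = os.path.join(out_dir, f"bundle_{b}.pkl")
        with open(path, "wb") as fh:
            # build_bundle is a caller-supplied factory returning any
            # picklable object that represents bundle b.
            pickle.dump(build_bundle(b), fh)
        paths.append(path)
    return paths


def load_bundle(out_dir, b):
    """Reload a stored bundle, e.g. when the cylinder system starts up."""
    with open(os.path.join(out_dir, f"bundle_{b}.pkl"), "rb") as fh:
        return pickle.load(fh)
```

Because each rank writes a disjoint set of files, the pickling step needs no communication between ranks, which is consistent with it being able to use all available ranks.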

In the interest of repeatability, a slightly condensed form of the slurm script for 100k scenarios is given in Fig. 4.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Knueven, B., Mildebrath, D., Muir, C. et al. A parallel hub-and-spoke system for large-scale scenario-based optimization under uncertainty. Math. Prog. Comp. 15, 591–619 (2023). https://doi.org/10.1007/s12532-023-00247-3

Received: 14 November 2020
Accepted: 13 March 2023
Published: 14 August 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s12532-023-00247-3

Mathematics Subject Classification

90C15
