Method for solving constrained 0-1 quadratic programming problems based on pointer network and reinforcement learning

Gu, Shenshen; Zhuang, Yuxi

doi:10.1007/s00521-022-07604-8

S.I.: Interpretation of Deep Learning
Published: 30 July 2022

Neural Computing and Applications, volume 35, pages 9973–9993 (2023)

Abstract

The constrained 0-1 quadratic programming problem (CBQP) is an important class of integer programming problems, and many combinatorial optimization problems can be converted into CBQP problems. Because binary quadratic programming (BQP) is NP-hard, the solving time and accuracy of traditional optimization algorithms depend strongly on problem size, and the local optima returned by heuristic algorithms are unstable. Deep learning algorithms offer considerable advantages for such problems. In this paper, we apply two models to the CBQP problem with linear constraints: a graph pointer network (GPN) trained by hierarchical reinforcement learning (HRL), and a multi-head attention-based pointer network trained by Advantage Actor-Critic (A2C). Both improve the solving speed, accuracy and constraint satisfaction rate on CBQP problems of different scales. In addition, a bidirectional mask mechanism is introduced into the network so that the solutions achieve a very high constraint satisfaction rate. With both algorithms, we solve the 0-1 knapsack problem (BKP) and the quadratic knapsack problem (QKP), which are equivalent to the CBQP problem, and compare results on CBQP instances with different data distributions and scales. The experiments show that, whether the objective function is linear or nonlinear, and regardless of data distribution or problem scale, the pointer networks trained by reinforcement learning outperform traditional optimization algorithms in solving time, accuracy, stability and constraint satisfaction rate, and this advantage grows as the problem size increases.
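To make the problem class concrete, the following is a minimal sketch of a quadratic knapsack instance written as a CBQP: maximize x^T Q x subject to a single linear constraint w^T x <= c with x in {0, 1}^n. It is not the authors' method. The pointer networks, the reinforcement learning training and the bidirectional mask mechanism are not reproduced here. All function names, the instance data and the simple one-directional feasibility mask are assumptions introduced for illustration only.

```python
from itertools import product


def qkp_value(Q, x):
    """Objective x^T Q x for a 0-1 vector x (Q is a list of rows)."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def is_feasible(weights, capacity, x):
    """Linear constraint: sum_i w_i * x_i <= capacity."""
    return sum(w * xi for w, xi in zip(weights, x)) <= capacity


def brute_force(Q, weights, capacity):
    """Enumerate all 2^n assignments; only practical for very small n."""
    best_x, best_val = None, float("-inf")
    for x in product((0, 1), repeat=len(weights)):
        if is_feasible(weights, capacity, x):
            val = qkp_value(Q, x)
            if val > best_val:
                best_x, best_val = x, val
    return best_x, best_val


def feasibility_mask(weights, remaining, chosen):
    """True for items that are unselected and still fit (illustrative mask)."""
    return [i not in chosen and w <= remaining for i, w in enumerate(weights)]


def greedy_with_mask(Q, weights, capacity):
    """Pick items one at a time, choosing only among unmasked items.

    A hand-written greedy rule stands in for a learned policy here.
    """
    n = len(weights)
    chosen, remaining = [], capacity

    def gain(i):
        # Marginal change in x^T Q x when item i joins the chosen set.
        return Q[i][i] + sum(Q[i][j] + Q[j][i] for j in chosen)

    while True:
        mask = feasibility_mask(weights, remaining, set(chosen))
        candidates = [i for i in range(n) if mask[i]]
        if not candidates:
            break
        best = max(candidates, key=gain)
        if gain(best) <= 0:
            break
        chosen.append(best)
        remaining -= weights[best]
    x = [1 if i in chosen else 0 for i in range(n)]
    return x, qkp_value(Q, x)


# Hypothetical 4-item instance: diagonal entries are item profits,
# off-diagonal entries are pairwise profits (upper triangular).
Q = [
    [10, 2, 0, 5],
    [0, 6, 4, 0],
    [0, 0, 8, 1],
    [0, 0, 0, 7],
]
weights = [4, 3, 5, 2]
capacity = 9

print(brute_force(Q, weights, capacity))
print(greedy_with_mask(Q, weights, capacity))
```

Because the mask removes every item that would violate the capacity before a choice is made, the sequential procedure can only return feasible solutions, which is the general idea behind masking in constructive solvers. On this small instance the greedy result happens to match the exhaustive search, but that is not guaranteed in general, and exhaustive search becomes impractical as n grows.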

Data availability

The data set generated during the current study is available from the corresponding author on reasonable request.

Acknowledgements

The work described in this paper was supported by the National Science Foundation of China under Grant 61876105.

Author information

Authors and Affiliations

School of Mechatronic Engineering and Automation, Shanghai University, 99 Shangda Road, Shanghai, 200444, China
Shenshen Gu & Yuxi Zhuang

Corresponding author

Correspondence to Shenshen Gu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gu, S., Zhuang, Y. Method for solving constrained 0-1 quadratic programming problems based on pointer network and reinforcement learning. Neural Comput & Applic 35, 9973–9993 (2023). https://doi.org/10.1007/s00521-022-07604-8

Received: 06 March 2022
Accepted: 01 July 2022
Published: 30 July 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s00521-022-07604-8

Part of a collection:

S.I.: Interpretation of Deep Learning: Prediction, Representation, Modeling and Utilization (vol 35, issue 14)
