Abstract
The constrained 0-1 quadratic programming (CBQP) problem is an important integer programming problem, and many combinatorial optimization problems can be converted into it. Because CBQP is NP-hard, the solving time and accuracy of traditional optimization algorithms depend strongly on the problem size, and the locally optimal solutions obtained by heuristic algorithms are unstable. Deep learning algorithms have significant advantages in solving such problems. In this paper, we apply two models to the CBQP problem with linear constraints: the graph pointer network (GPN) trained by hierarchical reinforcement learning (HRL), and a multi-head attention-based pointer network trained by Advantage Actor-Critic (A2C). Both improve the solving speed, accuracy, and constraint satisfaction rate on CBQP problems of different scales. In addition, a bidirectional mask mechanism is introduced into the network, which yields a very high constraint satisfaction rate. With both algorithms, we solve the 0-1 knapsack problem (BKP) and the quadratic knapsack problem (QKP), which are equivalent to CBQP problems, and compare the results across different data distributions and problem scales. The experiments show that, whether the objective function is linear or nonlinear and regardless of the data distribution or problem scale, the pointer networks trained by reinforcement learning outperform traditional optimization algorithms in solving time, accuracy, stability, and constraint satisfaction rate, and that this advantage becomes more pronounced as the problem size grows.
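To make the problem setting concrete, the following is a minimal sketch of a quadratic knapsack instance written as a CBQP with a single linear capacity constraint. It is not the paper's method: the abstract does not specify the bidirectional mask, so the sketch uses a simple one-directional capacity mask, and a greedy rule stands in for the learned pointer-network policy. All names (`qkp_value`, `feasible_mask`, `greedy_masked_solve`) and the toy data are hypothetical, and the profit matrix is assumed to be symmetric and nonnegative.

```python
import numpy as np

def qkp_value(x, profits):
    """Objective x^T P x for a binary vector x and a symmetric profit matrix P."""
    x = np.asarray(x, dtype=float)
    return float(x @ profits @ x)

def feasible_mask(selected, weights, capacity):
    """True for items that are not yet selected and still fit in the remaining capacity."""
    remaining = capacity - weights[selected].sum()
    return (~selected) & (weights <= remaining)

def greedy_masked_solve(profits, weights, capacity):
    """Build a solution item by item; the mask keeps every step feasible.

    Assumption: a greedy choice replaces the trained policy of the paper.
    """
    n = len(weights)
    selected = np.zeros(n, dtype=bool)
    while True:
        mask = feasible_mask(selected, weights, capacity)
        if not mask.any():
            break
        # Marginal gain of adding item i (P symmetric):
        # P[i, i] + 2 * sum of P[i, j] over already selected j.
        gain = np.diag(profits) + 2.0 * profits[:, selected].sum(axis=1)
        gain = np.where(mask, gain, -np.inf)  # masked items can never be chosen
        selected[int(np.argmax(gain))] = True
    return selected.astype(int)

# Toy instance (illustrative values only).
P = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 2.0],
              [0.0, 2.0, 4.0]])
w = np.array([2, 3, 4])
x = greedy_masked_solve(P, w, capacity=6)
print(x, qkp_value(x, P), int(w @ x))
```

Because infeasible items are masked out before each selection, every solution the sketch returns satisfies the capacity constraint by construction, which is the general idea behind using masks to raise the constraint satisfaction rate.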















Data availability
The data set generated during the current study is available from the corresponding author on reasonable request.
Acknowledgements
The work described in this paper was supported by the National Science Foundation of China under Grant 61876105.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Gu, S., Zhuang, Y. Method for solving constrained 0-1 quadratic programming problems based on pointer network and reinforcement learning. Neural Comput & Applic 35, 9973–9993 (2023). https://doi.org/10.1007/s00521-022-07604-8
DOI: https://doi.org/10.1007/s00521-022-07604-8