Heavy-Tails and Randomized Restarting Beam Search in Goal-Oriented Neural Sequence Decoding

Cohen, Eldan; Beck, J. Christopher

doi:10.1007/978-3-030-78230-6_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12735)

Included in the following conference series:

International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research

Abstract

Recent work has demonstrated that neural sequence models can successfully solve combinatorial search problems such as program synthesis and routing problems. In these scenarios, the beam search algorithm is typically used to produce a set of high-likelihood candidate sequences that are evaluated to determine if they satisfy the goal criteria. If none of the candidates satisfy the criteria, the beam search can be restarted with a larger beam size until a satisfying solution is found. Inspired by work in combinatorial and heuristic search, we investigate whether heavy-tailed behavior can be observed in the search effort distribution of complete beam search in goal-oriented neural sequence decoding. We analyze four goal-oriented decoding tasks and find that the search effort of beam search exhibits fat- and heavy-tailed behavior. Following previous work on heavy-tailed behavior in search, we propose a randomized restarting variant of beam search. We conduct an extensive empirical evaluation, comparing different randomization techniques and restart strategies, and show that the randomized restarting variant solves some of the hardest instances faster and outperforms the baseline.
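The restart loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy model `toy_model`, the goal test `is_goal`, the Gaussian noise used for randomization, and the use of the Luby sequence (Luby, Sinclair and Zuckerman, 1993) to schedule beam sizes are all assumptions made for illustration, and search effort is approximated here by the number of candidates evaluated.

```python
import math
import random
from typing import Callable, Iterable, List, Optional, Tuple

Seq = Tuple[int, ...]
LogProbFn = Callable[[Seq], List[float]]  # prefix -> log-probability of each next token


def beam_search(log_probs: LogProbFn, length: int, beam_size: int,
                noise: float = 0.0, rng: Optional[random.Random] = None) -> List[Seq]:
    """Return up to `beam_size` sequences of `length` tokens, highest score first."""
    rng = rng or random.Random(0)
    beam: List[Tuple[float, Seq]] = [(0.0, ())]
    for _ in range(length):
        expanded = []
        for score, prefix in beam:
            for token, lp in enumerate(log_probs(prefix)):
                # Assumed randomization: Gaussian noise added to each step score.
                jitter = rng.gauss(0.0, noise) if noise > 0 else 0.0
                expanded.append((score + lp + jitter, prefix + (token,)))
        expanded.sort(key=lambda item: item[0], reverse=True)
        beam = expanded[:beam_size]
    return [seq for _, seq in beam]


def restarting_search(log_probs: LogProbFn, length: int,
                      is_goal: Callable[[Seq], bool], beam_sizes: Iterable[int],
                      noise: float = 0.0, seed: int = 0) -> Tuple[Optional[Seq], int]:
    """Run beam search once per beam size; stop at the first goal-satisfying candidate.

    Returns (sequence or None, number of candidates evaluated).
    """
    rng = random.Random(seed)
    evaluated = 0
    for beam_size in beam_sizes:
        for seq in beam_search(log_probs, length, beam_size, noise, rng):
            evaluated += 1
            if is_goal(seq):
                return seq, evaluated
    return None, evaluated


def luby(i: int) -> int:
    """i-th term (1-indexed) of the Luby sequence 1, 1, 2, 1, 1, 2, 4, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)


def toy_model(prefix: Seq) -> List[float]:
    """Hypothetical two-token model that always prefers token 0."""
    return [math.log(0.7), math.log(0.3)]


def is_goal(seq: Seq) -> bool:
    """Hypothetical goal criterion: the sequence contains at least two 1s."""
    return sum(seq) >= 2


# Deterministic baseline: restart with a doubled beam size after each failure.
found, effort = restarting_search(toy_model, 3, is_goal, [1, 2, 4, 8])
print("deterministic:", found, effort)

# Randomized variant: noisy scores, beam sizes scheduled by the Luby sequence.
luby_prefix = [luby(i) for i in range(1, 8)]
rand_found, rand_effort = restarting_search(
    toy_model, 3, is_goal, [2 * b for b in luby_prefix], noise=1.0, seed=1)
print("randomized:", rand_found, rand_effort)
```

On this toy model the deterministic schedule reaches a goal sequence only at beam size 8, because every goal sequence is less likely than the four sequences that fill the smaller beams. The randomized run depends on the seed and noise level, so its outcome is printed rather than stated.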

Notes

1. Obtained from github.com/wouterkool/attention-learn-to-route.
2. This notion of constrainedness matches the notion of resource-constrainedness previously used to study planning in resource-constrained environments [29].
3. Obtained from github.com/Hippogriff/CSGNet.
4. Obtained from github.com/nyu-dl/conditional-molecular-design-ssvae.
5. All appendices appear in tidel.mie.utoronto.ca/pubs/rr-beam-appendix.pdf.
6. Note that we are not aware of any direct connection between noise injection in training to increase robustness and our use of noise injection in testing to introduce randomness in the decoding process. However, it might be interesting to consider whether there is some underlying connection.

References

1. Applegate, D., Bixby, R., Chvátal, V., Cook, W.: Concorde TSP solver (2006)
2. Balog, M., Gaunt, A., Brockschmidt, M., Nowozin, S., Tarlow, D.: DeepCoder: learning to write programs. In: International Conference on Learning Representations (ICLR) (2017)
3. Bickerton, G.R., Paolini, G.V., Besnard, J., Muresan, S., Hopkins, A.L.: Quantifying the chemical beauty of drugs. Nat. Chem. 4(2), 90 (2012)
4. Cho, K.: Noisy parallel approximate decoding for conditional recurrent language model. arXiv preprint arXiv:1605.03835 (2016)
5. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: EMNLP (2014)
6. Chopra, S., Auli, M., Rush, A.M.: Abstractive sentence summarization with attentive recurrent neural networks. In: North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 93–98 (2016)
7. Cohen, E., Beck, J.C.: Fat- and heavy-tailed behavior in satisficing planning. In: AAAI Conference on Artificial Intelligence (AAAI), pp. 6136–6143 (2018)
8. Cohen, E., Beck, J.C.: Local minima, heavy tails, and search effort for GBFS. In: International Joint Conferences on Artificial Intelligence (IJCAI), pp. 4708–4714 (2018)
9. Deudon, M., Cournut, P., Lacoste, A., Adulyasak, Y., Rousseau, L.-M.: Learning heuristics for the TSP by policy gradient. In: van Hoeve, W.-J. (ed.) CPAIOR 2018. LNCS, vol. 10848, pp. 170–181. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93031-2_12
10. Gehring, J., Auli, M., Grangier, D., Yarats, D., Dauphin, Y.N.: Convolutional sequence to sequence learning. In: International Conference on Machine Learning (ICML), pp. 1243–1252 (2017)
11. Gomes, C.: Randomized backtrack search. In: Milano, M. (ed.) Constraint and Integer Programming: Toward a Unified Methodology, vol. 27, pp. 233–291. Springer, Heidelberg (2003). https://doi.org/10.1007/978-1-4419-8917-8_8
12. Gomes, C.P., Fernández, C., Selman, B., Bessière, C.: Statistical regimes across constrainedness regions. Constraints 10(4), 317–337 (2005). https://doi.org/10.1007/s10601-005-2807-z
13. Gomes, C.P., Selman, B., Crato, N.: Heavy-tailed distributions in combinatorial search. In: Smolka, G. (ed.) CP 1997. LNCS, vol. 1330, pp. 121–135. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0017434
14. Gomes, C.P., Selman, B., Crato, N., Kautz, H.: Heavy-tailed phenomena in satisfiability and constraint satisfaction problems. J. Autom. Reason. 24(1), 67–100 (2000). https://doi.org/10.1023/A:1006314320276
15. Gomes, C.P., Selman, B., Kautz, H., et al.: Boosting combinatorial search through randomization. In: National Conference on Artificial Intelligence (AAAI), vol. 98, pp. 431–437 (1998)
16. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016). http://www.deeplearningbook.org
17. Helsgaun, K.: An extension of the Lin-Kernighan-Helsgaun TSP solver for constrained traveling salesman and vehicle routing problems. Roskilde University, Roskilde (2017)
18. Jin, W., Barzilay, R., Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation. In: International Conference on Machine Learning (ICML), pp. 2323–2332 (2018)
19. Jin, W., Yang, K., Barzilay, R., Jaakkola, T.: Learning multimodal graph-to-graph translation for molecule optimization. In: International Conference on Learning Representations (ICLR) (2018)
20. Kang, S., Cho, K.: Conditional molecular design with deep generative models. J. Chem. Inf. Model. 59(1), 43–52 (2018)
21. Kautz, H., Horvitz, E., Ruan, Y., Gomes, C., Selman, B.: Dynamic restart policies. In: National Conference on Artificial Intelligence (AAAI), pp. 674–681 (2002)
22. Khalil, E., Dai, H., Zhang, Y., Dilkina, B., Song, L.: Learning combinatorial optimization algorithms over graphs. In: Conference on Neural Information Processing Systems (NeurIPS), pp. 6348–6358 (2017)
23. Kool, W., van Hoof, H., Welling, M.: Attention, learn to solve routing problems! In: International Conference on Learning Representations (ICLR) (2019)
24. Kool, W., van Hoof, H., Welling, M.: Stochastic beams and where to find them: the Gumbel-top-k trick for sampling sequences without replacement. In: International Conference on Machine Learning (ICML), pp. 3499–3508 (2019)
25. Lample, G., Charton, F.: Deep learning for symbolic mathematics. In: International Conference on Learning Representations (2019)
26. Landrum, G.: RDKit: open-source cheminformatics. http://www.rdkit.org
27. Liu, Y., Wu, Z., Ritchie, D., Freeman, W.T., Tenenbaum, J.B., Wu, J.: Learning to describe scenes with programs. In: International Conference on Learning Representations (ICLR) (2018)
28. Luby, M., Sinclair, A., Zuckerman, D.: Optimal speedup of Las Vegas algorithms. Inf. Process. Lett. 47(4), 173–180 (1993)
29. Nakhost, H., Hoffmann, J., Müller, M.: Resource-constrained planning: a Monte Carlo random walk approach. In: International Conference on Automated Planning and Scheduling (ICAPS) (2012)
30. Nazari, M., Oroojlooy, A., Snyder, L., Takáč, M.: Reinforcement learning for solving the vehicle routing problem. In: Conference on Neural Information Processing Systems (NeurIPS), pp. 9839–9849 (2018)
31. Poole, B., Sohl-Dickstein, J., Ganguli, S.: Analyzing noise in autoencoders and deep networks. arXiv preprint arXiv:1406.1831 (2014)
32. Resnick, S.I.: Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, Heidelberg (2007). https://doi.org/10.1007/978-0-387-45024-7
33. Sharma, G., Goyal, R., Liu, D., Kalogerakis, E., Maji, S.: CSGNet: neural shape parser for constructive solid geometry. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5515–5523 (2018)
34. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
35. Sterling, T., Irwin, J.J.: ZINC 15-ligand discovery for everyone. J. Chem. Inf. Model. 55(11), 2324–2337 (2015)
36. Tian, Y., et al.: Learning to infer and execute 3D shape programs. In: International Conference on Learning Representations (ICLR) (2019)
37. Vinyals, O., Toshev, A., Bengio, S., Erhan, D.: Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 652–663 (2017)
38. Walsh, T.: Search in a small world. In: International Joint Conference on Artificial Intelligence (IJCAI), pp. 1172–1177 (1999)
39. Weininger, D.: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28(1), 31–36 (1988)
40. Wildman, S.A., Crippen, G.M.: Prediction of physicochemical parameters by atomic contributions. J. Chem. Inf. Comput. Sci. 39(5), 868–873 (1999)
41. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8(3–4), 229–256 (1992). https://doi.org/10.1007/BF00992696
42. Zhang, W.: Complete anytime beam search. In: National Conference on Artificial Intelligence (AAAI), pp. 425–430 (1998)
43. Zohar, A., Wolf, L.: Automatic program synthesis of long programs with a learned garbage collector. In: Conference on Neural Information Processing Systems (NeurIPS), pp. 2094–2103 (2018)

Acknowledgements

We thank the anonymous reviewers for their valuable feedback. This work was supported by the Natural Sciences and Engineering Research Council of Canada.

Author information

Authors and Affiliations

Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada
Eldan Cohen & J. Christopher Beck

Corresponding author

Correspondence to Eldan Cohen.

Editor information

Editors and Affiliations

Monash University, Melbourne, VIC, Australia
Peter J. Stuckey

Cite this paper

Cohen, E., Beck, J.C. (2021). Heavy-Tails and Randomized Restarting Beam Search in Goal-Oriented Neural Sequence Decoding. In: Stuckey, P.J. (ed.) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2021. Lecture Notes in Computer Science, vol 12735. Springer, Cham. https://doi.org/10.1007/978-3-030-78230-6_8

DOI: https://doi.org/10.1007/978-3-030-78230-6_8
Published: 17 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78229-0
Online ISBN: 978-3-030-78230-6
eBook Packages: Computer Science, Computer Science (R0)
