ABSTRACT
Gradient descent methods have long been the de facto standard for training deep neural networks: millions of training samples are fed into models with billions of parameters, which are slowly updated over hundreds of epochs. Recently, it has been shown that large, randomly initialized neural networks contain subnetworks that perform as well as fully trained models. This insight offers a promising avenue for training future neural networks: simply prune weights away from large, random models. However, this selection problem is combinatorially hard, and classical algorithms are not efficient at finding the best subnetwork. In this paper, we explore how quantum algorithms could be formulated and applied to this neuron selection problem. We introduce several methods for local quantum neuron selection that reduce the entanglement complexity that large-scale neuron selection would otherwise require, making the problem more tractable on current quantum hardware.
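The paper's concrete formulation is not reproduced here, but the hardness claim is easy to make concrete. The sketch below is a hypothetical toy example, not the authors' method: it scores every binary neuron mask of a small fixed random layer by exhaustive search. The 2^n masks enumerated in the loop are exactly the combinatorial cost that a quantum formulation (e.g., annealing or QAOA over the same binary selection variables) would aim to sidestep. All names and the toy data (`score_mask`, `W_in`, `w_out`) are assumptions for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: one fixed random layer, and a target function
# the selected subnetwork should approximate. No weights are trained;
# only the binary mask over neurons is searched.
n_hidden = 10                                # number of candidate neurons
X = rng.standard_normal((64, 4))             # inputs
y = np.sin(X.sum(axis=1))                    # target outputs
W_in = rng.standard_normal((4, n_hidden))    # fixed random input weights
w_out = rng.standard_normal(n_hidden)        # fixed random output weights

def score_mask(mask: np.ndarray) -> float:
    """Mean squared error of the subnetwork selected by a binary mask."""
    h = np.tanh(X @ W_in) * mask             # zero out pruned neurons
    return float(np.mean((h @ w_out - y) ** 2))

# Exhaustive search over all 2^n masks: tractable only for tiny n,
# which is the combinatorial wall motivating quantum formulations.
best_mask, best_err = None, np.inf
for bits in itertools.product([0, 1], repeat=n_hidden):
    mask = np.array(bits)
    err = score_mask(mask)
    if err < best_err:
        best_mask, best_err = mask, err

print(f"best mask: {best_mask}, mse: {best_err:.4f}")
```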