Abstract
We investigate the learnability of hypothesis sets for regression and binary classification defined by quantum circuits. The analysis combines results from quantum computing (the Solovay–Kitaev theorem) with tools from statistical learning theory (covering numbers and Rademacher complexity). The resulting learning bounds depend polynomially on the parameters that define the set of circuits, namely the number of qubits and the number of 1- and 2-qubit gates used to implement them. The setting is quite general: no realisability assumptions are made, and any 1- and 2-qubit gates are allowed. Finally, we compare these bounds with others from the literature and discuss their implications for classification and regression on quantum data.
References
Aaronson, S.: The learnability of quantum states. Proc. R. Soc. A Math. Phys. Eng. Sci. 463, 3089–3114 (2007)
Aaronson, S., Chen, X., Hazan, E., Kale, S.: Online learning of quantum states. In: Proceedings of the 32nd international conference on Neural Information Processing Systems (NIPS’18), pp. 8976–8986 (2018)
Aaronson, S.: Shadow tomography of quantum states. SIAM J. Comput. 49(5), STOC18-368–STOC18-394 (2020)
Alvarez-Rodriguez, U., Lamata, L., Escandell-Montero, P., Martín-Guerrero, J.D., Solano, E.: Supervised Quantum Learning without Measurements. Sci. Rep. 7, 1–9 (2017)
Ambainis, A., Iwama, K., Kawachi, A., Raymond, R., Yamashita, S.: Improved algorithms for quantum identification of Boolean oracles. Theor. Comput. Sci. 378(1), 41–53 (2007)
Ambainis, A., Iwama, K., Nakanishi, M., Nishimura, H., Raymond, R., Tani, S., Yamashita, S.: Quantum query complexity of almost all functions with fixed on-set size. Comput. Complex. 25, 723–735 (2016)
Anthony, M.M., Bartlett, P.: Learning in Neural Networks: Theoretical Foundations. Cambridge University Press, USA (1999)
Arunachalam, S., de Wolf, R.: Guest column: a survey of quantum learning theory. ACM SIGACT News 48, 41–44 (2017)
Arunachalam, S., de Wolf, R.: Optimal quantum sample complexity of learning algorithms. J. Mach. Learn. Res. 19, 1–36 (2018)
Atici, A., Servedio, R.A.: Improved bounds on quantum learning algorithms. Quantum Inf. Process. 4, 355–386 (2005)
Atici, A., Servedio, R.A.: Quantum algorithms for learning and testing juntas. Quantum Inf. Process. 6, 323–348 (2007)
Babbush, R., Love, P., Aspuru-Guzik, A.: Adiabatic quantum simulation of quantum chemistry. Sci. Rep. 4, 6603 (2014). https://doi.org/10.1038/srep06603
Banchi, L., Pereira, J., Pirandola, S.: Generalization in quantum machine learning: a quantum information perspective, arXiv preprint, arXiv:2102.08991 (2021)
Bartlett, P.L., Long, P.M.: Prediction, learning, uniform convergence, and scale-sensitive dimensions. J. Comput. Syst. Sci. 56(2), 174–190 (1998)
Anthony, M., Bartlett, P.L.: Function learning from interpolation. Comb. Probab. Comput. 9(3), 213–225 (2000)
Bartlett, P.L., Mendelson, S.: Rademacher and Gaussian complexities: risk bounds and structural results. J. Mach. Learn. Res. 3, 463–482 (2002)
Belovs, A.: Quantum algorithms for learning symmetric juntas via adversary bound. Comput. Complex. 24(2), 255–293 (2015)
Biamonte, J., et al.: Quantum machine learning. Nature 549, 195–202 (2017)
Bshouty, N.H., Jackson, J.C.: Learning DNF over the uniform distribution using a quantum example oracle. SIAM J. Comput. 28, 1136–1153 (1999)
Bu, K., Koh, D.E., Li, L., Luo, Q., Zhang, Y.: On the statistical complexity of quantum circuits, arXiv preprint, arXiv:2101.06154 (2021)
Bu, K., Koh, D.E., Li, L., Luo, Q., Zhang, Y.: Effects of quantum resources on the statistical complexity of quantum circuits, arXiv preprint, arXiv:2102.03282 (2021)
Bu, K., Koh, D.E., Li, L., Luo, Q., Zhang, Y.: Rademacher complexity of noisy quantum circuits, arXiv preprint, arXiv:2103.03139 (2021)
Boucheron, S., Bousquet, O., Lugosi, G.: Theory of classification: a survey of some recent advances. ESAIM Probab. Stat. 9, 323–375 (2005)
Cao, Y., Romero, J., Aspuru-Guzik, A.: Potential of quantum computing for drug discovery. IBM J. Res. Dev. 62, 6:1-6:20 (2018)
Caro, M.C., Datta, I.: Pseudo-dimension of quantum circuits. Quantum Mach. Intell. 2(2), 1–14 (2020)
Chen, J., Nurdin, H.I.: Learning nonlinear input–output maps with dissipative quantum systems. Quantum Inf. Process. 18, 1–36 (2019)
Cheng, H.-C., Hsieh, M.-H., Yeh, P.-C.: The learnability of unknown quantum measurements. Quantum Inf. Process. 16(7–8), 615–656 (2016)
Chung, K.-M., Lin, H.-H., Luo, Q., Zhang, Y.: Sample efficient algorithms for learning quantum channels in PAC model and the approximate state discrimination problem, arXiv preprint, arXiv:1810.10938 (2018)
Dawson, C.M., Nielsen, M.A.: The Solovay-Kitaev algorithm. Quantum Inf. Comput. 6, 81–95 (2005)
Denil, M., De Freitas, N.: Toward the implementation of a quantum RBM. In: Neural Information Processing Systems (NIPS) Conf. on Deep Learning and Unsupervised Feature Learning Workshop, vol. 5. (2011). https://ora.ox.ac.uk/objects/uuid:ea79d085-6f08-4341-9af1-5a3972542bfa
Dudley, R.M.: Universal Donsker classes and metric entropy. Ann. Probab. 15, 1306–1326 (1987)
Dumoulin, V., Goodfellow, I.J., Courville, A., Bengio, Y.: On the challenges of physical implementations of RBMs, In: Proceedings of the 28th AAAI Conference on Artificial Intelligence (2014)
Freeman, A.J.: Materials by design and the exciting role of quantum computation/simulation. J. Comput. Appl. Math. 149, 27–56 (2002)
Giannozzi, P., et al.: QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21(39), 395502 (2009). https://iopscience.iop.org/article/10.1088/0953-8984/21/39/395502/meta
Goldberg, P.W., Jerrum, M.R.: Bounding the Vapnik-Chervonenkis dimension of concept classes parameterized by real numbers. Mach. Learn. 18(2–3), 131–148 (1995)
Grilo, A.B., Kerenidis, I., Zijlstra, T.: Learning-with-errors problem is easy with quantum samples. Phys. Rev. A 99(3), 032314 (2019)
Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, Cambridge and New York (2012). https://dl.acm.org/doi/book/10.5555/2422911
Kiani, B.T., Lloyd, S., Maity, R.: Learning Unitaries by Gradient Descent, arXiv preprint, arXiv:2001.11897 (2020)
Kitaev, A.Y.: Quantum computations: algorithms and error correction. Russ. Math. Surv. 6, 1191–1249 (1997)
Koltchinskii, V., Panchenko, D.: Rademacher processes and bounding the risk of function learning. In: Giné, E., Mason, D.M., Wellner, J.A. (eds.) High Dimensional Probability II. Progress in Probability, vol. 47, pp. 443–45. Birkhäuser, Boston, MA (2000). https://link.springer.com/chapter/10.1007/978-1-4612-1358-1_29
Koltchinskii, V., Panchenko, D.: Rademacher penalties and structural risk minimization. IEEE Trans. Inf. Theory 47(5), 1902–1914 (2001)
Kothari, R.: An optimal quantum algorithm for the oracle identification problem, In: 31st international Symposium on Theoretical Aspects of Computer Science (STACS 2014), pp. 482–493 (2014)
Lust-Piquard, F., Pisier, G.: Non Commutative Khintchine and Paley inequalities. Arkiv for Matematik 29(1–2), 241–260 (1991)
Ma, H., Govoni, M., Galli, G.: Quantum simulations of materials on near-term quantum computers. NPJ Comput. Mater. 6(1), 1–8 (2020)
McArdle, S., Endo, S., Aspuru-Guzik, A., Benjamin, S.C., Yuan, X.: Quantum computational chemistry. Rev. Modern Phys. 92(1), 015003 (2020)
Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning, 2nd edn. MIT Press, Cambridge (2018)
Neyshabur, B., Tomioka, R., Srebro, N.: Norm-based capacity control in neural networks, In: Proceedings of the 28th conference on learning theory, PMLR, vol. 40, pp. 1376-1401 (2015)
Nielsen, M.A., Chuang, I.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge and New York (2010)
Preskill, J.: Lecture Notes for Physics 229: Quantum Information and Computation. CreateSpace Independent Publishing Platform, Scotts Valley (2015)
Rebentrost, P., Mohseni, M., Lloyd, S.: Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 130503 (2014)
Rocchetto, A.: Stabiliser states are efficiently PAC-learnable. Quant. Info. Comput. 18(7–8), 541–552 (2018). https://dl.acm.org/doi/abs/10.5555/3370256.3370257
Rocchetto, A., Aaronson, S., Severini, S., Carvacho, G., Poderini, D., Agresti, I., Bentivegna, M., Sciarrino, F.: Experimental learning of quantum states. Sci. Adv. 5(3), eaau1946 (2019)
Servedio, R.A., Gortler, S.J.: Equivalences and separations between quantum and classical learnability. SIAM J. Comput. 33(5), 1067–1092 (2004)
Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning. Cambridge University Press, Cambridge and New York (2014)
Sudakov, V.N.: Gaussian random processes and measures of solid angles in Hilbert space. Doklady Akademii Nauk SSSR 197, 412–415 (1971). (Russian)
Warren, H.E.: Lower bounds for approximation by nonlinear manifolds. Trans. Am. Math. Soc. 133, 167–178 (1968)
Zhang, C.: An improved lower bound on query complexity for quantum PAC learning. Inf. Process. Lett. 111, 40–45 (2010)
Zhao, Z., Fitzsimons, J.K., Fitzsimons, J.F.: Quantum-assisted Gaussian process regression. Phys. Rev. A 99(5), 052331 (2019)
Appendices
Additional proofs
In this appendix, we present the proofs of the theorems from Sect. 4 which were not proved there. For convenience, the theorems are restated.
Theorem 5
Let \(\mathcal {H}_{d,\gamma }\) be the hypothesis set defined by Eq. (3) and S a sample of size n. Then, we have
where \(3 < c \le 4\) is the constant from the Solovay–Kitaev theorem.
Proof
For any family of circuits \(\mathcal {C}_{d,\gamma }\), we define a finite family \(\mathcal {C}_{d,\gamma , \epsilon '}\), for some \(\epsilon ' >0\), in the following way: \(\mathcal {C}_{d,\gamma , \epsilon '}\) is the finite set of circuits of minimum cardinality such that for any circuit C in \(\mathcal {C}_{d,\gamma }\), there is a circuit \(C_{\epsilon '}\) in \(\mathcal {C}_{d,\gamma , \epsilon '}\) that has the same positions for all the gates, and for each gate of C the corresponding gate in \(C_{\epsilon '}\) is an \(\epsilon '\)-approximation of it (in operator norm). By the universality property [48], such a construction is possible. The circuits in \(\mathcal {C}_{d,\gamma , \epsilon '}\) will be called approximation circuits.
Let us call the architecture of a circuit a specific placement of each gate. We denote by \(\mathcal {M}_{d,\gamma }\) the set of all possible architectures for circuits of size at most \(\gamma \) on \(q = \log _{2} d\) qubits. We now focus on circuits with a fixed architecture \(\mathcal {N} \in \mathcal {M}_{d,\gamma }\), denoted by \(\mathcal {C}_{d,\gamma }^{\mathcal {N}}\).
By a method analogous to the one presented in the first paragraph, we can define, for a fixed architecture \(\mathcal {N} \in \mathcal {M}_{d,\gamma }\), the corresponding family of approximation circuits \(\mathcal {C}_{d,\gamma , \epsilon '}^{\mathcal {N}}\). Each set of circuits also has an associated family of unitary operators, \(\mathcal {U}_{d,\gamma }^{\mathcal {N}}\) and \(\mathcal {U}_{d,\gamma ,\epsilon '}^{\mathcal {N}}\), respectively (see Eq. (2)).
For a sample S, the hypothesis sets defined based on the two families of circuits, \(\mathcal {C}_{d,\gamma }^{\mathcal {N}}\) and \(\mathcal {C}_{d,\gamma , \epsilon '}^{\mathcal {N}}\), will generate the following subsets of \([0,1]^{n}\):
A similar set can be defined for \(\mathcal {C}_{d,\gamma }\):
For a fixed \(\epsilon \), let us take \(\epsilon ' = \frac{\epsilon }{2\sqrt{n}\gamma }\). For some \(u \in \mathcal {A}_{d,\gamma ,S}^{\mathcal {N}}\), we choose \(u' \in \mathcal {A}_{d,\gamma ,\epsilon ',S}^{\mathcal {N}}\) to be the vector closest to it (in the Euclidean metric). By the definition of the Euclidean metric, we have:
Using elementary algebra, we get
The first factor from each term of the right-hand side of Eq. (32) can be bounded as follows:
where the first inequality is based on the triangle inequality, the equality follows from the distributivity of the inner product, and the last inequality follows from the definition of the operator norm, taking into account that the vectors \({\langle {0}|}^{\otimes q}\) and \({|{\phi _i}\rangle }\) have unit length.
The matrices U and \(U'\) can be expressed as \(U = U_\gamma U_{\gamma - 1}\ldots U_{1}\) and \(U' = U'_\gamma U'_{\gamma - 1}\ldots U'_{1}\), respectively, where \(U_i\) and \(U'_i\) are the transformations corresponding to layer i in the two circuits. By our assumption that each layer contains only one local 2-qubit gate, any \(U_i\) can be written as
where \(U_{gi}\) is the 4-dimensional unitary matrix associated with the gate on layer i.
By the same considerations, we have
\(U'_{gi}\) being the unitary matrix associated with the gate on layer i in the approximation circuit.
Introducing the notation \(E_i = U_i - U'_i\), and using the distributivity of the tensor product over sums, the property \({\Vert A \otimes B\Vert } = {\Vert A\Vert }{\Vert B\Vert }\), which holds for arbitrary matrices A and B, and the fact that the operator norm of the identity matrix is 1, we can write
The last inequality is by construction.
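The norm bookkeeping in this step, and the way the per-layer errors accumulate once the layers are multiplied together, can be checked numerically. The following is a minimal sketch that assumes NumPy is available; the random gates, the diagonal-phase perturbation used to produce nearby unitaries, and the helper names `random_unitary`, `embed_gate`, `op_norm` and `compare_circuits` are illustrative choices introduced here, not part of the paper.

```python
import numpy as np


def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q


def embed_gate(gate, position, num_qubits):
    """Return I (x) gate (x) I, with the 2-qubit gate on qubits position, position + 1."""
    left = np.eye(2 ** position)
    right = np.eye(2 ** (num_qubits - position - 2))
    return np.kron(np.kron(left, gate), right)


def op_norm(matrix):
    """Operator (spectral) norm, i.e. the largest singular value."""
    return np.linalg.norm(matrix, 2)


def compare_circuits(num_qubits=3, depth=5, theta=1e-3, seed=0):
    """Build an exact and a perturbed circuit layer by layer and compare them."""
    rng = np.random.default_rng(seed)
    dim = 2 ** num_qubits
    u_exact = np.eye(dim, dtype=complex)
    u_approx = np.eye(dim, dtype=complex)
    layer_errors = []
    for _ in range(depth):
        # One 2-qubit gate per layer, acting on two consecutive qubits.
        pos = int(rng.integers(0, num_qubits - 1))
        gate = random_unitary(4, rng)
        # Assumption: a nearby unitary is obtained by small diagonal phases.
        phases = np.exp(1j * rng.uniform(-theta, theta, size=4))
        gate_approx = gate @ np.diag(phases)
        layer = embed_gate(gate, pos, num_qubits)
        layer_approx = embed_gate(gate_approx, pos, num_qubits)
        layer_errors.append(op_norm(layer - layer_approx))
        u_exact = layer @ u_exact
        u_approx = layer_approx @ u_approx
    return op_norm(u_exact - u_approx), sum(layer_errors), layer_errors


total_error, error_sum, layer_errors = compare_circuits()
print(total_error, error_sum)
```

In this sketch each layer error is at most `theta`, because padding with identities does not change the operator norm, and the error of the full product stays below the sum of the layer errors.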
Using the notation \(U_{\gamma 2} = U_\gamma U_{\gamma - 1}\ldots U_{2}\) and \(U'_{\gamma 2} = U'_\gamma U'_{\gamma - 1}\ldots U'_{2}\), we have the following chain of equalities (the strategy is inspired by [48], Section 4.5.3)
By induction, we have
Going back to Eq. (32), we observe that the second factor from each term of the right-hand side can be bounded as
and, using Eq. (35), Eq. (32) reduces to
Therefore, remembering that \(\epsilon ' = \frac{\epsilon }{2\sqrt{n}\gamma }\), we have
We have shown that the set \(\mathcal {A}_{d,\gamma ,\epsilon ',S}^{\mathcal {N}}\) is a covering for \(\mathcal {A}_{d,\gamma ,S}^{\mathcal {N}}\) at scale \(\epsilon \). From this point all that remains is to find the size of the minimum set of approximation circuits.
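For orientation, the chain of estimates behind this covering statement can be sketched as follows. This is a reconstruction from the steps described in the proof, not a quotation of its displays. It assumes that the hypothesis values have the form \(u_i = |a_i|^2\) with \(a_i = {\langle {0}|}^{\otimes q} U {|{\phi _i}\rangle }\) (and \(a'_i\) defined in the same way from \(U'\)); the symbols \(a_i\) and \(a'_i\) are introduced here for illustration only.

```latex
\begin{aligned}
\Vert U - U'\Vert
  &= \Vert U_{\gamma 2}(U_1 - U'_1) + (U_{\gamma 2} - U'_{\gamma 2})U'_1\Vert
   \le \Vert E_1\Vert + \Vert U_{\gamma 2} - U'_{\gamma 2}\Vert
   \le \dots \le \sum_{i=1}^{\gamma }\Vert E_i\Vert \le \gamma \epsilon ', \\
|u_i - u'_i|
  &= \bigl|\,|a_i| - |a'_i|\,\bigr|\,\bigl(|a_i| + |a'_i|\bigr)
   \le 2\,\Vert U - U'\Vert \le 2\gamma \epsilon ', \\
\Vert u - u'\Vert _2
  &\le \sqrt{n}\cdot 2\gamma \epsilon ' = \epsilon
   \qquad \text{for } \epsilon ' = \frac{\epsilon }{2\sqrt{n}\gamma }.
\end{aligned}
```

The first line uses only that every factor is unitary and therefore has operator norm 1; the second uses \(|a_i|, |a'_i| \le 1\) and \(\bigl||a_i| - |a'_i|\bigr| \le |a_i - a'_i| \le \Vert U - U'\Vert \).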
By the Solovay–Kitaev theorem ([39], see also Theorem 1), it is known that to approximate a 2-qubit gate to an error of at most \(\epsilon '\), \(C_{1}\log ^{c}\frac{1}{\epsilon '}\) universal gates are needed (where \(C_1\) is some constant). Since for a fixed architecture we have \(\gamma \) 2-qubit gates in the circuit, the total number of elements from the universal set needed will be:
Because a universal set of gates with 8 elements exists (see Sect. 3.2.1), the set \(\mathcal {A}_{d,\gamma ,S}^{\mathcal {N}}\) can be covered by a set \(\mathcal {A}_{d,\gamma ,\epsilon ',S}^{\mathcal {N}}\), having the cardinality
The number of architectures is upper bounded by \(q^{\gamma }\), because each layer contains one 2-qubit gate that can be placed in \(q-1\) ways (the gates are restricted to act on consecutive qubits). Therefore, a minimum covering of the set \(\mathcal {A}_{d,\gamma ,S}\) will have its cardinality bounded by the product of the number of architectures and the covering number of the set associated with a fixed architecture:
where we have used the fact that \(q \ge 8\). By taking the logarithm in Eq. (41), we have
and this inequality implies the conclusion. \(\square \)
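The counting argument in this proof can be turned into a small calculator. The sketch below is a hedged illustration: it multiplies the number of architectures, at most \(q^{\gamma }\), by the number of words of length \(C_1\gamma \log ^{c}(1/\epsilon ')\) over a universal set of 8 gates, and returns the natural logarithm of that product. The function name `log_covering_bound` is hypothetical, the default `C1=1.0` is only a placeholder for the unspecified Solovay–Kitaev constant, and the result is the unsimplified product, not necessarily the exact form stated in Theorem 5.

```python
import math


def log_covering_bound(q, gamma, n, eps, c=4.0, C1=1.0, alphabet=8):
    """Natural log of (number of architectures) x (words over the universal set).

    q        -- number of qubits
    gamma    -- number of 2-qubit gates (circuit size)
    n        -- sample size
    eps      -- covering scale
    c        -- Solovay-Kitaev exponent (3 < c <= 4 in the text)
    C1       -- placeholder for the unspecified Solovay-Kitaev constant
    alphabet -- size of the universal gate set (8 in the text)
    """
    eps_prime = eps / (2.0 * math.sqrt(n) * gamma)
    if not 0.0 < eps_prime < 1.0:
        raise ValueError("eps' must lie in (0, 1) for the logarithm to be positive")
    # Universal gates needed per approximated 2-qubit gate.
    words_per_gate = C1 * math.log(1.0 / eps_prime) ** c
    # log of alphabet ** (gamma * words_per_gate)
    log_per_architecture = gamma * words_per_gate * math.log(alphabet)
    # log of q ** gamma
    log_architectures = gamma * math.log(q)
    return log_architectures + log_per_architecture


b_small = log_covering_bound(q=8, gamma=10, n=100, eps=0.1)
b_large = log_covering_bound(q=8, gamma=20, n=100, eps=0.1)
print(b_small, b_large)
```

Under these assumptions the logarithm of the bound grows only polynomially in \(\gamma \), \(\log q\), and \(\log (\sqrt{n}/\epsilon )\), which is the qualitative behaviour the theorem relies on.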
In order to prove Theorem 6, we make use of the following lemma:
Lemma 1
With the notation and assumptions from Sect. 3, it holds that
for an arbitrary \(0 < C_2 \le 1\), and \(k \in {\mathbb {N}}^{*}\).
Proof
By Theorem 5, there is a constant \(C_3>0\) such that
Taking into consideration that \(c \le 4\), and the elementary inequality \(\log ^{c}(ab) = (\log (a) + \log (b))^c \le \log ^{c}(a)\log ^{c}(b)\), which holds for all sufficiently large \(a, b > 0\), we have the following chain of inequalities that implies the conclusion (\(C_5\) is just another constant):
\(\square \)
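The elementary inequality used in this proof needs both arguments to be large: \(x + y \le xy\) is equivalent to \((x-1)(y-1) \ge 1\), so \(\log a \ge 2\) and \(\log b \ge 2\) (natural logarithms) is a sufficient condition. A quick numerical sanity check, with the hypothetical helper `log_power_gap` and an arbitrary test grid, might look like this:

```python
import math


def log_power_gap(a, b, c=4.0):
    """Return (log a * log b)^c - (log a + log b)^c; non-negative when the inequality holds."""
    la, lb = math.log(a), math.log(b)
    return (la * lb) ** c - (la + lb) ** c


# Grid of values whose natural logarithms are 2.25, 2.5, ..., 5.0.
grid = [math.exp(2.0 + 0.25 * k) for k in range(1, 13)]
holds_on_grid = all(log_power_gap(a, b) >= 0.0 for a in grid for b in grid)

# For small arguments the inequality fails, e.g. a = b = 2.
fails_for_small = log_power_gap(2.0, 2.0) < 0.0

print(holds_on_grid, fails_for_small)
```

The failure at \(a = b = 2\) shows why the proof restricts the inequality to large enough arguments.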
Theorem 6
The empirical Rademacher complexity of the class of functions \(\mathcal {H}_{d,\gamma }\), defined by Eq. (3), is asymptotically bounded as follows:
where \(3 < c \le 4\) is the constant from the Solovay–Kitaev theorem.
Proof
By applying Theorem 2, we have
for any \(M \in {\mathbb {N}}^*\), and \(s = \sup _{a \in \mathcal {A}_{d,\gamma ,S}}{\Vert a\Vert }\).
Taking \(M \rightarrow \infty \), the inequality becomes
Because for any \(h \in \mathcal {H}_{d,\gamma }\) and \({|{\phi }\rangle } \in \mathbb {C}^d\), \(0 \le h({|{\phi }\rangle }) \le 1\), and the hypothesis set is rich enough, we have \(s = C_2\sqrt{n}\), for some constant \(0 < C_2 \le 1\). The inequality becomes
Using Lemma 1, the inequality reduces to
(\(C_4>0\) is another constant).
Using the fact that \(\sum _{k=1}^{\infty }k^{2}2^{-k} = 6\), we arrive at the conclusion:
\(\square \)
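The series value used in the last step, \(\sum _{k=1}^{\infty }k^{2}2^{-k} = 6\), can be confirmed with exact rational arithmetic. The sketch below (the function name `partial_sum` is illustrative) also checks the closed form of the partial sums, \(6 - (N^2 + 4N + 6)/2^{N}\), which makes the limit evident.

```python
from fractions import Fraction


def partial_sum(terms):
    """Exact value of sum_{k=1}^{terms} k^2 / 2^k."""
    return sum(Fraction(k * k, 2 ** k) for k in range(1, terms + 1))


def closed_form(terms):
    """Closed form of the partial sum: 6 - (N^2 + 4N + 6) / 2^N."""
    return 6 - Fraction(terms * terms + 4 * terms + 6, 2 ** terms)


# The remainder after 60 terms is far below double precision resolution near 6.
remainder = float(6 - partial_sum(60))
print(remainder)
```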
Notations
Table 2 summarises the notation used in this paper.
Cite this article
Popescu, C.M. Learning bounds for quantum circuits in the agnostic setting. Quantum Inf Process 20, 286 (2021). https://doi.org/10.1007/s11128-021-03225-7
DOI: https://doi.org/10.1007/s11128-021-03225-7