Abstract
Clustering problems in image recognition are recurrent in unsupervised machine learning. The K-means algorithm is a simple and popular method for solving the clustering problem. However, its centroid calculation, which requires computing the distance between each cluster center and the data, is time-consuming, particularly for large data sizes. In the present paper, we investigate whether quantum computation can speed up the K-means algorithm for large data sizes. We describe a quantum-enhanced K-means algorithm from which centroid calculations are removed. For the mean and distance calculations of vector data, we propose a quantum subroutine based on quantum entanglement, with a potential speedup that makes the speed of the proposed subroutine comparable to that of its classical counterpart. The proposed K-means algorithm is evaluated on three datasets: a synthetic dataset, the Iris dataset, and an image dataset. The numerical experimental results show that the clustering performance of the proposed algorithm is comparable to that of the classical K-means algorithm.
Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Notes
The Qiskit versions were as follows: qiskit-terra 0.15.2, qiskit-aer 0.6.1, qiskit-ignis 0.4.0, qiskit-ibmq-provider 0.9.0, qiskit-aqua 0.7.5, and qiskit 0.21.0. The Python version was 3.8.5.
The Scikit-learn version was 0.23.2.
https://www.pakutaso.com/nature/flower/ (last accessed Sep. 2021)
Box plots are interpreted as follows. The center line in the box denotes the median of the data. The top and bottom edges of the box denote the third and first quartiles, \( Q_{3} \) and \( Q_{1} \), respectively. The horizontal line on the upper (lower) side denotes the maximum (minimum) data point within the range from \( Q_{1} - 1.5 \times (Q_{3} - Q_{1}) \) to \( Q_{3} + 1.5 \times (Q_{3} - Q_{1}) \). The circles denote points above the upper horizontal line or below the lower horizontal line and indicate outliers.
For the t test and the Wilcoxon signed rank test, the null hypothesis states that \( mean_{1} = mean_{2} \).
We modified the code shown on the website https://www.sejuku.net/blog/64365.
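The box-plot convention described in the notes above can be checked numerically. The following is a minimal sketch in NumPy; the function name `boxplot_summary` and the sample values are illustrative assumptions, and quartiles are computed with NumPy's default linear interpolation, which may differ slightly from the plotting library used for the figures.

```python
import numpy as np

def boxplot_summary(values):
    """Median, quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    x = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that lie inside the fences.
    inside = x[(x >= low_fence) & (x <= high_fence)]
    # Points beyond the fences are drawn as circles (outliers).
    outliers = np.sort(x[(x < low_fence) | (x > high_fence)])
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside.min(),
        "whisker_high": inside.max(),
        "outliers": outliers,
    }

summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

For this sample, the value 100 lies above the upper fence and is reported as an outlier, while the whiskers end at 1 and 9.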
References
Aïmeur, E., Brassard, G., Gambs, S.: Machine learning in a quantum world. In: Lamontagne, L., Marchand, M. (eds.) Advances in artificial intelligence, pp. 431–442. Springer, Berlin (2006)
Aleksandrowicz, G., Alexander, T., Barkoutsos, P., Bello, L., Ben-Haim, Y., Bucher, D., Cabrera-Hernández, F.J., Carballo-Franquis, J., Chen, A., Chen, C.F., Chow, J.M., Córcoles-Gonzales, A.D., Cross, A.J., Cross, A., Cruz-Benito, J., Culver, C., González, S.D.L.P., Torre, E.D.L., Ding, D., Dumitrescu, E., Duran, I., Eendebak, P., Everitt, M., Sertage, I.F., Frisch, A., Fuhrer, A., Gambetta, J., Gago, B.G., Gomez-Mosquera, J., Greenberg, D., Hamamura, I., Havlicek, V., Hellmers, J., Herok, Ł., Horii, H., Hu, S., Imamichi, T., Itoko, T., Javadi-Abhari, A., Kanazawa, N., Karazeev, A., Krsulich, K., Liu, P., Luh, Y., Maeng, Y., Marques, M., Martin-Fernández, F.J., McClure, D.T., McKay, D., Meesala, S., Mezzacapo, A., Moll, N., Rodríguez, D.M., Nannicini, G., Nation, P., Ollitrault, P., O’Riordan, L.J., Paik, H., Pérez, J., Phan, A., Pistoia, M., Prutyanov, V., Reuter, M., Rice, J., Davila, A.R., Rudy, R.H.P., Ryu, M., Sathaye, N., Schnabel, C., Schoute, E., Setia, K., Shi, Y., Silva, A., Siraichi, Y., Sivarajah, S., A.Smolin, J., Soeken, M., Takahashi, H., Tavernelli, I., Taylor, C., Taylour, P., Trabing, K., Treinish, M., Turner, W., Vogt-Lee, D., Vuillot, C., Wildstrom, J.A., Wilson, J., Winston, E., Wood, C., Wood, S., Worner, S., Akhalwaya, I.Y., Zoufal, C.: Qiskit: An Open-source Framework for Quantum Computing (2019). https://doi.org/10.5281/zenodo.2562111
Arunachalam, S., de Wolf, R.: Optimal quantum sample complexity of learning algorithms. J. Mach. Learn. Res. 19(71), 1–36 (2018)
Baritompa, W.P., Bulger, D.W., Wood, G.R.: Grover’s quantum algorithm applied to global optimization. SIAM J. Opt. 15(4), 1170–1184 (2005). https://doi.org/10.1137/040605072
Benedetti, M., Lloyd, E., Sack, S., Fiorentini, M.: Parameterized quantum circuits as machine learning models. Quant. Sci. Technol. 4(4), 043001 (2019). https://doi.org/10.1088/2058-9565/ab4eb5
Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., Lloyd, S.: Quantum machine learning. Nature 549(7671), 195–202 (2017). https://doi.org/10.1038/nature23474
Biau, G., Devroye, L., Lugosi, G.: On the performance of clustering in Hilbert spaces. IEEE Trans. Inf. Theory 54(2), 781–790 (2008). https://doi.org/10.1109/TIT.2007.913516
Bishop, C.M.: Pattern recognition and machine learning. Springer, Berlin (2006)
Brassard, G., Dupuis, F., Gambs, S., Tapp, A.: An optimal quantum algorithm to approximate the mean and its application for approximating the median of a set of points over an arbitrary distance. arXiv:1106.4267 [quant-ph] (2011)
Dürr, C., Høyer, P.: A quantum algorithm for finding the minimum. arXiv:quant-ph/9607014 (1996)
Goel, A., Tung, C., Lu, Y.H., Thiruvathukal, G.K.: A survey of methods for low-power deep learning and computer vision. In: 2020 IEEE 6th world forum on Internet of Things (WF-IoT), pp. 1–6 (2020). https://doi.org/10.1109/WF-IoT48130.2020.9221198
Grover, L.K.: A fast quantum mechanical algorithm for database search. In: Proceedings of the twenty-eighth annual ACM symposium on theory of computing, STOC ’96, pp. 212–219. Association for computing machinery, New York, NY, USA (1996). https://doi.org/10.1145/237814.237866
Kerenidis, I., Landman, J., Luongo, A., Prakash, A.: q-means: A quantum algorithm for unsupervised machine learning. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in neural information processing systems, vol. 32. Curran Associates, Inc. (2019)
Khan, S.U., Awan, A.J., Vall-llosera, G.: K-means clustering on noisy intermediate scale quantum computers. arXiv preprint arXiv:1909.12183 (2019)
Kopczyk, D.: Quantum machine learning for data scientists. arXiv preprint arXiv:1804.10068 (2018)
Lloyd, S., Mohseni, M., Rebentrost, P.: Quantum algorithms for supervised and unsupervised machine learning. arXiv preprint arXiv:1307.0411 (2013)
Lloyd, S., Mohseni, M., Rebentrost, P.: Quantum principal component analysis. Nat. Phys. 10(9), 631–633 (2014). https://doi.org/10.1038/nphys3029
Nielsen, M.A., Chuang, I.L.: Quantum computation and quantum information, 10th anniversary edn. Cambridge University Press, USA (2011)
Rebentrost, P., Mohseni, M., Lloyd, S.: Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 130503 (2014). https://doi.org/10.1103/PhysRevLett.113.130503
Rosenberg, A., Hirschberg, J.: V-measure: A conditional entropy-based external cluster evaluation measure. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), pp. 410–420. Association for computational linguistics, Prague, Czech Republic (2007)
Schuld, M., Sinayskiy, I., Petruccione, F.: An introduction to quantum machine learning. Contemp. Phys. 56(2), 172–185 (2015). https://doi.org/10.1080/00107514.2014.964942
Shor, P.: Algorithms for quantum computation: discrete logarithms and factoring. In: Proceedings 35th annual symposium on foundations of computer science, pp. 124–134 (1994). https://doi.org/10.1109/SFCS.1994.365700
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984). https://doi.org/10.1145/1968.1972
Wiebe, N., Kapoor, A., Svore, K.M.: Quantum algorithms for nearest-neighbor methods for supervised and unsupervised learning. Quant. Inf. Comput. 15(3–4), 316–356 (2015)
Acknowledgements
The author would like to thank the anonymous reviewers for their valuable comments and suggestions on the manuscript.
Ethics declarations
Conflict of interest
The author declares that there are no conflicts of interest.
Appendices
An implementation of the classical K-means algorithm
A classical K-means algorithm can be implemented in Python as follows (see Note 6):
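The original listing is not reproduced in this text. As a stand-in, the following is a minimal NumPy sketch of the standard K-means iteration (assignment step followed by centroid update). It is not the author's code: the function name `kmeans`, its parameters, and the random initialization from data points are assumptions made for illustration.

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Standard K-means: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop when the assignment no longer changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids
```

The distance matrix in the assignment step is the part whose cost grows with the data size, which is the step the abstract identifies as time-consuming.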
An implementation of Q-MIN
A quantum subroutine qsub_min for Q-MIN for two qubits is implemented in Python and Qiskit [2] as follows:
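The Qiskit listing of qsub_min is not reproduced in this text. The sketch below is therefore not that subroutine. It is a plain NumPy state-vector simulation of Grover-based minimum finding in the spirit of Dürr and Høyer, under the assumption that Q-MIN follows that approach. The function names `grover_probabilities` and `find_minimum`, the random choice of zero or one Grover iteration per round, and the fixed number of rounds are all illustrative choices.

```python
import numpy as np

def grover_probabilities(marked, n_iter=1):
    """Measurement probabilities after n_iter Grover iterations.

    `marked` is a boolean array over the basis states (length 4 for two qubits).
    """
    marked = np.asarray(marked, dtype=bool)
    n = marked.size
    state = np.full(n, 1.0 / np.sqrt(n))         # uniform superposition
    for _ in range(n_iter):
        state = np.where(marked, -state, state)  # oracle: phase flip on marked states
        state = 2.0 * state.mean() - state       # diffusion: inversion about the mean
    return state ** 2                            # amplitudes are real here

def find_minimum(values, n_rounds=50, seed=0):
    """Index of the smallest value, found by repeated simulated Grover searches."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    best = int(rng.integers(values.size))        # random initial threshold index
    for _ in range(n_rounds):
        marked = values < values[best]           # states better than the threshold
        n_iter = int(rng.integers(2))            # 0 or 1 iterations, chosen at random
        probs = grover_probabilities(marked, n_iter)
        candidate = int(rng.choice(values.size, p=probs))  # simulated measurement
        if values[candidate] < values[best]:
            best = candidate
    return best
```

With four basis states, a single Grover iteration finds a unique marked state with certainty but fails when three states are marked, which is why the number of iterations is randomized in this sketch.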
Detailed derivation of Eq. 8
The state \( \vert \psi _{3} \rangle \) is derived as follows:
Implementation of the proposed algorithm
A quantum subroutine qsub_md for the proposed algorithm for \( m = 2 \) and \( d = 2 \) is implemented in Python and Qiskit as follows:
Implementation of the proposed quantum-enhanced K-means algorithm
Using the quantum subroutines (qsub_md and qsub_min), the proposed algorithm is implemented as follows:
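The listing itself is not reproduced in this text. The sketch below only illustrates how a K-means loop might delegate its work to two subroutines, with classical stand-ins in place of qsub_md and qsub_min. The names `mean_distance_stub`, `argmin_stub`, and `kmeans_with_subroutines`, their signatures, and the use of initial labels rather than initial centroids are assumptions, not the author's interface.

```python
import numpy as np

def mean_distance_stub(point, members):
    """Classical stand-in for qsub_md: distance from `point` to the mean of `members`."""
    return float(np.linalg.norm(point - members.mean(axis=0)))

def argmin_stub(distances):
    """Classical stand-in for qsub_min: index of the smallest distance."""
    return int(np.argmin(distances))

def kmeans_with_subroutines(data, init_labels, k, n_iter=20):
    """K-means loop that stores only labels; no centroid array is maintained."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(init_labels, dtype=int).copy()
    for _ in range(n_iter):
        new_labels = labels.copy()
        for i, point in enumerate(data):
            dists = []
            for j in range(k):
                members = data[labels == j]
                # An empty cluster can never be the nearest one.
                dists.append(mean_distance_stub(point, members) if len(members) else np.inf)
            new_labels[i] = argmin_stub(dists)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

In this arrangement the outer loop never computes or stores centroids itself; the mean enters only inside the distance subroutine, which is the role the abstract assigns to the quantum subroutine.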
Other results for synthetic, Iris, and image datasets
Cite this article
Ohno, H. A quantum algorithm of K-means toward practical use. Quantum Inf Process 21, 146 (2022). https://doi.org/10.1007/s11128-022-03485-x
DOI: https://doi.org/10.1007/s11128-022-03485-x