The advent of quantum computing poses significant threats to traditional cryptographic methods, necessitating the development of secure communication techniques such as quantum key distribution (QKD). Despite advancements in QKD, including enhanced fiber optic technologies and multi-key distribution systems, substantial challenges persist in the efficient management of cryptographic keys within QKD networks. Existing heuristic approaches, such as greedy algorithms, often fail to address the complex requirements of key allocation, particularly when provisioning the end-to-end keys needed for secure communication between distant nodes. This paper introduces a reinforcement learning (RL)-based method for end-to-end key provisioning in QKD networks. The proposed approach dynamically optimizes key allocation based on the network's state and usage patterns. Specifically, the RL framework integrates graph attention networks and long short-term memory networks to model intricate relationships and temporal dependencies within the network. This integration enables more efficient and adaptive key distribution. Comparative analyses demonstrate that the RL-based method significantly improves session key availability and allocation efficiency. It outperforms traditional greedy algorithms by reducing both session interruptions and the number of unused quantum keys. These results offer practical insight into implementing RL-based key provisioning strategies in real-world QKD applications.

Data Availability
The implementation code and data can be requested from the authors.
The given QKDN is called Didactic I and it will be used in Sect. 5 to illustrate the effectiveness of the proposed approach.
This research was supported by the Korea Institute of Science and Technology Information (KISTI) (No. K25L5M2C2, 50%) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2023R1A2C1003143, 25%; No. 2018R1A6A1A03025526, 25%).
The submission is an original work and has not been submitted for publication elsewhere.
Ethics declarations
Conflict of interest
The authors have no conflict of interest to declare. All co-authors have seen and agree with the contents of the manuscript and there is no financial interest to report. We certify that the submission is original work and is not under review at any other publication.
About this article
Cite this article
Seok, Y., Kim, JB., Han, YH. et al. Deep Reinforcement Learning-Driven Optimization of End-to-End Key Provision in QKD Systems. J Netw Syst Manage 33, 30 (2025). https://doi.org/10.1007/s10922-025-09902-7