Abstract
We present a method that addresses the long lead time required to deploy cell-level parameter optimisation policies to new wireless network sites. Given a sequence of action spaces, each an overlapping subset of cell-level configuration parameters provided by domain experts, we formulate throughput optimisation as Continual Reinforcement Learning of control policies. Simulation results suggest that the proposed system shortens the end-to-end deployment lead time two-fold compared to a reinitialise-and-retrain baseline, with no drop in optimisation gain.
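The core idea in the abstract can be sketched as follows: when the agent moves to the next action space in the sequence, it carries over what it learned for the shared parameters rather than reinitialising, which is where the lead-time saving comes from. This is a minimal illustrative sketch; the parameter names, the `warm_start` helper, and the stand-in "learning update" are hypothetical and not from the paper.

```python
# Sketch of the continual-transfer idea: a policy trained on one action space
# (a subset of cell-level configuration parameters) warm-starts the policy for
# the next, overlapping action space, instead of reinitialising from scratch.

def warm_start(prev_params, new_space, init=0.0):
    """Carry over learned settings for parameters shared with the new space;
    initialise parameters that appear for the first time."""
    return {p: prev_params.get(p, init) for p in new_space}

# A sequence of overlapping action spaces, as provided by domain experts
# (names are illustrative).
spaces = [
    {"tilt", "tx_power"},
    {"tilt", "tx_power", "handover_margin"},
    {"tx_power", "handover_margin", "scheduler_weight"},
]

policy = {}
for space in spaces:
    policy = warm_start(policy, space)
    # ... continual RL training on this action space would go here ...
    for p in space:
        policy[p] += 1.0  # stand-in for a learning update

print(sorted(policy))
```

In a reinitialise-and-retrain baseline, `policy` would be reset to `{}` at the start of every task; here, shared parameters such as `tx_power` retain their accumulated updates across all three tasks.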
C. Hasan, A. Agapitos, and D. Lynch contributed equally.
Notes
1. The names of CPs, PCs, and EPs cannot be disclosed.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Hasan, C. et al. (2023). Continual Model-Based Reinforcement Learning for Data Efficient Wireless Network Optimisation. In: De Francisci Morales, G., Perlich, C., Ruchansky, N., Kourtellis, N., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track. ECML PKDD 2023. Lecture Notes in Computer Science(), vol 14174. Springer, Cham. https://doi.org/10.1007/978-3-031-43427-3_18
Print ISBN: 978-3-031-43426-6
Online ISBN: 978-3-031-43427-3
eBook Packages: Computer Science (R0)