
Boosting in-transit entertainment: deep reinforcement learning for intelligent multimedia caching in bus networks

  • Neural Networks
  • Published in Soft Computing

Abstract

Multimedia content delivery in advanced networks faces exponential growth in data volumes, rendering existing solutions inadequate. This research investigates deep reinforcement learning (DRL) for autonomous optimization without extensive labeled datasets. The work analyzes two prominent DRL algorithms, the Dueling Deep Q-Network (DDQN) and the Deep Q-Network (DQN), for multimedia delivery in simulated bus networks. DDQN uses a “dueling” architecture to estimate the state value and the action advantages separately, accelerating learning; DQN employs deep neural networks to approximate optimal policies. The environment simulates urban buses with passenger file requests and cache sizes modeled on actual data. A comparative analysis evaluates cumulative rewards and losses over 1500 training episodes to assess learning efficiency, stability, and performance. Results demonstrate DDQN’s superior convergence, with 32% higher cumulative rewards than DQN. DQN, however, showed potential for gains over successive runs despite inconsistencies. The study establishes DRL’s promise for automated decision-making while revealing enhancements needed to improve DQN. Further research should evaluate generalizability across problem domains, investigate hybrid models, and test physical systems. DDQN emerged as the more efficient algorithm, highlighting DRL’s potential to enable intelligent agents that optimize multimedia delivery.
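The abstract describes the dueling decomposition only in prose. As a minimal illustrative sketch (not the authors' implementation), the step that aggregates the value stream V(s) and the advantage stream A(s, a) into Q-values can be written as follows; the function name and the toy numbers are hypothetical, and the network layers that would produce the two streams are omitted:

```python
import numpy as np

def dueling_q_values(state_value, advantages):
    """Aggregate the two dueling streams into Q-values:
    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').
    Subtracting the mean advantage keeps the decomposition identifiable,
    since a constant could otherwise shift between V and A."""
    advantages = np.asarray(advantages, dtype=float)
    return state_value + advantages - advantages.mean()

# Toy example: one state, three cacheable files as candidate actions.
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])  # → array([3., 2., 1.])
```

In the dueling architecture, V(s) and A(s, a) are produced by two separate heads of the same network; only the aggregation step is shown here.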


Data availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.


Funding

This study was funded by (1) Guangxi vocational education teaching reform research project, China, Exploration and Practice of the chain Teaching system of “Six In One” in Higher Vocational Colleges under the perspective of “entrepreneurship and Innovation”, Project number: GXGZJG2021B096 and (2) The project of Improving the Basic Scientific Research Ability of Young and Middle-aged Teachers in Colleges and Universities in Guangxi, China, Research on the Development of “Smart Car Service” Operation Management Training System Based on Wechat, Project number: 2022KY1235.

Author information

Corresponding author

Correspondence to Incheol Shin.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all the individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lan, D., Shin, I. Boosting in-transit entertainment: deep reinforcement learning for intelligent multimedia caching in bus networks. Soft Comput 27, 19359–19375 (2023). https://doi.org/10.1007/s00500-023-09354-8

