Transformer-based reinforcement learning for optical cavity temperature control system

Published in Applied Intelligence

Abstract

The accuracy of laser gas detection technology is influenced by the temperature of the optical cavity. Traditional control methods fail to fully account for the coupling effects between features and the time delay inherent in heat transfer. To address these issues, a method combining a Transformer with reinforcement learning (RL) is proposed. The Transformer generates enhanced features, which the RL algorithm then uses for iterative learning to optimize the control strategy. In addition, a dual attention mechanism is introduced to improve the model's comprehension of the complex dynamics within the optical cavity. This study represents the first application of the Transformer in the field of temperature control, paving the way for advanced machine-learning techniques in optical cavity temperature regulation. Experimental results confirm the proposed method's efficiency and long-term effectiveness in maintaining precise temperature control, demonstrating its potential for managing the complex cross-coupling effects within temperature control systems.
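To make the described pipeline concrete, the following is a minimal, hypothetical sketch in PyTorch; it is not the authors' implementation. It assumes a channel-plus-temporal reading of "dual attention", an actor-critic policy head (e.g., for PPO-style updates), and windows of cavity sensor readings as observations; all module names and dimensions are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's code): a Transformer encoder with a
# dual (feature + temporal) attention block produces enhanced features from a
# window of cavity sensor readings; an actor-critic head maps them to a
# continuous heater action. Names and sizes are assumptions for illustration.
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Gate sensor channels and time steps before the Transformer encoder."""
    def __init__(self, n_features: int, window: int):
        super().__init__()
        self.feature_gate = nn.Sequential(nn.Linear(n_features, n_features), nn.Sigmoid())
        self.time_gate = nn.Sequential(nn.Linear(window, window), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, window, n_features)
        x = x * self.feature_gate(x.mean(dim=1, keepdim=True))   # reweight sensor channels
        x = x * self.time_gate(x.mean(dim=2)).unsqueeze(-1)      # reweight time steps
        return x

class TransformerRLPolicy(nn.Module):
    def __init__(self, n_features: int = 4, window: int = 32, d_model: int = 64):
        super().__init__()
        self.dual_attn = DualAttention(n_features, window)
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.actor = nn.Linear(d_model, 1)   # mean of a continuous heater-power action
        self.critic = nn.Linear(d_model, 1)  # state value for the RL update

    def forward(self, obs: torch.Tensor):
        h = self.encoder(self.embed(self.dual_attn(obs)))  # enhanced features
        h = h.mean(dim=1)                                  # pool over the time window
        return torch.tanh(self.actor(h)), self.critic(h)   # action in [-1, 1], value

policy = TransformerRLPolicy()
action, value = policy(torch.randn(8, 32, 4))  # 8 windows of 32 steps, 4 sensors
```

In a training loop, the normalized action would be scaled to a heater power, and the critic's value estimate would drive the iterative policy update that the abstract describes.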




Data Availability

The datasets generated during the study are available on reasonable request.


Acknowledgements

This work is supported by the Shanghai Science and Technology Innovation Action Plan under Grant No. 22142200102 and by the National Natural Science Foundation of China under Grants No. 52075310 and 61603238.

Author information


Contributions

Hongli Zhang, Shulin Liu: provided research ideas and reviewed the manuscript. Yufan Lu: developed and validated the algorithms and drafted the manuscript. Chi Wang, Jian Peng: provided experimental equipment. Cheng Huang, Wei Dou: supervised the execution of the study. Weiheng Cheng: provided technical assistance. All authors were involved in reviewing and editing the manuscript, providing critical feedback, and contributing to the final version.

Corresponding authors

Correspondence to Chi Wang, Wei Dou or Shulin Liu.

Ethics declarations

Competing Interests

The authors declare that there are no competing financial, professional, or personal interests that are relevant to the content of this article.

Ethical and Informed Consent for Data Used

All participants involved in this study provided informed consent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, H., Lu, Y., Wang, C. et al. Transformer-based reinforcement learning for optical cavity temperature control system. Appl Intell 55, 83 (2025). https://doi.org/10.1007/s10489-024-05943-8

