ABSTRACT
The Logical Execution Time (LET) paradigm is a promising approach to achieving time-deterministic communication on multi-core CPUs. Task scheduling under this paradigm is a variant of the Multi-Row Facility Layout Problem, which is known to be NP-hard. In this paper, we propose a reinforcement learning approach that reduces the overall path latency across all scheduled runnables while respecting further constraints such as schedulability, load balance, and data-contention control. The neural-network agents are trained on a real-world automotive powertrain project. We compare two schedules generated by the agents against the current production schedule and against one produced by a genetic algorithm; the agent trained with the Proximal Policy Optimization (PPO) algorithm performs best. We also investigate how well the agents generalize across software updates, and the results show that they generalize well.
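To make the optimization target concrete, the following is a minimal sketch of how end-to-end path latency can be computed for a cause-effect chain under LET. It assumes a simplified model that is not taken from the paper: each runnable's LET interval spans its full period, so inputs are read at period start and outputs are published at period end, with all runnables synchronously released at t = 0.

```python
# Toy end-to-end latency of a cause-effect chain under LET.
# Assumption (ours, not from the paper): each runnable's LET interval
# spans its full period, so it reads inputs at period start and
# publishes results at period end; periods are integers and all
# runnables are synchronously released at t = 0.

def let_chain_latency(periods, release=0):
    """Time from a sample entering the chain at `release` until the
    last runnable in the chain publishes the derived output.
    `periods` lists the periods of the chained runnables, in order."""
    t = release
    for p in periods:
        start = -(-t // p) * p  # next activation of this runnable at or after t
        t = start + p           # its output becomes visible at the end of the LET
    return t - release

# A 10 ms -> 20 ms -> 10 ms chain publishes at t = 10, 40, 50 ms.
print(let_chain_latency([10, 20, 10]))  # -> 50
```

Under these assumptions the latency of a chain grows with the sum of the periods plus any waiting time at period boundaries; a scheduler that shortens these waits reduces the overall path latency the paper optimizes.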
Reducing Overall Path Latency in Automotive Logical Execution Time Scheduling via Reinforcement Learning