Abstract
Modeling the energy consumption of low-level code will enable (i) a better understanding of its relationship to execution time and (ii) compiler/runtime optimizations tailored for energy efficiency. But such models need reliable ground truth data to be trained on. We therefore address the extraction of machine-specific datasets for the energy consumption of basic blocks, a problem with surprisingly few available solutions. Given the impact of execution context on energy, we are interested in recording sequences of basic blocks coupled to corresponding energy measurements. Our design is lightweight and portable; no manual hardware/software instrumentation is required. Its main components are an energy estimation interface with a sufficiently high refresh rate, access to an application's complete execution trace, and LLVM pass-based instrumentation. We extract half a million basic block-energy mappings overall, and achieve a mean whole-program error of \(\sim\)3% on two different machines. This paper demonstrates that commodity resources suffice to perform a crucial task on the road to energy-optimal computing.
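The abstract does not spell out how block sequences are paired with energy measurements or how the whole-program error is computed. The Python sketch below illustrates one plausible reading, assuming that the energy interface yields timestamped cumulative readings and that the execution trace yields timestamped basic block identifiers on the same clock. All function names, data layouts, and numbers are hypothetical and are not taken from the paper or its repository.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

EnergySample = Tuple[float, float]  # (timestamp, cumulative energy), assumed layout
TraceEntry = Tuple[float, str]      # (timestamp, basic block id), assumed layout


def pair_blocks_with_energy(
    samples: Sequence[EnergySample], trace: Sequence[TraceEntry]
) -> List[Tuple[Tuple[str, ...], float]]:
    """Map each interval between consecutive energy samples to the sequence of
    basic blocks whose timestamps fall in [t0, t1), plus the energy delta.
    Both inputs must be sorted by timestamp; counter wraparound is ignored."""
    times = [t for t, _ in trace]
    mappings = []
    for (t0, e0), (t1, e1) in zip(samples, samples[1:]):
        lo = bisect_left(times, t0)  # first trace entry at or after t0
        hi = bisect_left(times, t1)  # first trace entry at or after t1
        blocks = tuple(block for _, block in trace[lo:hi])
        if blocks:  # skip intervals in which no block was recorded
            mappings.append((blocks, e1 - e0))
    return mappings


def whole_program_error(estimated: float, measured: float) -> float:
    """Relative error between the summed per-sequence estimates and the
    energy measured for the complete run."""
    return abs(estimated - measured) / measured


# Made-up readings, for illustration only.
samples = [(0.0, 100.0), (10.0, 160.0), (20.0, 250.0)]
trace = [(1.0, "bb0"), (4.0, "bb1"), (9.0, "bb0"), (12.0, "bb2"), (19.0, "bb1")]

mappings = pair_blocks_with_energy(samples, trace)
measured_total = samples[-1][1] - samples[0][1]
error = whole_program_error(estimated=145.5, measured=measured_total)
```

Keeping whole sequences per interval, rather than splitting the energy delta across individual blocks, mirrors the abstract's point that execution context affects energy.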
The research work was supported by the Hellenic Foundation for Research and Innovation (HFRI) under the 3rd Call for HFRI PhD Fellowships (Fellowship Number: 61/512200), as well as by the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101021274.
Notes
- 1. The source code is available at https://github.com/jimbou/energy_profiling.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lamprakos, C.P., Bouras, D.S., Catthoor, F., Soudris, D. (2023). Reliable Basic Block Energy Accounting. In: Silvano, C., Pilato, C., Reichenbach, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2023. Lecture Notes in Computer Science, vol 14385. Springer, Cham. https://doi.org/10.1007/978-3-031-46077-7_13
DOI: https://doi.org/10.1007/978-3-031-46077-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46076-0
Online ISBN: 978-3-031-46077-7
eBook Packages: Computer Science (R0)