Abstract
The rapid expansion of the Internet of Things (IoT) industry highlights the importance of workload characterization when evaluating microprocessors tailored for IoT applications. The system stack of an IoT system is streamlined yet complete, which makes it well suited to software and hardware co-design. This stack comprises several layers, including programming languages, frameworks, runtime environments, instruction set architectures (ISA), operating systems (OS), and microarchitecture. These layers can be grouped into three primary categories: the intermediate representation (IR) layer, the ISA layer, and the microarchitecture layer. Cross-layer workload characterization is therefore the first step in IoT design, especially in co-design. In this paper, we use a cross-layer profiling methodology to analyze IoTBench, an IoT workload benchmark. We examine each layer's key metrics, including instruction, data, and branch locality. Experiments were performed on both ARM and x86 architectures. Our findings reveal general patterns in how IoTBench's metrics vary with different input data. We also observe that the same metric can show different characteristics at different layers, which suggests that analyzing a single layer in isolation may yield incomplete conclusions. In addition, our cross-layer profiling shows that the convolution task, characterized by deeply nested loops, significantly amplified branch locality at the microarchitecture layer on the ARM platform. Notably, optimization with the GNU C++ compiler (G++), intended to improve performance, had the opposite effect: it aggravated the branch locality issue and degraded performance.
Acknowledgments
This work is supported by the Strategic Priority Research Program of the Chinese Academy of Sciences, Grant Nos. XDA0320000 and XDA0320300.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, F., Wang, C., Luo, C., Wang, L. (2024). Cross-Layer Profiling of IoTBench. In: Hunold, S., Xie, B., Shu, K. (eds) Benchmarking, Measuring, and Optimizing. Bench 2023. Lecture Notes in Computer Science, vol 14521. Springer, Singapore. https://doi.org/10.1007/978-981-97-0316-6_5
DOI: https://doi.org/10.1007/978-981-97-0316-6_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0315-9
Online ISBN: 978-981-97-0316-6
eBook Packages: Computer Science (R0)