Performance of LLMs on Computing Systems for Deployment in IoT Devices

  • Conference paper
  • In: Advances on Broad-Band Wireless Computing, Communication and Applications (BWCCA 2024)

Abstract

In this study, the authors explore the performance of several Large Language Models, namely BART-Base, GPT-Neo and DistilGPT-2, on a range of hardware devices. The models are fine-tuned on a general-purpose dataset and tested on systems with varying computing capabilities, from high-end servers and cloud infrastructure to resource-constrained embedded devices. The main objective is to measure how quickly each model processes a given input, the precision of its text summarisation, and the similarity between its machine-generated translations and the reference translations. The novelty of this research lies in characterising the trade-off between processing speed and output precision. This approach aims to determine which model and system combination performs best for future deployment in Internet of Things (IoT) devices.
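
A benchmark of the kind the abstract describes can be sketched briefly. The snippet below is a minimal illustration, not the authors' implementation: it times generation with one of the compared models (DistilGPT-2, loaded via Hugging Face transformers) to derive a tokens-per-second figure, then scores the output against a reference summary with ROUGE. The prompt, reference text and generation settings are placeholder assumptions.

```python
# Minimal sketch (not the authors' code) of the benchmark described in the
# abstract: time text generation, then score the summary with ROUGE.
# Assumes: pip install transformers evaluate rouge_score torch
import time

import evaluate
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "distilgpt2"  # one of the three models compared in the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Placeholder input; the paper fine-tunes on a general dataset instead.
prompt = "Summarise: The Internet of Things connects billions of devices ..."
inputs = tokenizer(prompt, return_tensors="pt")

# Wall-clock inference time -> tokens per second.
start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=64,
                         pad_token_id=tokenizer.eos_token_id)
elapsed = time.perf_counter() - start
new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/s ({elapsed:.2f} s total)")

# Summarisation precision against a (placeholder) reference summary.
summary = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True)
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=[summary],
                    references=["IoT links billions of devices."]))
```

Translation similarity could be scored analogously (for example with sacreBLEU via evaluate.load("sacrebleu")), and repeating the same measurement on each target system yields the speed-versus-precision comparison the study sets out to make.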

Acknowledgement

This research was partially supported by the FSS project "SmartBits Robotics - Creation and Development of Ideas and Technologies" - 2182.

Author information

Corresponding author

Correspondence to Theodor-Radu Grumeza.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Grumeza, TR., Lazăr, TA., Fortiş, AE. (2025). Performance of LLMs on Computing Systems for Deployment in IoT Devices. In: Barolli, L. (eds) Advances on Broad-Band Wireless Computing, Communication and Applications. BWCCA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 231. Springer, Cham. https://doi.org/10.1007/978-3-031-76452-3_24

  • DOI: https://doi.org/10.1007/978-3-031-76452-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-76451-6

  • Online ISBN: 978-3-031-76452-3

  • eBook Packages: Engineering, Engineering (R0)
