Abstract
The building sector accounts for almost 40% of global energy consumption. However, buildings’ Heating, Ventilation, and Air Conditioning (HVAC) systems are susceptible to various faults and defects, which significantly reduce buildings’ energy efficiency. This creates a critical need for suitable fault detection and diagnosis methods. At the same time, Large Language Models (LLMs) have evolved rapidly over the last few years and, especially since the release of ChatGPT in 2022, have gained widespread attention. LLMs interpret and process natural-language content as sequence-like data. We utilize LLMs’ proficiency in dealing with sequential input to handle time series data from buildings’ HVAC systems and develop a novel fault detection method that harnesses LLMs to detect faults in common HVAC systems, thereby helping to mitigate energy wastage in buildings. We use publicly available time series datasets covering the HVAC systems most common in European buildings, serialize them, and pass them to a pre-trained LLM (DistilBERT). By fine-tuning the model on a large amount of labeled input data, we enable classification into either binary cases (faulty/fault-free) or multiple fault classes. The performance is assessed using 5-fold time series cross-validation, yielding an F1-score between 82% and 99% for the binary fault classification and a macro-averaged F1-score of up to 99% for the multi-class classification tasks. The main advantage of using LLMs for fault detection is that, in contrast to conventional fault detection methods, LLMs can naturally deal with noisy input data. This can reduce the required preprocessing steps, such as removing randomly missing values, encoding categorical features, and normalization.
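The pipeline outlined above (serialize, split chronologically, score with a macro-averaged F1) can be illustrated with a minimal sketch. It is not the authors' implementation: the sensor names, the sentence template, and the expanding-window split scheme are assumptions made for illustration, and the fine-tuning of DistilBERT itself is omitted.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def serialize_row(row: Dict[str, Optional[float]]) -> str:
    """Turn one time step of sensor readings into a plain-text sentence.

    Missing values are written out as the word 'missing' instead of being
    imputed or dropped (hypothetical template, not taken from the paper).
    """
    parts = []
    for name, value in row.items():
        text = "missing" if value is None else f"{value:.1f}"
        parts.append(f"{name} is {text}")
    return ", ".join(parts)


def time_series_folds(n_samples: int, n_folds: int = 5) -> List[Tuple[range, range]]:
    """Expanding-window splits: train on the past, test on the next block."""
    block = n_samples // (n_folds + 1)
    if block == 0:
        raise ValueError("not enough samples for the requested number of folds")
    folds = []
    for k in range(1, n_folds + 1):
        train = range(0, k * block)
        # The last fold absorbs any leftover samples at the end of the series.
        test_end = (k + 1) * block if k < n_folds else n_samples
        folds.append((train, range(k * block, test_end)))
    return folds


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical sensor names; one reading is missing on purpose.
    print(serialize_row({"supply_air_temp": 18.3, "fan_speed": None}))
    for train, test in time_series_folds(12):
        print(list(train), "->", list(test))
```

In such a setup, the serialized strings would be tokenized and passed to the classifier, and each fold would be evaluated only on samples that come later in time than its training data.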
Acknowledgments
The reported research has been conducted within the projects NextGES (funded by Zukunftsfonds Steiermark) and ECom4Future (funded by Clean Energy Transition Partnership (FFG 903927)).
Disclosure of Interests
The authors have no competing interests to declare.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Langer, G., Hirsch, T., Kern, R., Kohl, T., Schweiger, G. (2025). Large Language Models for Fault Detection in Buildings’ HVAC Systems. In: Jørgensen, B.N., Ma, Z.G., Wijaya, F.D., Irnawan, R., Sarjiya, S. (eds) Energy Informatics. EI.A 2024. Lecture Notes in Computer Science, vol 15272. Springer, Cham. https://doi.org/10.1007/978-3-031-74741-0_4
DOI: https://doi.org/10.1007/978-3-031-74741-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-74740-3
Online ISBN: 978-3-031-74741-0
eBook Packages: Computer Science (R0)