Automatic detection of indoor occupancy based on improved YOLOv5 model

Wang, Chao; Zhang, Yunchu; Zhou, Yanfei; Sun, Shaohan; Zhang, Hanyuan; Wang, Yepeng

doi:10.1007/s00521-022-07730-3

Original Article
Published: 02 September 2022

Neural Computing and Applications, volume 35, pages 2575–2599 (2023)

Chao Wang¹,
Yunchu Zhang¹,
Yanfei Zhou¹,
Shaohan Sun¹,
Hanyuan Zhang¹ &
Yepeng Wang²,³

4549 Accesses
19 Citations
1 Altmetric

Abstract

Indoor occupancy detection is essential for energy-efficiency control and for Coronavirus Disease 2019 (COVID-19) traceability. Analysis of classroom surveillance video can accurately determine the number and location of people, and this information can be used to manage environmental equipment such as HVAC and lighting systems to reduce energy use. However, the mainstream one-stage YOLO algorithm still relies on an anchor-based mechanism and a coupled detection head for prediction, which results in slow model convergence and poor detection performance for densely occluded targets. To tackle these problems, this paper proposes DFV-YOLOv5, a novel decoupled, anchor-free convolutional network with VariFocal loss for occupancy detection. The proposed method uses YOLOv5 as its baseline. It adopts an anchor-free mechanism to reduce the number of design parameters that need heuristic tuning. The detection head is then decoupled, which resolves the conflict between the classification and regression tasks, speeds up convergence, and improves detection performance. In addition, the VariFocal loss assigns more weight to difficult samples to mitigate class imbalance and uses the training target q to weight positive samples, so that positive and negative samples are treated asymmetrically. The total loss function is redesigned with an added $L_{1}$ loss term, and an ablation experiment verifies the effect of the improved loss. A hybrid of the sigmoid linear unit (SiLU) and rectified linear unit (ReLU) activation functions improves the model’s nonlinear representation and reduces its inference time. Finally, a classroom dataset is constructed to validate the occupancy detection performance of the model.
The proposed model is compared with mainstream object detection models in terms of mean average precision, memory allocation, execution time, and number of parameters on the VOC2012, CrowdHuman, and self-built datasets. The experimental results show that, compared with mainstream object detection models and related variants, the method significantly improves detection accuracy and robustness and shortens inference time, demonstrating the practicality of the algorithm for occupancy detection.
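
The asymmetric treatment of positive and negative samples described in the abstract can be illustrated with a small sketch. The full text is not available in this preview, so the code below follows the varifocal loss as defined in the VarifocalNet work rather than the exact formulation used in DFV-YOLOv5; the function name `varifocal_loss` and the values `alpha=0.75` and `gamma=2.0` are illustrative assumptions.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-12):
    """Varifocal loss for a single prediction (illustrative sketch).

    p: predicted IoU-aware classification score, in (0, 1).
    q: training target; the IoU with the ground-truth box for a
       positive sample, and 0 for a negative sample.
    alpha, gamma: assumed hyperparameters, not taken from this article.
    """
    # Clamp p so that the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    if q > 0:
        # Positive sample: binary cross-entropy weighted by the target q,
        # so high-IoU positives contribute more to the loss.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negative sample: down-weighted by alpha * p**gamma, so easy
    # negatives (small p) contribute very little.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


if __name__ == "__main__":
    print(varifocal_loss(0.9, 0.0))  # hard negative
    print(varifocal_loss(0.1, 0.0))  # easy negative
    print(varifocal_loss(0.8, 0.8))  # positive, score matches target
```

Only the negative branch is scaled by `p**gamma`; the positive branch is weighted by `q` instead, which is what makes the loss asymmetric.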

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Funding

This study was partly supported by the National Natural Science Foundation of China (62003191) and the Natural Science Foundation of Shandong Province (ZR2020QF072).

Author information

Chao Wang, Yanfei Zhou, Shaohan Sun, Hanyuan Zhang and Yepeng Wang have contributed equally to this work.

Authors and Affiliations

Shandong Key Laboratory of Intelligent Buildings Technology, School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan, 250101, China
Chao Wang, Yunchu Zhang, Yanfei Zhou, Shaohan Sun & Hanyuan Zhang
Shandong Transportation Planning and Design Institute Group Co. Ltd, Jinan, 250098, China
Yepeng Wang
Beijing Institute of Technology, Beijing, 100081, China
Yepeng Wang

Contributions

All authors contributed equally.

Corresponding author

Correspondence to Yunchu Zhang.

Ethics declarations

Conflict of interest

None.

Ethics approval

Confirm.

Consent to participate

Confirm.

Consent for publication

Confirm.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, C., Zhang, Y., Zhou, Y. et al. Automatic detection of indoor occupancy based on improved YOLOv5 model. Neural Comput & Applic 35, 2575–2599 (2023). https://doi.org/10.1007/s00521-022-07730-3

Received: 16 February 2022
Accepted: 12 August 2022
Published: 02 September 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s00521-022-07730-3