An IoT-enabled real-time overhead view person detection system based on Cascade-RCNN and transfer learning

Ahmad, Misbah; Ahmed, Imran; Jeon, Gwanggil

doi:10.1007/s11554-021-01103-0

Special Issue paper
Published: 12 April 2021

Journal of Real-Time Image Processing, Volume 18, pages 1129–1139 (2021)

Misbah Ahmad¹,
Imran Ahmed¹ &
Gwanggil Jeon²

Abstract

The Internet of Things (IoT) is transforming technological evolution in several practical applications. These applications range from smart cities and smart healthcare to intelligent video surveillance, where the primary interest is person monitoring and detection. The amalgamation of Artificial Intelligence (AI) and IoT-based techniques maintains a balance between computational cost and efficiency that is essential for next-generation IoT networks. In this context, a real-time IoT-enabled people detection system is introduced. The developed system performs the image processing task over the cloud using an internet connection, thus reducing the computational cost of processing high-resolution images. For person detection, a pre-trained Cascade RCNN, a deep learning approach, is used. It is an object detection architecture that seeks to address the degradation in detection performance that occurs as Intersection over Union (IoU) thresholds increase. Because the architecture is pre-trained on the COCO data set, and the appearance of the human body from an overhead perspective is significantly different, additional training is performed to enhance the detection results. Taking advantage of transfer learning, the architecture is trained on overhead person images, and the newly trained feature layer is added to the existing architecture. Experimental outcomes reveal that additional training improves the detection architecture's performance, yielding an accuracy rate of 0.96.
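The IoU thresholds mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the box format `(x1, y1, x2, y2)`, the sample boxes, and the threshold values 0.5, 0.6 and 0.7 are all assumptions chosen for illustration. The sketch computes IoU for axis-aligned boxes and counts how many candidate boxes still qualify as positives as the threshold rises, which is the effect a cascade of detection stages is meant to cope with.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positives_per_threshold(proposals, ground_truth, thresholds=(0.5, 0.6, 0.7)):
    """Count proposals whose IoU with the ground-truth box meets each threshold."""
    return {
        t: sum(1 for p in proposals if iou(p, ground_truth) >= t)
        for t in thresholds
    }


if __name__ == "__main__":
    # Hypothetical overhead-view person box and progressively shifted proposals.
    ground_truth = (0, 0, 10, 10)
    proposals = [
        (0, 0, 10, 10),
        (1, 0, 11, 10),
        (2, 0, 12, 10),
        (3, 0, 13, 10),
        (4, 0, 14, 10),
    ]
    for threshold, count in positives_per_threshold(proposals, ground_truth).items():
        print(f"IoU >= {threshold}: {count} positive proposals")
```

With these sample boxes, fewer proposals count as positives at each stricter threshold, so a detector trained only at a high threshold sees few positive examples.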


Acknowledgements

This work was supported under the framework of the international cooperation program managed by the National Research Foundation of Korea (2019K1A3A1A8011295711).

Author information

Authors and Affiliations

Centre for Excellence in Information Technology, IMSciences Peshawar, 1-A, Sector E-5, Phase VII, Hayatabad, Peshawar, Pakistan
Misbah Ahmad & Imran Ahmed
Department of Embedded Systems Engineering, Incheon National University, Incheon, Korea
Gwanggil Jeon

Corresponding author

Correspondence to Gwanggil Jeon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ahmad, M., Ahmed, I. & Jeon, G. An IoT-enabled real-time overhead view person detection system based on Cascade-RCNN and transfer learning. J Real-Time Image Proc 18, 1129–1139 (2021). https://doi.org/10.1007/s11554-021-01103-0

Received: 02 January 2021
Accepted: 30 March 2021
Published: 12 April 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s11554-021-01103-0
