
Event-Based Visual Sensing for Human Motion Detection and Classification at Various Distances

  • Conference paper
Image and Video Technology (PSIVT 2022)

Abstract

In Human Search and Rescue scenarios, it is useful to be able to distinguish persons in distress from rescuers. Assuming that people requiring help wave to attract attention, human motion is a significant cue for identifying persons in need. In this paper, we therefore aim to detect and classify human motion at different depths and at low resolution. The task is accomplished with an event-based sensor and a Spiking Neural Network (SNN). The event-based sensor was chosen as a device well suited to registering motion specifically. The SNN is not only appropriate for processing event-based data, but is also a suitable algorithm for implementation on low-power neuromorphic devices, allowing for a longer operating time. In this study, we gather new data with classes similar to those of the IBM DVS Gesture dataset, recorded at various distances. We show that we can achieve an accuracy of up to 91.5% on a validation set recorded at different depths and lighting conditions from the training set. We also show that Region of Interest detection leads to better accuracy than a full-frame model at untrained distances.
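The abstract only summarizes the pipeline (event-based sensing, Region of Interest detection, SNN classification); the details are in the paper itself. As a rough, hedged illustration of the kind of ROI extraction the abstract refers to, the sketch below accumulates a stream of DVS-style events into a per-pixel count frame and bounds the densest activity. The function names, thresholds, and synthetic data are illustrative assumptions, not the authors' method.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate (x, y) event coordinates into a per-pixel count frame."""
    frame = np.zeros((height, width), dtype=np.int32)
    # np.add.at handles repeated indices, so several events landing on
    # the same pixel are all counted.
    np.add.at(frame, (events[:, 1], events[:, 0]), 1)
    return frame

def extract_roi(frame, min_events=3):
    """Bounding box (y0, y1, x0, x1) of pixels with >= min_events events."""
    ys, xs = np.nonzero(frame >= min_events)
    if xs.size == 0:
        return None  # no sufficiently active region
    return (ys.min(), ys.max() + 1, xs.min(), xs.max() + 1)

# Synthetic stand-in for a DVS event burst: a dense cluster of events
# (e.g. a waving arm) plus sparse sensor noise over a 64x64 array.
rng = np.random.default_rng(0)
H = W = 64
arm = np.stack([rng.integers(40, 50, 400),
                rng.integers(10, 20, 400)], axis=1)   # (x, y) pairs
noise = np.stack([rng.integers(0, W, 20),
                  rng.integers(0, H, 20)], axis=1)
events = np.concatenate([arm, noise])

frame = events_to_frame(events, H, W)
roi = extract_roi(frame)  # bounds the dense cluster, ignores isolated noise
```

In a full pipeline, the crop of `frame` at `roi` would then be normalized and passed to the classifier; in the paper that classifier is an SNN, which this sketch does not reproduce.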

This research was supported by Programmatic grant no. A1687b0033 from the Singapore government's Research, Innovation and Enterprise 2020 plan (Advanced Manufacturing and Engineering domain).



Acknowledgment

The authors would like to thank Austin Lai Weng Mun for his help in the dataset collection.

Author information

Corresponding author

Correspondence to Fabien Colonnier.


Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Colonnier, F., Seeralan, A., Zhu, L. (2023). Event-Based Visual Sensing for Human Motion Detection and Classification at Various Distances. In: Wang, H., et al. Image and Video Technology. PSIVT 2022. Lecture Notes in Computer Science, vol 13763. Springer, Cham. https://doi.org/10.1007/978-3-031-26431-3_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-26431-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26430-6

  • Online ISBN: 978-3-031-26431-3

  • eBook Packages: Computer Science, Computer Science (R0)
