3D Human Pose Estimation Based on Multi-Input Multi-Output Convolutional Neural Network and Event Cameras: A Proof of Concept on the DHP19 Dataset

Manilii, Alessandro; Lucarelli, Leonardo; Rosati, Riccardo; Romeo, Luca; Mancini, Adriano; Frontoni, Emanuele

doi:10.1007/978-3-030-68763-2_2


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12661)

Included in the following conference series:

International Conference on Pattern Recognition


Abstract

Human Pose Estimation (HPE) is one of the main research themes in computer vision. Despite innovative frame-processing methods, standard frame-based cameras still have several drawbacks, such as data redundancy and a fixed frame rate. Event-based cameras offer higher temporal resolution at lower memory and computational cost while preserving the significant information to be processed, and thus represent a new solution for real-time applications. This paper employs the DHP19 dataset, the first and, to date, the only dataset with HPE data recorded from Dynamic Vision Sensor (DVS) event-based cameras. Starting from the baseline single-input single-output (SISO) Convolutional Neural Network (CNN) model proposed in the literature, a novel multi-input multi-output (MIMO) CNN-based architecture is proposed to model two different single-camera views simultaneously. Experimental results show that the proposed MIMO approach outperforms the standard SISO model in terms of accuracy and training time.
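
To make the contrast between an event stream and fixed-rate frames concrete, the following is a minimal sketch of one common way to prepare event-camera data for a CNN: accumulating a fixed number of events into a 2D count image. The abstract does not describe the preprocessing used in the paper, so the event layout `(x, y, timestamp_us, polarity)`, the function name `accumulate_frames`, and the parameter `events_per_frame` are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical event layout (an assumption): (x, y, timestamp_us, polarity).
Event = Tuple[int, int, int, int]


def accumulate_frames(
    events: Iterable[Event], width: int, height: int, events_per_frame: int
) -> Iterator[List[List[int]]]:
    """Yield one count image per `events_per_frame` consecutive events."""
    frame = [[0] * width for _ in range(height)]
    count = 0
    for x, y, _timestamp, _polarity in events:
        frame[y][x] += 1  # polarity is ignored in this sketch
        count += 1
        if count == events_per_frame:
            yield frame
            frame = [[0] * width for _ in range(height)]
            count = 0
    # A trailing partial frame is dropped so every frame holds the same event count.


# Toy stream: five events on a 4x3 sensor, grouped three events at a time.
toy_events = [(0, 0, 10, 1), (1, 0, 12, 0), (0, 0, 15, 1), (3, 2, 40, 1), (3, 2, 41, 0)]
frames = list(accumulate_frames(toy_events, width=4, height=3, events_per_frame=3))
```

With this grouping, a frame is produced only when enough activity has occurred, so a static scene generates no frames at all, whereas a frame-based camera would keep emitting redundant images at its fixed rate.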

Notes

1. The code to reproduce all results is available at the following link: https://github.com/AlessandroManilii/3D_HumanPoseEstimation_event-based_dataset.

Author information

Authors and Affiliations

Department of Information Engineering (DII), Università Politecnica delle Marche, Via Brecce Bianche, 12, 60131, Ancona, Italy
Alessandro Manilii, Leonardo Lucarelli, Riccardo Rosati, Luca Romeo, Adriano Mancini & Emanuele Frontoni
Computational Statistics and Machine Learning and Cognition, Motion and Neuroscience, Istituto Italiano di Tecnologia, Genova, Italy
Luca Romeo

Corresponding author

Correspondence to Riccardo Rosati.

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, University of Firenze, Firenze, Italy
Alberto Del Bimbo
Dipartimento di Ingegneria “Enzo Ferrari”, Università di Modena e Reggio Emilia, Modena, Italy
Rita Cucchiara
Department of Computer Science, Boston University, Boston, MA, USA
Stan Sclaroff
Dipartimento di Matematica e Informatica, University of Catania, Catania, Italy
Giovanni Maria Farinella
Cloud & AI, JD.COM, Beijing, China
Tao Mei
Dipartimento di Ingegneria dell’Informazione, University of Firenze, Firenze, Italy
Marco Bertini
Computational Sciences Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Tonantzintla, Puebla, Mexico
Hugo Jair Escalante
Dipartimento di Ingegneria “Enzo Ferrari”, Università di Modena e Reggio Emilia, Modena, Italy
Roberto Vezzani

About this paper

Cite this paper

Manilii, A., Lucarelli, L., Rosati, R., Romeo, L., Mancini, A., Frontoni, E. (2021). 3D Human Pose Estimation Based on Multi-Input Multi-Output Convolutional Neural Network and Event Cameras: A Proof of Concept on the DHP19 Dataset. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12661. Springer, Cham. https://doi.org/10.1007/978-3-030-68763-2_2

DOI: https://doi.org/10.1007/978-3-030-68763-2_2
Published: 21 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68762-5
Online ISBN: 978-3-030-68763-2
eBook Packages: Computer Science, Computer Science (R0)