Uncalibrated multi-view multiple humans association and 3D pose estimation by adversarial learning

Ershadi-Nasab, Sara; Kasaei, Shohreh; Sanaei, Esmaeil

doi:10.1007/s11042-020-09733-5

Published: 15 September 2020

Volume 80, pages 2461–2488, (2021)

Multimedia Tools and Applications

Abstract

Multiple human 3D pose estimation is a useful but challenging task in computer vision applications. The ambiguities in estimating the 2D and 3D poses of multiple persons can be resolved by using multi-view frames, in which body parts that are occluded or self-occluded in one view might be visible in other camera views. However, when the cameras are moving and uncalibrated, associating the body parts of multiple humans across camera views is itself challenging. This paper presents novel methods for multiple human 3D pose estimation and pose association in multi-view camera frames in an uncalibrated camera setup, using an adversarial learning framework. The generator is a 3D pose estimation network that learns a mapping of distance and angular difference matrices between 2D and 3D spaces. The discriminator tries to distinguish the predicted 3D poses from the ground truth, which forces the pose estimator to generate valid 3D poses. To increase the accuracy of the generator network, multi-view frames are used. The estimated 3D poses are associated among multi-view frames by a statistical method. The association and the relative rotation and translation of the cameras with respect to each other are also obtained. This step strengthens the generator network and removes ambiguities in the estimation of occluded or self-occluded body parts. The global 3D poses are the inputs to the discriminator network, with the aim of fooling the discriminator into judging that they come from the ground truth. Experimental results on multi-view, multi-person datasets (such as Campus, Shelf, Utrecht Multi-Person Motion (UMPM), and KTH Football 2) indicate that the proposed method achieves superior performance in comparison with other state-of-the-art methods, while it does not require any calibration information a priori.
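
The abstract does not spell out how the distance matrices are built, so the following is only a minimal sketch of one property that plausibly makes such a representation attractive in an uncalibrated setup: the matrix of pairwise joint distances of a 3D pose does not change when the pose is rotated and translated, that is, when it is expressed in a different camera frame. The function names, the 4-joint pose, and the chosen rotation are all hypothetical and are not taken from the paper.

```python
import math


def distance_matrix(joints):
    """Pairwise Euclidean distances between joints (each joint is a coordinate tuple)."""
    return [[math.dist(a, b) for b in joints] for a in joints]


def rotate_z_and_translate(joints, angle, shift):
    """Rotate 3D joints about the z-axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    tx, ty, tz = shift
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz) for x, y, z in joints]


# Hypothetical 4-joint pose in metres; the values are illustrative only.
pose = [(0.0, 0.0, 1.7), (0.0, 0.0, 1.4), (0.2, 0.0, 1.4), (-0.2, 0.0, 1.4)]

# The same pose seen from another (assumed) camera frame.
moved = rotate_z_and_translate(pose, math.pi / 3, (1.0, -2.0, 0.5))

d_pose = distance_matrix(pose)
d_moved = distance_matrix(moved)

# Largest entry-wise difference between the two distance matrices.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(d_pose, d_moved)
    for a, b in zip(row_a, row_b)
)
print("distance matrices agree:", max_diff < 1e-9)
```

The sketch covers only the 3D distance part; the angular difference matrices and the learned 2D-to-3D mapping mentioned in the abstract are not modelled here.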

Author information

Authors and Affiliations

Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran
Sara Ershadi-Nasab & Esmaeil Sanaei
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Shohreh Kasaei

Corresponding author

Correspondence to Shohreh Kasaei.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ershadi-Nasab, S., Kasaei, S. & Sanaei, E. Uncalibrated multi-view multiple humans association and 3D pose estimation by adversarial learning. Multimed Tools Appl 80, 2461–2488 (2021). https://doi.org/10.1007/s11042-020-09733-5

Received: 07 November 2019
Revised: 10 August 2020
Accepted: 26 August 2020
Published: 15 September 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s11042-020-09733-5
