GOHAG: GANs Orchestration for Human Actions Generation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12033)

Abstract

Generative Adversarial Networks (GANs) have contributed greatly to the development of content creation technologies. Video generation holds an important place in this advancement, driven by the demand for human animation applications and automatic trailer or movie generation. Taking advantage of various GANs, we propose our own method for human movement video generation, GOHAG: GANs Orchestration for Human Actions Generation. GOHAG is an orchestra of three GANs: the Poses Generation GAN (PGAN) creates a sequence of poses, the Poses Optimization GAN (POGAN) optimizes them, and the Frames Generation GAN (FGAN) attaches texture to the sequence, creating a video. The proposed method generates smooth, plausible, high-quality video and shows potential among modern techniques.
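The three-stage orchestration described in the abstract can be illustrated with a minimal Python sketch. It shows only the data flow (poses, then optimized poses, then frames) and is not the authors' implementation: the three functions are hypothetical stand-ins for trained generators, the smoothing and rasterisation steps are simple placeholders chosen for illustration, and every name, default, and shape here (`pgan_stub`, `pogan_stub`, `fgan_stub`, `generate_video`, the 8x8 frame size, the joint count) is an assumption not taken from the paper.

```python
import random
from typing import List, Tuple

Pose = List[Tuple[float, float]]   # 2D keypoints of one frame, coordinates in [0, 1)
Frame = List[List[float]]          # one grayscale image as nested lists


def pgan_stub(num_frames: int, num_joints: int, rng: random.Random) -> List[Pose]:
    """Hypothetical stand-in for PGAN: sample a raw pose sequence from noise."""
    return [
        [(rng.random(), rng.random()) for _ in range(num_joints)]
        for _ in range(num_frames)
    ]


def pogan_stub(poses: List[Pose]) -> List[Pose]:
    """Hypothetical stand-in for POGAN: a moving average over neighbouring
    frames (placeholder only; the paper uses a GAN for this step)."""
    smoothed = []
    for t in range(len(poses)):
        window = poses[max(0, t - 1): t + 2]  # previous, current, next frame
        smoothed.append([
            (
                sum(p[j][0] for p in window) / len(window),
                sum(p[j][1] for p in window) / len(window),
            )
            for j in range(len(poses[t]))
        ])
    return smoothed


def fgan_stub(pose: Pose, size: int = 8) -> Frame:
    """Hypothetical stand-in for FGAN: mark each joint on a blank frame
    (placeholder for texture generation)."""
    frame = [[0.0] * size for _ in range(size)]
    for x, y in pose:
        row = min(int(y * size), size - 1)
        col = min(int(x * size), size - 1)
        frame[row][col] = 1.0
    return frame


def generate_video(num_frames: int = 16, num_joints: int = 10, seed: int = 0) -> List[Frame]:
    """Chain the three stages in the order given in the abstract."""
    rng = random.Random(seed)
    raw_poses = pgan_stub(num_frames, num_joints, rng)   # stage 1: pose sequence
    refined_poses = pogan_stub(raw_poses)                # stage 2: pose optimization
    return [fgan_stub(pose) for pose in refined_poses]   # stage 3: frames


if __name__ == "__main__":
    video = generate_video(num_frames=4, num_joints=5, seed=1)
    print(len(video), "frames of size", len(video[0]), "x", len(video[0][0]))
```

Keeping each stage behind its own function mirrors the separation the abstract describes: a stage could in principle be replaced by a trained generator without changing the other two, provided it keeps the same input and output shapes.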

Acknowledgements

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-0-01642) supervised by the IITP (Institute for Information & communications Technology Promotion).

Author information

Corresponding author

Correspondence to Aziz Siyaev.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Siyaev, A., Jo, GS. (2020). GOHAG: GANs Orchestration for Human Actions Generation. In: Nguyen, N., Jearanaitanakij, K., Selamat, A., Trawiński, B., Chittayasothorn, S. (eds) Intelligent Information and Database Systems. ACIIDS 2020. Lecture Notes in Computer Science, vol 12033. Springer, Cham. https://doi.org/10.1007/978-3-030-41964-6_26

  • DOI: https://doi.org/10.1007/978-3-030-41964-6_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41963-9

  • Online ISBN: 978-3-030-41964-6

  • eBook Packages: Computer Science, Computer Science (R0)
