
Successive Future Image Generation of a Walking Pedestrian Using Generative Adversarial Networks

The Review of Socionetwork Strategies

Abstract

This research focuses on an algorithm for generating successive future images of a walking pedestrian from successive past images. The proposed algorithm is based on generative adversarial networks (GANs), in which both the generative network and the discriminative network are defined as convolutional neural networks (CNNs). The algorithm takes successive images as input data and generates their future images as output. It is compared with other algorithms, Optical Flow and Long Short-Term Memory (LSTM), for different numbers of input and output images. The results show that the proposed algorithm is more accurate than LSTM in all cases, and more accurate than Optical Flow when the numbers of input and output images are large.
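To make the setup concrete, the following is a minimal sketch of such a conditional GAN in PyTorch. This is an assumption-laden illustration, not the authors' architecture: the paper does not specify a framework, and the layer sizes, the frame counts N_PAST and N_FUTURE, the image resolution, and the choice to condition the discriminator on the past frames are all hypothetical.

```python
# Minimal sketch (assumed PyTorch; all layer sizes, frame counts, and names
# are illustrative, not from the paper). A CNN generator maps a stack of
# past frames to a stack of future frames; a CNN discriminator judges
# (past, future) stacks as real or generated, as in the abstract's setup.
import torch
import torch.nn as nn

N_PAST, N_FUTURE = 4, 4  # hypothetical numbers of input/output frames

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Past frames stacked on the channel axis (grayscale assumed).
        self.net = nn.Sequential(
            nn.Conv2d(N_PAST, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, N_FUTURE, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, past):   # past: (B, N_PAST, H, W)
        return self.net(past)  # -> (B, N_FUTURE, H, W)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Scores a (past + future) frame stack: real (1) vs. generated (0).
        self.net = nn.Sequential(
            nn.Conv2d(N_PAST + N_FUTURE, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1),
        )

    def forward(self, past, future):
        return self.net(torch.cat([past, future], dim=1))  # (B, 1) logits

# One adversarial training step on a dummy batch.
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

past = torch.rand(8, N_PAST, 64, 64)
future_real = torch.rand(8, N_FUTURE, 64, 64)

# Discriminator step: real stacks -> 1, generated stacks -> 0.
future_fake = G(past).detach()
loss_d = (bce(D(past, future_real), torch.ones(8, 1))
          + bce(D(past, future_fake), torch.zeros(8, 1)))
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# Generator step: try to make generated futures score as real.
loss_g = bce(D(past, G(past)), torch.ones(8, 1))
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```

In a sketch like this, the number of past frames stacked into the generator's input channels and the number of future frames it emits correspond to the "different numbers of input and output images" varied in the paper's comparison.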





Author information

Correspondence to Eisuke Kita.



About this article


Cite this article

He, B., Kita, E. Successive Future Image Generation of a Walking Pedestrian Using Generative Adversarial Networks. Rev Socionetwork Strat 15, 309–325 (2021). https://doi.org/10.1007/s12626-021-00085-6

