Strategies for supplementing recurrent neural network training for spatio-temporal prediction

Mark Schutera; Stefan Elser; Jochen Abhau; Ralf Mikut; Markus Reischl

doi:10.1515/auto-2018-0124

Published by De Gruyter (O) on July 6, 2019

Strategien zur Unterstützung des Trainings von Rekurrenten Neuronalen Netzen zur räumlich-zeitlichen Vorhersage

From the journal at - Automatisierungstechnik

Abstract

In autonomous driving, prediction tasks address complex spatio-temporal data. This article examines Recurrent Neural Networks (RNNs) for object trajectory prediction in image space. The proposed methods enhance the performance and spatio-temporal prediction capabilities of RNNs. For this purpose, two data augmentation strategies and a hyperparameter search are implemented. A conventional data augmentation strategy and a Generative Adversarial Network (GAN) based strategy are analyzed with respect to their ability to close the generalization gap of RNNs. The results are discussed using single-object tracklets from the KITTI Tracking Dataset. This work demonstrates the benefits of augmenting spatio-temporal data with GANs.

Summary (translated from the German Zusammenfassung)

Autonomous driving requires predictions from complex spatio-temporal data. This article describes the investigation of Recurrent Neural Networks (RNNs) for predicting object trajectories in image space. The proposed methods improve the spatio-temporal prediction capability of Recurrent Neural Networks. To this end, two different data augmentation strategies and a hyperparameter search are implemented. A conventional data augmentation and a Generative Adversarial Network (GAN) are analyzed for their ability to close the generalization gap of Recurrent Neural Networks. The results are discussed using single-object trajectories from the KITTI Tracking dataset. This work shows the benefits of augmenting spatio-temporal data with GANs.

Keywords: Generative Adversarial Networks; data augmentation; Recurrent Neural Networks; generalization; trajectory prediction

About the authors

Mark Schutera

M. Sc. Mark Schutera is a doctoral student in the field of deep learning for autonomous driving in the “Algorithms and Machine Learning Perception System” group at ZF Friedrichshafen AG and at the Institute for Automation and Applied Computer Science at the Karlsruhe Institute of Technology. Research Interests: Deep learning, autonomous driving, computer vision, data analytics and image processing

Stefan Elser

Prof. Dr. rer. nat. Stefan Elser works as a professor for autonomous driving at the Hochschule Ravensburg-Weingarten. Research Interests: Machine learning, object detection, sensor fusion and their applications in autonomous driving

Jochen Abhau

Dr. rer. nat. Jochen Abhau is team leader of the “Algorithms and Machine Learning Perception System” group at ZF Friedrichshafen. Research Interests: Machine learning, image processing, data analytics, deep learning and autonomous driving

Ralf Mikut

Apl. Prof. Dr.-Ing. Ralf Mikut is head of the research area “Automated Image and Data Analysis” of the Institute for Automation and Applied Computer Science at the Karlsruhe Institute of Technology. Research Interests: Machine learning, image processing, data analytics, computational intelligence, various applications in engineering and life sciences

Markus Reischl

PD Dr.-Ing. Markus Reischl is head of the research group “Machine Learning for High-Throughput and Mechatronics” of the Institute for Automation and Applied Computer Science at the Karlsruhe Institute of Technology. Research Interests: Man-machine interfaces, image processing, machine learning, data analytics

Acknowledgment

With thanks to Katherine Quinlan-Flatter for proofreading the article.

Received: 2018-10-12

Accepted: 2019-05-16

Published Online: 2019-07-06

Published in Print: 2019-07-26
