Abstract
We propose the first network to jointly learn visual motion and confidence from events in spatially local patches. Event-based sensors deliver high-temporal-resolution motion information in a sparse, non-redundant format, creating the potential for low-computation, low-latency motion recognition. Neural networks that extract global motion information, however, are generally computationally expensive. Here, we introduce a novel shallow, compact neural architecture and learning approach that captures reliable visual motion information along with the corresponding confidence of each inference. Our network predicts the visual motion at each spatial location using only local events, and our confidence network then identifies which of these predictions will be accurate. On the task of recovering pan-tilt ego-velocities from events, we show that each individual confident local prediction of our network can be expected to be as accurate as state-of-the-art optimization approaches that utilize the full image. Furthermore, on a publicly available dataset, we find our local predictions generalize to scenes with camera motion and independently moving objects. This makes the output of our network well suited for motion-based tasks, such as the segmentation of independently moving objects. We demonstrate on a publicly available motion segmentation dataset that restricting predictions to confident regions is sufficient to exceed state-of-the-art results.
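The abstract describes a two-part design: a shallow per-patch motion predictor and a confidence network that gates its outputs. As a rough illustration of that idea only, the sketch below pairs a small convolutional trunk over a local event patch with a motion head and a confidence head. The 2-channel event-patch representation, all layer sizes, and the 0.5 threshold are illustrative assumptions, not the authors' architecture.

```python
# A minimal sketch of the abstract's idea, NOT the paper's architecture:
# a shallow network maps a local patch of accumulated events to a 2D motion
# estimate, and a second head scores how trustworthy that estimate is.
import torch
import torch.nn as nn

class LocalMotionNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared shallow trunk over a 2-channel event patch
        # (e.g., per-polarity event counts; the representation is an assumption).
        self.trunk = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.motion = nn.Linear(32, 2)            # local (vx, vy) prediction
        self.confidence = nn.Sequential(          # probability the prediction is accurate
            nn.Linear(32, 1), nn.Sigmoid(),
        )

    def forward(self, patches):
        h = self.trunk(patches)
        return self.motion(h), self.confidence(h)

# Usage: score every local patch, keep only the confident motion estimates.
net = LocalMotionNet()
patches = torch.randn(64, 2, 16, 16)              # 64 local event patches
flow, conf = net(patches)
reliable = flow[conf.squeeze(1) > 0.5]            # threshold is an assumption
```

In the pipeline the abstract describes, only such confident local estimates would feed downstream motion tasks, e.g., segmenting independently moving objects.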
Cite this paper
Kepple, D.R., Lee, D., Prepsius, C., Isler, V., Park, I.M., Lee, D.D. (2020). Jointly Learning Visual Motion and Confidence from Local Patches in Event Cameras. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12351. Springer, Cham. https://doi.org/10.1007/978-3-030-58539-6_30