Deeper Spatial Pyramid Network with Refined Up-Sampling for Optical Flow Estimation

Sun, Zefeng; Wang, Hanli

doi:10.1007/978-3-030-00776-8_45

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11164)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

Convolutional neural networks (CNNs) have been successfully applied to optical flow estimation and have outperformed a number of variational approaches. The spatial pyramid network (SPyNet) is one such CNN-based approach that estimates optical flow efficiently. In this paper, a deeper spatial pyramid network (DSPyNet) is proposed based on SPyNet. DSPyNet reuses the network architecture of SPyNet and refines it at each pyramid level by convolutional factorization; an inception module and a \(1\times 1\) convolutional operation are added to enhance visual representation. Moreover, because bilinear interpolation degrades the quality of the up-sampled flow field due to its low-pass filtering property, it is replaced with small-kernel convolutional operations, as in CNN-based image super-resolution. The proposed DSPyNet is evaluated on several optical flow estimation benchmark datasets, and the experimental results verify its effectiveness.
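Two ideas in the abstract can be illustrated with a short Python/NumPy sketch. It is not the authors' implementation: the abstract does not give layer widths, kernel sizes, or the exact up-sampling design, so the 7x7 kernel, the 32-channel width, and the helper names `conv_weight_count` and `pixel_shuffle` are assumptions chosen for illustration. The first helper shows why factorizing one large kernel into a stack of small ones saves weights; the second shows the channel-to-space rearrangement used by sub-pixel convolution in CNN super-resolution, which is one plausible way to up-sample a flow field with small-kernel convolutions instead of bilinear interpolation.

```python
import numpy as np


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Number of weights (biases excluded) in one square 2-D convolution."""
    return kernel_size * kernel_size * in_channels * out_channels


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Output element [c, h*r + i, w*r + j] is input element [c*r*r + i*r + j, h, w],
    so each group of r*r channels fills one r-by-r block of output pixels.
    """
    c_r2, h, w = x.shape
    if c_r2 % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)     # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)  # merge into the up-sampled grid


# Factorization arithmetic (assumed sizes): one 7x7 layer versus three stacked
# 3x3 layers, which cover the same 7x7 receptive field (3 -> 5 -> 7).
single_7x7 = conv_weight_count(7, 32, 32)         # 50176 weights
stacked_3x3 = 3 * conv_weight_count(3, 32, 32)    # 27648 weights
print(single_7x7, stacked_3x3)

# Up-sampling sketch: in a real network a small-kernel convolution would produce
# 2 * r * r channels from the coarse flow; random values stand in for its output.
rng = np.random.default_rng(0)
coarse_features = rng.standard_normal((2 * 2 * 2, 4, 4))  # (C*r*r, H, W), r = 2
fine_flow = pixel_shuffle(coarse_features, r=2)
print(fine_flow.shape)  # (2, 8, 8): two flow components at twice the resolution
```

Unlike bilinear interpolation, which applies a fixed low-pass kernel, the convolution feeding the rearrangement has learnable weights, so it can in principle preserve sharper motion boundaries. When a flow field is up-sampled by a factor of 2, its displacement values must also be rescaled to the finer grid; a learned layer can absorb that scaling, whereas a fixed interpolator needs it applied explicitly.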

This work was supported in part by National Natural Science Foundation of China under Grants 61622115 and 61472281, Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. GZ2015005), Shanghai Engineering Research Center of Industrial Vision Perception & Intelligent Computing (17DZ2251600), and IBM Shared University Research Awards Program.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tongji University, Shanghai, 201804, People’s Republic of China
Zefeng Sun & Hanli Wang
Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai, 200092, People’s Republic of China
Zefeng Sun & Hanli Wang

Corresponding author

Correspondence to Hanli Wang.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo

Cite this paper

Sun, Z., Wang, H. (2018). Deeper Spatial Pyramid Network with Refined Up-Sampling for Optical Flow Estimation. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_45

DOI: https://doi.org/10.1007/978-3-030-00776-8_45
Published: 19 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00775-1
Online ISBN: 978-3-030-00776-8
eBook Packages: Computer Science, Computer Science (R0)
