A Global-Matching Framework for Multi-View Stereopsis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11678)

Abstract

As deep neural networks have demonstrated success on a variety of computer vision problems, a number of approaches have been proposed for applying them to multi-view stereopsis. Most of these approaches train networks on small cropped image patches so that the demands on GPU processing power and memory remain manageable. The limitation of such approaches, however, is that the networks cannot effectively learn global information and hence have trouble handling large textureless regions. In addition, when tested on different datasets, these networks often need to be retrained to achieve optimal performance. To address these limitations, we present a robust framework that is trained directly on high-resolution (\(1280 \times 1664\)) stereo images. It is therefore capable of learning global information and enforcing smoothness constraints across the whole image. To reduce the memory requirement, the network is trained to output the matching scores of the pixels under a single depth hypothesis at a time. A novel loss function is designed to properly handle the unbalanced distribution of matching scores. Finally, we show that the network, trained on binocular stereo datasets only, can directly handle the DTU multi-view stereo dataset and generate results comparable to existing state-of-the-art approaches.
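The memory argument in the abstract, scoring one depth hypothesis at a time instead of holding a full cost volume, can be illustrated with a small sketch. It is not the authors' implementation. The function `toy_matching_score` is a hypothetical stand-in for the trained network (it uses a negative absolute intensity difference on a rectified binocular pair). The winner-take-all aggregation in `sweep_hypotheses` is also an assumption, since the abstract does not say how per-hypothesis scores are combined.

```python
import numpy as np


def toy_matching_score(left, right, disparity):
    """Hypothetical stand-in for the network: one score map for ONE hypothesis.

    Higher is better. Pixels with no counterpart in the right image get -inf.
    """
    h, w = left.shape
    score = np.full((h, w), -np.inf, dtype=np.float64)
    if disparity < w:
        # Left pixel (y, x) is compared with right pixel (y, x - disparity).
        score[:, disparity:] = -np.abs(
            left[:, disparity:] - right[:, : w - disparity]
        )
    return score


def sweep_hypotheses(left, right, hypotheses, score_fn=toy_matching_score):
    """Evaluate hypotheses one by one, keeping only the running best per pixel.

    Peak memory is a few image-sized arrays, independent of the number of
    hypotheses (assumed winner-take-all aggregation).
    """
    best_score = np.full(left.shape, -np.inf, dtype=np.float64)
    best_hyp = np.zeros(left.shape, dtype=np.int64)
    for d in hypotheses:
        score = score_fn(left, right, d)
        better = score > best_score
        best_score[better] = score[better]
        best_hyp[better] = d
    return best_hyp, best_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    right_img = rng.random((8, 32))
    left_img = np.roll(right_img, 3, axis=1)  # synthetic shift of 3 pixels
    disparity_map, _ = sweep_hypotheses(left_img, right_img, range(8))
    print(disparity_map[:, 3:])
```

Only the running best score and its hypothesis index are kept per pixel, so memory stays at image size however many hypotheses are swept. A multi-view version would replace the horizontal shift with a per-depth warp between views, which this sketch does not attempt.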

Author information

Corresponding author

Correspondence to Minglun Gong.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Mao, W., Gong, M., Huang, X., Cai, H., Yi, Z. (2019). A Global-Matching Framework for Multi-View Stereopsis. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_52

  • DOI: https://doi.org/10.1007/978-3-030-29888-3_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29887-6

  • Online ISBN: 978-3-030-29888-3

  • eBook Packages: Computer Science, Computer Science (R0)
