Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes

Published in the International Journal of Computer Vision.

Abstract

Long scenes can be imaged by mosaicing multiple images from cameras scanning the scene. We address the case of a video camera scanning a scene while moving along a long path, e.g. scanning a city street from a driving car or scanning terrain from a low-flying aircraft.

A robust approach to this task is presented and applied successfully to sequences of thousands of frames, even when captured with a hand-held camera. Examples are given for several challenging sequences. The proposed system consists of two components: (i) motion and depth computation, and (ii) mosaic rendering.

In the first part, a “direct” method is presented for computing motion and dense depth. The robustness of motion computation is increased by restricting the motion model of the scanning camera. An iterative graph-cuts approach, with planar labels and a flexible similarity measure, allows the computation of a dense depth map for the entire sequence.
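The flavour of “direct” (gradient-based) estimation can be illustrated with a minimal sketch: a single horizontal-translation parameter is fitted by iterated least squares on the linearised brightness-constancy equation. This is only a toy illustration under strong assumptions (pure horizontal shift, no pyramid), not the paper's restricted motion model; the function name and interface are hypothetical.

```python
import numpy as np

def estimate_shift(I0, I1, iters=20):
    """Toy 'direct' alignment: estimate one horizontal shift d such that
    I1(x + d) ~= I0(x), by iterated least squares on the linearised
    brightness-constancy equation  Ix * d + It = 0."""
    xs = np.arange(I0.shape[1], dtype=float)
    Ix = np.gradient(I0.astype(float), axis=1)   # spatial gradient of the template
    denom = np.sum(Ix * Ix)
    d = 0.0
    for _ in range(iters):
        if denom < 1e-12:                        # no texture: shift unobservable
            break
        # warp I1 by the current estimate (per-row linear interpolation)
        warped = np.stack([np.interp(xs + d, xs, row) for row in I1])
        It = warped - I0                         # residual after warping
        d -= np.sum(Ix * It) / denom             # Gauss-Newton update
    return d
```

A real system would run this coarse-to-fine in an image pyramid and with a richer motion model, as in the hierarchical direct methods the paper builds on.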

In the second part, a new minimal aspect distortion (MAD) mosaicing method uses the computed depth to minimize the geometrical distortions of long panoramic images. In addition to MAD mosaicing, interactive visualization using X-Slits is also demonstrated.
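As background, the simplest strip-based panorama underlying X-Slits and MAD mosaicing can be sketched as pasting a fixed-width vertical strip from the centre of each frame side by side; the actual methods vary the sampled slit and strip geometry with camera motion and scene depth. The helper below is an illustrative assumption, not the paper's implementation.

```python
import numpy as np

def strip_mosaic(frames, strip_width):
    """Basic pushbroom construction: concatenate a fixed-width vertical
    strip taken from the centre column of each frame."""
    strips = []
    for f in frames:
        c = f.shape[1] // 2                      # centre column of this frame
        lo = c - strip_width // 2
        strips.append(f[:, lo : lo + strip_width])
    return np.concatenate(strips, axis=1)
```

For a camera translating sideways by exactly one strip width per frame, the strips tile the scene seamlessly; with real motion and varying depth the strip width must adapt, which is where depth-aware mosaicing comes in.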




Additional information

This research was supported by the Israel Science Foundation. Video examples and high-resolution images can be viewed at http://www.vision.huji.ac.il/mad/.

Cite this article

Rav-Acha, A., Engel, G. & Peleg, S. Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes. Int J Comput Vis 78, 187–206 (2008). https://doi.org/10.1007/s11263-007-0101-9
