Omnidirectional Image Stabilization for Visual Object Recognition

Torii, Akihiko; Havlena, Michal; Pajdla, Tomáš

doi:10.1007/s11263-010-0350-x

Published: 21 May 2010

Volume 91, pages 157–174 (2011)

International Journal of Computer Vision

Akihiko Torii¹,
Michal Havlena¹ &
Tomáš Pajdla¹

Abstract

In this paper, we present a pipeline for camera pose and trajectory estimation, and image stabilization and rectification for dense as well as wide baseline omnidirectional images. The proposed pipeline transforms a set of images taken by a single hand-held camera to a set of stabilized and rectified images augmented by the computed camera 3D trajectory and a reconstruction of feature points facilitating visual object recognition. The paper generalizes previous works on camera trajectory estimation done on perspective images to omnidirectional images and introduces a new technique for omnidirectional image rectification that is suited for recognizing people and cars in images. The performance of the pipeline is demonstrated on real image sequences acquired in urban as well as natural environments.
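
The abstract describes stabilization only at a high level, and the full method is not available in this preview. As a rough illustration of the general idea, and not the authors' actual technique, the Python sketch below resamples an omnidirectional image so that its axes align with a fixed world frame once a camera rotation is known. Everything specific here is an assumption made for the example: the equirectangular image layout, the axis convention (x forward, y left, z up), nearest-neighbour sampling, and the hypothetical helper names `rotate_ray`, `stabilize_equirectangular`, and `yaw_rotation`. The rotation matrix is assumed to come from some pose-estimation step that is not shown.

```python
import math


def rotate_ray(R, ray):
    """Apply a 3x3 rotation matrix (nested lists) to a 3-vector."""
    return [sum(R[i][k] * ray[k] for k in range(3)) for i in range(3)]


def yaw_rotation(theta):
    """Rotation by angle theta (radians) about the vertical z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def stabilize_equirectangular(image, R_cam_from_world):
    """Resample an equirectangular image into a world-aligned frame.

    image: list of rows; columns span longitude [-pi, pi) and rows span
           latitude from +pi/2 (top) to -pi/2 (bottom). Assumed layout.
    R_cam_from_world: 3x3 rotation taking world-frame rays to camera-frame
           rays. Assumed to be supplied by a separate pose estimator.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(h):
        # Latitude of the centre of output row i.
        lat = math.pi / 2.0 - (i + 0.5) / h * math.pi
        row = []
        for j in range(w):
            # Longitude of the centre of output column j.
            lon = (j + 0.5) / w * 2.0 * math.pi - math.pi
            # Unit viewing ray of this output pixel in the world frame.
            ray_world = [math.cos(lat) * math.cos(lon),
                         math.cos(lat) * math.sin(lon),
                         math.sin(lat)]
            # The same ray expressed in the (unstabilized) camera frame.
            x, y, z = rotate_ray(R_cam_from_world, ray_world)
            src_lon = math.atan2(y, x)
            src_lat = math.asin(max(-1.0, min(1.0, z)))
            # Nearest-neighbour lookup in the source image.
            src_j = int(math.floor((src_lon + math.pi) / (2.0 * math.pi) * w)) % w
            src_i = int(math.floor((math.pi / 2.0 - src_lat) / math.pi * h))
            src_i = min(h - 1, max(0, src_i))
            row.append(image[src_i][src_j])
        out.append(row)
    return out
```

With the identity rotation the image is returned unchanged, and a pure yaw rotation reduces to a cyclic shift of the columns, which makes the sketch easy to sanity-check. A practical implementation would more likely vectorize the loops and interpolate between pixels instead of taking the nearest one.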

Author information

Authors and Affiliations

Center for Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Karlovo náměstí 13, 121 35, Prague 2, Czech Republic
Akihiko Torii, Michal Havlena & Tomáš Pajdla

Corresponding author

Correspondence to Akihiko Torii.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

(MPG 4.56 MB)

(MPG 4.58 MB)

About this article

Cite this article

Torii, A., Havlena, M. & Pajdla, T. Omnidirectional Image Stabilization for Visual Object Recognition. Int J Comput Vis 91, 157–174 (2011). https://doi.org/10.1007/s11263-010-0350-x

Received: 04 May 2009
Accepted: 29 April 2010
Published: 21 May 2010
Issue Date: January 2011
DOI: https://doi.org/10.1007/s11263-010-0350-x
