Elsevier

Pattern Recognition Letters

Volume 28, Issue 15, 1 November 2007, Pages 2127-2132

A matter of notation: Several uses of the Kronecker product in 3D computer vision

https://doi.org/10.1016/j.patrec.2007.06.005

Abstract

This work presents a number of cases in Computer Vision where the introduction of the Kronecker product allows more elegant and compact derivations. We hold that a clear notation can illuminate properties and catalyze reasoning. In particular, we introduce the trifocal matrix, which makes it possible to express the trilinear constraints among three views using familiar matrix algebra.

Introduction

The interest in the Kronecker product has grown recently, as witnessed by Van Loan (2000):

“The Kronecker product has a rich and very pleasing algebra that supports a wide range of fast, elegant, and practical algorithms. Several trends in scientific computing suggest that this important matrix operation will have an increasingly greater role to play in the future.”

In Computer Vision (CV), however, the Kronecker product has appeared only sporadically and has not been widely used. One of its first appearances is in (Mendonça, 2001), where it is used mainly to compute derivatives of matrix functions (Magnus and Neudecker, 1999). It was later exploited for the same purpose in (Fusiello et al., 2004). In (Chojnacki et al., 2003; Izquierdo and Guerra, 2003) the Kronecker product arises in the study of the pre-conditioning of the eight-point algorithm (Hartley, 1992). In (Brand, 2005) it is used in the context of non-rigid structure from motion. Despite these sporadic appearances, the Kronecker product has not gained the attention that it probably deserves.

First we will describe the Kronecker product and some related matrix algebra tools. Then we will apply these tools to the derivation of some classical linear algorithms in CV, such as the eight-point algorithm and the Direct Linear Transform (DLT). Then we will re-derive Zhang’s calibration method and Fiore’s algorithm for exterior orientation. In all these cases the use of the Kronecker product and related tools yields a compact derivation, where the matrices never need to be expanded in terms of their entries. This makes it possible to reason about global properties of matrices, such as the rank.

The alternative derivations that we provide ultimately arrive at the same equations as the respective original algorithms. Hence, we refer the reader to the relevant papers for the discussion of their numerical properties.

In the last part we will introduce the trifocal matrix which, thanks to the Kronecker product, makes it possible to express the trilinear constraints among three views with matrix algebra. Avoiding the tensorial notation is a great benefit in teaching, because in a typical CV course tensor algebra is introduced solely to support trifocal geometry, and hence it constitutes a substantial overhead. Moreover, it is fairly unpalatable to students, who, in our experience, are more proficient with the more familiar matrix algebra. All previous attempts to avoid the tensorial notation (Ma et al., 2003; Hartley and Zisserman, 2003) sacrifice compactness, meaning that there is no single algebraic object that “represents” the trilinearity, as our trifocal matrix does.

Section snippets

Some matrix tools

This section develops some matrix tools related to the Kronecker product that will prove useful in the rest of the paper. Further reading on this topic can be found in (Horn and Johnson, 1994; Magnus and Neudecker, 1999).
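As an illustration of the kind of tools involved, the following minimal NumPy sketch checks two standard Kronecker product facts numerically: the vectorization identity vec(AXB) = (Bᵀ ⊗ A) vec(X) and the rank property rank(A ⊗ B) = rank(A) rank(B). Whether these correspond exactly to the paper's numbered equations is an assumption, since the equations are not reproduced in this preview. The helper `vec` and the random test matrices are hypothetical and chosen only for the demonstration.

```python
import numpy as np

def vec(X):
    """Stack the columns of X into one vector (column-major order)."""
    return X.reshape(-1, order="F")

# Random test matrices with compatible shapes (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 2))
B = rng.standard_normal((2, 5))

# Vectorization identity: vec(A X B) = (B^T kron A) vec(X).
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
vec_identity_holds = np.allclose(lhs, rhs)

# Rank property: rank(A kron B) = rank(A) * rank(B).
rank_kron = np.linalg.matrix_rank(np.kron(A, B))
rank_product = np.linalg.matrix_rank(A) * np.linalg.matrix_rank(B)

print(vec_identity_holds, rank_kron, rank_product)
```

Note that the identity relies on column-major stacking, which is why `order="F"` is passed to `reshape`; NumPy's default row-major flattening would pair with a different arrangement of the Kronecker factors.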

The eight-point algorithm

A number of 2D–2D point correspondences $m_\ell^i \leftrightarrow m_r^i$ (in homogeneous coordinates) is given, and we are required to find the fundamental matrix $F$ that links corresponding points in the bilinear form:

$$(m_r^i)^\top F \, m_\ell^i = 0. \qquad (9)$$

The eight-point algorithm (Hartley, 1992) exploits Eq. (9) to linearly compute $F$. Using the Kronecker product and Eq. (5), the derivation of the linear system of equations is particularly easy and elegant, given that one never needs to explode matrices into components
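A minimal sketch of how such a linear system can be assembled with the Kronecker product, assuming the standard identity vec(AXB) = (Bᵀ ⊗ A) vec(X): applied to the bilinear form, each correspondence contributes the row (left point)ᵀ ⊗ (right point)ᵀ, and vec(F) spans the null space of the stacked rows. The function name, the synthetic cameras and the test points are hypothetical. Hartley's normalization and the rank-2 enforcement on F are deliberately omitted to keep the example short.

```python
import numpy as np

def fundamental_from_correspondences(m_left, m_right):
    """Linear estimate of F from n >= 8 correspondences.

    m_left, m_right: 3 x n arrays of homogeneous image points.
    """
    n = m_left.shape[1]
    # Row i is kron(m_left_i, m_right_i), so that row @ vec(F) equals
    # m_right_i^T F m_left_i with vec() taken in column-major order.
    A = np.stack([np.kron(m_left[:, i], m_right[:, i]) for i in range(n)])
    # vec(F) is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3, order="F")

# Synthetic, noise-free data: 12 points seen by two cameras.
rng = np.random.default_rng(1)
M = np.vstack([rng.uniform(-1, 1, (2, 12)),
               rng.uniform(4, 6, (1, 12)),
               np.ones((1, 12))])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])  # rotation about the y axis
t = np.array([[1.0], [0.2], [0.1]])
m_left = np.hstack([np.eye(3), np.zeros((3, 1))]) @ M
m_right = np.hstack([R, t]) @ M

F = fundamental_from_correspondences(m_left, m_right)
# Epipolar residuals m_right_i^T F m_left_i, one per correspondence.
residuals = np.einsum("in,ij,jn->n", m_right, F, m_left)
print(np.max(np.abs(residuals)))
```

With noise-free data the residuals vanish up to rounding; with real measurements the same null vector is the least-squares solution under a unit-norm constraint.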

The direct linear transform algorithm

The Direct Linear Transform (DLT) algorithm (Hartley and Zisserman, 2003) solves – with small variations – two different problems:

  • Camera calibration (or resection);

  • Homography estimation.

In this section the reader will appreciate the use of the Kronecker notation not only for its compactness, but also because of its rank property (Eq. (3)).
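To make the role of the rank property concrete, here is a minimal sketch of homography estimation, written under the assumption that the constraint for each correspondence is expressed as a vanishing cross product between the destination point and the mapped source point. Vectorizing that constraint gives a 3 x 9 block, (source point)ᵀ ⊗ [destination point]ₓ, acting on vec(H). By the rank property, each block has rank 1 · 2 = 2, so every correspondence contributes two independent equations. The names `skew` and `homography_dlt` and the test homography are hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ x equals np.cross(v, x)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def homography_dlt(m_src, m_dst):
    """Estimate H (up to scale) from n >= 4 correspondences (3 x n arrays)."""
    n = m_src.shape[1]
    # Each block is (m_src_i^T kron skew(m_dst_i)), a 3 x 9 matrix of rank 2.
    blocks = [np.kron(m_src[:, i][None, :], skew(m_dst[:, i]))
              for i in range(n)]
    A = np.vstack(blocks)
    # vec(H) is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3, order="F")

# Synthetic, noise-free data generated from a known homography.
rng = np.random.default_rng(2)
H_true = np.array([[1.0, 0.1, 2.0],
                   [-0.2, 1.1, -1.0],
                   [0.001, 0.002, 1.0]])
m_src = np.vstack([rng.uniform(-1, 1, (2, 6)), np.ones((1, 6))])
m_dst = H_true @ m_src

H_est = homography_dlt(m_src, m_dst)
H_est = H_est / H_est[2, 2]  # fix the arbitrary scale for comparison

# Rank of a single 3 x 9 block, predicted by rank(A kron B) = rank(A) rank(B).
block_rank = np.linalg.matrix_rank(
    np.kron(m_src[:, 0][None, :], skew(m_dst[:, 0])))
print(block_rank)
```

Keeping all three rows of each block is redundant but harmless for a least-squares solution via SVD; dropping one row per block would give the more common 2n x 9 system.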

Zhang’s internal calibration

Here we will re-derive the core of Zhang’s calibration algorithm (Zhang, 2000), i.e., the procedure for computing the internal parameters of a camera starting from world-image homographies.

Several images of a known planar pattern are available, and it is assumed that correspondences between image points and 3D points on the planar pattern have been established in each view. We are required to find the camera’s internal parameters matrix K.

It is easy to see that for a camera $P = K[R \mid t]$ the

Exterior orientation

Given a number of 2D–3D point correspondences $m_i \leftrightarrow M_i$ (in homogeneous coordinates) and the intrinsic camera parameters $K$, we are required to find a rotation matrix $R$ and a translation vector $t$ (which specify attitude and position of the camera) such that:

$$K^{-1} m_i \simeq [R \mid t] \, M_i \quad \text{for all } i.$$

The problem can be cast as a camera resection and solved with the DLT algorithm, but the resulting rotation matrix $R$ is not guaranteed to be orthonormal. Hence, Fiore’s algorithm (Fiore, 2001) is to be preferred, which is

The trifocal constraint

We have demonstrated how the Kronecker notation can yield compact and elegant derivations for some Computer Vision algorithms. We shall now demonstrate how the trifocal constraint can be introduced without resorting to trilinear tensors, thanks to the Kronecker product. This is probably the greatest merit of this notation.

Consider a point $M$ in space projecting to $m_1$, $m_2$ and $m_3$ in the three cameras $P_1 = [I \mid 0]$, $P_2 = [A_2 \mid e_{2,1}]$, and $P_3 = [A_3 \mid e_{3,1}]$. Let us write the epipolar line of $m_1$ in the other two views: ζ2m

Conclusions

We have shown some applications of the Kronecker notation to 3D Computer Vision problems. We argued that this compact notation, especially in the case of the trifocal constraint, can be a practical aid for teaching and a fruitful tool for reasoning about the properties of the matrices that are involved.

Acknowledgement

Michela Farenzena read the draft and her comments helped to improve the presentation.

References (20)

  • Brand, M., 2005. A direct method for 3D factorization of nonrigid motion observed in 2D. In: Proc. IEEE Conf. on...
  • Chojnacki, W., et al., 2003. Revisiting Hartley’s normalized eight-point algorithm. IEEE Trans. Pattern Anal. Machine Intell.
  • Fiore, P.D., 2001. Efficient linear solution of exterior orientation. IEEE Trans. Pattern Anal. Machine Intell.
  • Fusiello, A., et al., 2004. Globally convergent autocalibration using interval analysis. IEEE Trans. Pattern Anal. Machine Intell.
  • Hartley, R.I., 1992. Estimation of relative camera position for uncalibrated cameras. In: Proc. European Conf. on...
  • Hartley, R.I., 1995. In defence of the 8-point algorithm. In: Proc. Internat. Conf. on Computer Vision, pp....
  • Hartley, R., 1997. Lines and points in three views and the trifocal tensor. Internat. J. Comput. Vision.
  • Hartley, R., et al., 2003. Multiple View Geometry in Computer Vision.
  • Horn, R., et al., 1994. Topics in Matrix Analysis.
  • Izquierdo, E., et al., 2003. Estimating the essential matrix by efficient linear techniques. IEEE Trans. Circuits Systems Video Technol.
There are more references available in the full text version of this article.
