Demisting the Hough Transform for 3D Shape Recognition and Registration

Woodford, Oliver J.; Pham, Minh-Tri; Maki, Atsuto; Perbet, Frank; Stenger, Björn

doi:10.1007/s11263-013-0623-2

Published: 14 April 2013

Volume 106, pages 332–341 (2014)

International Journal of Computer Vision

Abstract

In applying the Hough transform to the problem of 3D shape recognition and registration, we develop two new and powerful improvements to this popular inference method. The first, intrinsic Hough, solves the problem of exponential memory requirements of the standard Hough transform by exploiting the sparsity of the Hough space. The second, minimum-entropy Hough, explains away incorrect votes, substantially reducing the number of modes in the posterior distribution of class and pose, and improving precision. Our experiments demonstrate that these contributions make the Hough transform not only tractable but also highly accurate for our example application. Both contributions can be applied to other tasks that already use the standard Hough transform.

Notes

Specifically we use the Shannon (1948) entropy, $H = E[-\ln p(x)] = -\int p(x)\ln p(x) ~\mathrm d x$.
Strictly speaking, the minimum-entropy Hough transform is not a transform, because the probability of each location in Hough space cannot be computed independently.
A direct similarity is a transformation consisting of a rotation, a translation and a uniform scaling.
The requirement for a scale-independent optimization strategy is a further reason to use the proxy of Eq. (8).
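
As an illustration of the entropy defined in the first note above, the following Python sketch estimates $H$ with a Riemann sum over a density sampled on a uniform grid. The function name `shannon_entropy`, the grid spacing and the two example densities are assumptions made here for illustration; they do not come from the article.

```python
import math


def shannon_entropy(density, dx):
    """Riemann-sum estimate of H = -integral of p(x) ln p(x) dx, in nats.

    `density` holds samples of p(x) on a uniform grid with spacing `dx`.
    Cells with zero density are skipped, following the convention 0 ln 0 = 0.
    """
    return -sum(p * math.log(p) for p in density if p > 0.0) * dx


dx = 0.01  # hypothetical grid spacing

# Uniform density on [0, 2): p(x) = 0.5 everywhere, so H = ln 2.
uniform = [0.5] * 200
h_uniform = shannon_entropy(uniform, dx)

# More concentrated density on [0, 0.5): p(x) = 2 there, so H = -ln 2.
peaked = [2.0] * 50 + [0.0] * 150
h_peaked = shannon_entropy(peaked, dx)

print(h_uniform, h_peaked)
```

In this toy example the more concentrated density has the lower entropy, which is the sense in which an entropy-minimizing criterion favours peaked distributions.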

References

Toshiba CAD model point clouds dataset (2011). http://www.toshiba-europe.com/research/crl/cvg/projects/stereo_points.html.
Allan, M., & Williams, C. K. I. (2009). Object localisation using the generative template of features. Computer Vision and Image Understanding, 113, 824–838.
Ballard, D. H. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(2), 111–122.
Barinova, O., Lempitsky, V., & Kohli, P. (2010). On detection of multiple object instances using Hough transforms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Ben-Tzvi, D., & Sandler, M. B. (1990). A combinatorial Hough transform. Pattern Recognition Letters, 11(3), 167–174.
Besag, J. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B, 48(3), 259–302.
Birchfield, S., & Tomasi, C. (1999). Multiway cut for stereo and motion with slanted surfaces. In: Proceedings of the IEEE International Conference on Computer Vision.
Bober, M., & Kittler, J. (1993). Estimation of complex multimodal motion: An approach based on robust statistics and Hough transform. In: Proceedings of the British Machine Vision Conference.
Cheng, Y. (1995). Mean shift, mode seeking, and clustering. Transactions on Pattern Analysis and Machine Intelligence, 17(8), 790–799.
Delong, A., Osokin, A., Isack, H., & Boykov, Y. (2012). Fast approximate energy minimization with label costs. International Journal of Computer Vision, 96(1), 1–27.
Delong, A., Veksler, O., & Boykov, Y. (2012). Fast fusion moves for multi-model estimation. In: Proceedings of the European Conference on Computer Vision.
Drost, B., Ulrich, M., Navab, N., & Ilic, S. (2010). Model globally, match locally: Efficient and robust 3D object recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 998–1005).
Duda, R. O., & Hart, P. E. (1972). Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15, 11–15.
Falk, H. (1970). Inequalities of J. W. Gibbs. American Journal of Physics, 38(7), 858–869.
Fisher, A., Fisher, R. B., Robertson, C., & Werghi, N. (1998). Finding surface correspondence for object recognition and registration using pairwise geometric histograms. In: Proceedings of the European Conference on Computer Vision (pp. 674–686).
Gall, J., & Lempitsky, V. (2009). Class-specific Hough forests for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1022–1029).
Gerig, G. (1987). Linking image-space and accumulator-space: A new approach for object-recognition. In: Proceedings of the IEEE International Conference on Computer Vision (pp. 112–117).
Hough, P. V. C. (1962). Method and means for recognizing complex patterns. U.S. Patent 3,069,654.
Illingworth, J., & Kittler, J. (1987). The adaptive Hough transform. Transactions on Pattern Analysis and Machine Intelligence, 9(5), 690–698.
Isack, H., & Boykov, Y. (2012). Energy-based geometric multi-model fitting. International Journal of Computer Vision, 97(2), 123–147.
Knopp, J., Prasad, M., Willems, G., Timofte, R., & Van Gool, L. (2010). Hough transform and 3D SURF for robust three dimensional classification. In: Proceedings of the European Conference on Computer Vision (pp. 589–602).
Lamdan, Y., & Wolfson, H. (1988). Geometric hashing: A general and efficient model-based recognition scheme. In: Proceedings of the IEEE International Conference on Computer Vision (pp. 238–249).
Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In: ECCV Workshop on Statistical Learning in Computer Vision.
Leibe, B., Leonardis, A., & Schiele, B. (2008). Robust object detection with interleaved categorization and segmentation. International Journal of Computer Vision, 77(1–3), 259–289.
Li, H., Lavin, M. A., & Le Master, R. J. (1986). Fast Hough transform: A hierarchical approach. Computer Vision, Graphics, and Image Processing, 36(2–3), 139–161.
MacKay, D. J. C. (2009). Information theory, inference and learning algorithms. Cambridge: Cambridge University Press.
Maji, S., & Malik, J. (2009). Object detection using a max-margin Hough transform. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Mian, A., Bennamoun, M., & Owens, R. (2006). Three-dimensional model-based object recognition and segmentation in cluttered scenes. Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1584–1601.
Minka, T. P. (2003). The ‘summation hack’ as an outlier model. Technical note.
Okada, R. (2009). Discriminative generalized Hough transform for object detection. In: Proceedings of the IEEE International Conference on Computer Vision (pp. 2000–2005).
Pham, M. T., Woodford, O. J., Perbet, F., Maki, A., Stenger, B., & Cipolla, R. (2011). A new distance for scale-invariant 3D shape recognition and registration. In: Proceedings of the IEEE International Conference on Computer Vision.
Rosten, E., & Loveland, R. (2009). Camera distortion self-calibration using the plumb-line constraint and minimal Hough entropy. Machine Vision and Applications.
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27, 379–423, 623–656.
Sheikh, Y. A., Khan, E. A., & Kanade, T. (2007). Mode-seeking by medoidshifts. In: Proceedings of the IEEE International Conference on Computer Vision.
Stephens, R. S. (1991). A probabilistic approach to the Hough transform. Image and Vision Computing, 9(1), 66–71.
Toldo, R., & Fusiello, A. (2008). Robust multiple structures estimation with j-linkage. In: Proceedings of the European Conference on Computer Vision.
Tombari, F., & Di Stefano, L. (2010). Object recognition in 3D scenes with occlusions and clutter by Hough voting. In: Proceedings of PSIVT (pp. 349–355).
Vedaldi, A., & Soatto, S. (2008). Quick shift and kernel methods for mode seeking. In: Proceedings of the European Conference on Computer Vision (pp. 705–718).
Vincent, E., & Laganiere, R. (2001). Detecting planar homographies in an image pair. In: Proceedings of the International Symposium on Image and Signal Processing and Analysis (pp. 182–187).
Vogiatzis, G., & Hernández, C. (2011). Video-based, real-time multi view stereo. Image and Vision Computing, 29(7), 434–441.
Woodford, O. J., Pham, M. T., Maki, A., Gherardi, R., Perbet, F., & Stenger, B. (2012). Contraction moves for geometric model fitting. In: Proceedings of the European Conference on Computer Vision.
Xu, L., Oja, E., & Kultanen, P. (1990). A new curve detection method: Randomized Hough transform (RHT). Pattern Recognition Letters, 11(5), 331–338.
Zhang, W., & Kosecká, J. (2007). Nonparametric estimation of multiple structures with outliers. In R. Vidal, A. Heyden, & Y. Ma (Eds.), Dynamical Vision, Lecture Notes in Computer Science, vol. 4358 (pp. 60–74). Heidelberg: Springer.
Zhang, Y., & Chen, T. (2010). Implicit shape kernel for discriminative learning of the Hough transform detector. In: Proceedings of the British Machine Vision Conference.
Zuliani, M., Kenney, C. S., & Manjunath, B. S. (2005). The multiRANSAC algorithm and its application to detect planar homographies. In: Proceedings of the IEEE International Conference on Image Processing.

Acknowledgments

The authors are extremely grateful to Bob Fisher, Andrew Fitzgibbon, Chris Williams, John Illingworth and the anonymous reviewers for providing valuable feedback on this work.

Author information

Authors and Affiliations

Toshiba Research Europe Ltd., 208 Cambridge Science Park, Milton Road, Cambridge, CB4 0GZ, UK
Oliver J. Woodford, Minh-Tri Pham, Atsuto Maki, Frank Perbet & Björn Stenger

Corresponding author

Correspondence to Oliver J. Woodford.

Appendix: Proof of the Integer Nature of Vote Weights

Theorem 1

Given Eq. (3), an integer set of optimal values of $\boldsymbol{\theta}$ exists, i.e. one for which $\theta_{ij}\in\{0,1\}~\forall i,j$.

Proof

Let $\boldsymbol{\theta}^{\prime}$ denote a globally optimal value of $\boldsymbol{\theta}$, i.e. one that minimizes Eq. (3). Let us consider only the vote weights of the $i^{\mathrm{th}}$ feature, and assume the other vote weights are fixed at their optimal values, i.e. $\boldsymbol{\theta}_j = \boldsymbol{\theta}_j^{\prime} ~\forall j\ne i$. The objective function can then be written as

$$\begin{aligned} f(\boldsymbol{\theta}_i)&= -\int_{\mathcal{H}} p(\mathbf{y}|\boldsymbol{\theta}_i) \ln p(\mathbf{y}|\boldsymbol{\theta}_i^{\prime}) ~\mathrm{d}\mathbf{y},\end{aligned}$$

(14)

$$\begin{aligned} p(\mathbf{y}|\boldsymbol{\theta}_i)&= C(\mathbf{y}) + \omega_i\sum_{j=1}^{J_i}\theta_{ij} K(\mathbf{x}_{ij},\mathbf{y}), \end{aligned}$$

(15)

where $C(\mathbf{y})$ is a function which is independent of $\boldsymbol{\theta}_i$. Note that we have further assumed that the instance of $\boldsymbol{\theta}_i$ in the $\ln p(\mathbf{y}|\boldsymbol{\theta}_i)$ term in Eq. (14) is also at its optimal value, $\boldsymbol{\theta}_i^{\prime}$. Substituting Eq. (15) into Eq. (14) then allows us to rewrite the objective function as follows:

$$\begin{aligned} f(\boldsymbol{\theta}_i)&= D - \sum_{j=1}^{J_i}\theta_{ij} a_{ij},\end{aligned}$$

(16)

$$\begin{aligned} a_{ij}&= \int_{\mathcal{H}} \omega_i K(\mathbf{x}_{ij},\mathbf{y}) \ln p(\mathbf{y}|\boldsymbol{\theta}_i^{\prime}) ~\mathrm{d}\mathbf{y},\end{aligned}$$

(17)

where $D = -\int_{\mathcal{H}} C(\mathbf{y}) \ln p(\mathbf{y}|\boldsymbol{\theta}_i^{\prime}) ~\mathrm{d}\mathbf{y}$ is a constant, as are the values $a_{ij}$. Given the constraints of Eq. (2), minimizing Eq. (16) with respect to $\boldsymbol{\theta}_i$ can always be achieved by setting $\theta_{ij} = 1$ for one $j$ for which $a_{ij}$ is largest, and setting all other weights to 0. In addition, Gibbs’ inequality (Falk 1970) implies that Eq. (14) is minimized when $\boldsymbol{\theta}_i = \boldsymbol{\theta}_i^{\prime}$ (as we require them to be). Therefore the $i^{\mathrm{th}}$ feature must have an integer set of optimal weights. This argument can be applied to each feature independently. $\square$
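
The key step of the proof, that a linear function of the $i^{\mathrm{th}}$ feature's vote weights is minimized at a vertex of its feasible set, can be checked numerically. The Python sketch below assumes that the constraints of Eq. (2), which are not reproduced in this section, are $\theta_{ij} \ge 0$ and $\sum_j \theta_{ij} = 1$; the coefficient values and all function names are hypothetical and chosen only for illustration.

```python
import random


def best_integer_weights(a):
    """Return one-hot weights with all mass on a largest coefficient a_ij."""
    j_star = max(range(len(a)), key=lambda j: a[j])
    return [1.0 if j == j_star else 0.0 for j in range(len(a))]


def objective(theta, a, D=0.0):
    """Linearised objective of Eq. (16): f = D - sum_j theta_ij * a_ij."""
    return D - sum(t * c for t, c in zip(theta, a))


def random_simplex_point(n, rng):
    """Draw non-negative weights summing to one (assumed form of Eq. (2))."""
    raw = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


rng = random.Random(0)
a = [-3.2, -0.7, -1.5, -0.7]  # illustrative values only, not from the paper

theta_int = best_integer_weights(a)
f_int = objective(theta_int, a)

# Sample many fractional weightings; none should score below the one-hot choice.
f_best_fractional = min(
    objective(random_simplex_point(len(a), rng), a) for _ in range(10000)
)

print(theta_int, f_int, f_best_fractional)
```

Ties, such as the two equal coefficients in this example, are broken by taking the first maximizer, which matches the proof's wording of "one $j$ for which $a_{ij}$ is largest".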

About this article

Cite this article

Woodford, O.J., Pham, MT., Maki, A. et al. Demisting the Hough Transform for 3D Shape Recognition and Registration. Int J Comput Vis 106, 332–341 (2014). https://doi.org/10.1007/s11263-013-0623-2

Received: 19 September 2012
Accepted: 01 April 2013
Published: 14 April 2013
Issue Date: February 2014
DOI: https://doi.org/10.1007/s11263-013-0623-2
