Demisting the Hough Transform for 3D Shape Recognition and Registration

International Journal of Computer Vision

Abstract

In applying the Hough transform to the problem of 3D shape recognition and registration, we develop two new and powerful improvements to this popular inference method. The first, intrinsic Hough, solves the problem of exponential memory requirements of the standard Hough transform by exploiting the sparsity of the Hough space. The second, minimum-entropy Hough, explains away incorrect votes, substantially reducing the number of modes in the posterior distribution of class and pose, and improving precision. Our experiments demonstrate that these contributions make the Hough transform not only tractable but also highly accurate for our example application. Both contributions can be applied to other tasks that already use the standard Hough transform.

Notes

  1. Specifically, we use the Shannon (1948) entropy, \(H = E[-\ln p(x)] = -\int p(x)\ln p(x) ~\mathrm{d}x\).

  2. Strictly speaking, the minimum-entropy Hough transform is not a transform, because the probability of each location in Hough space cannot be computed independently.

  3. A direct similarity is a transformation consisting of a rotation, a translation and a uniform scaling.

  4. The requirement for a scale independent optimization strategy is a further reason to use the proxy of Eq. (8).
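
The entropy in Note 1 can be checked numerically. The sketch below is an illustration only, not the authors' implementation: it approximates \(H\) by a Riemann sum on a uniform grid and compares the result with the closed form for a Gaussian, \(\tfrac{1}{2}\ln (2\pi e\sigma ^2)\). The function names, the grid size and the integration limits are all assumptions made for this example.

```python
import math

def differential_entropy(density, lo, hi, n=20001):
    """Riemann-sum approximation of H = -integral p(x) ln p(x) dx on [lo, hi]."""
    dx = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        p = density(lo + k * dx)
        if p > 0.0:  # treat 0 * ln 0 as 0
            total -= p * math.log(p) * dx
    return total

def gaussian(sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return lambda x: math.exp(-0.5 * (x / sigma) ** 2) / norm

def gaussian_entropy(sigma):
    """Closed-form entropy of a Gaussian, used here only as a reference value."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# Hypothetical test densities: one broad, one sharply peaked.
h_broad = differential_entropy(gaussian(2.0), -40.0, 40.0)
h_sharp = differential_entropy(gaussian(0.5), -40.0, 40.0)
```

The sharply peaked density has the lower entropy, which is the sense in which a minimum-entropy criterion favours distributions with a few concentrated modes.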

References

  • Toshiba CAD model point clouds dataset (2011). http://www.toshiba-europe.com/research/crl/cvg/projects/stereo_points.html.

  • Allan, M., & Williams, C. K. I. (2009). Object localisation using the generative template of features. Computer Vision and Image Understanding, 113, 824–838.

  • Ballard, D. H. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13(2), 111–122.

  • Barinova, O., Lempitsky, V., & Kohli, P. (2010). On detection of multiple object instances using Hough transforms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Ben-Tzvi, D., & Sandler, M. B. (1990). A combinatorial Hough transform. Pattern Recognition Letters, 11(3), 167–174.

  • Besag, J. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B, 48(3), 259–302.

  • Birchfield, S., & Tomasi, C. (1999). Multiway cut for stereo and motion with slanted surfaces. In: Proceedings of the IEEE International Conference on Computer Vision.

  • Bober, M., & Kittler, J. (1993). Estimation of complex multimodal motion: An approach based on robust statistics and Hough transform. In: Proceedings of the British Machine Vision Conference.

  • Cheng, Y. (1995). Mean shift, mode seeking, and clustering. Transactions on Pattern Analysis and Machine Intelligence, 17(8), 790–799.

  • Delong, A., Osokin, A., Isack, H., & Boykov, Y. (2012). Fast approximate energy minimization with label costs. International Journal of Computer Vision, 96(1), 1–27.

  • Delong, A., Veksler, O., & Boykov, Y. (2012). Fast fusion moves for multi-model estimation. In: Proceedings of the European Conference on Computer Vision.

  • Drost, B., Ulrich, M., Navab, N., & Ilic, S. (2010). Model globally, match locally: Efficient and robust 3D object recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 998–1005).

  • Duda, R. O., & Hart, P. E. (1972). Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15, 11–15.

  • Falk, H. (1970). Inequalities of J. W. Gibbs. American Journal of Physics, 38(7), 858–869.

  • Fisher, A., Fisher, R. B., Robertson, C., & Werghi, N. (1998). Finding surface correspondence for object recognition and registration using pairwise geometric histograms. In: Proceedings of the European Conference on Computer Vision (pp. 674–686).

  • Gall, J., & Lempitsky, V. (2009). Class-specific Hough forests for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1022–1029).

  • Gerig, G. (1987). Linking image-space and accumulator-space: A new approach for object-recognition. In: Proceedings of the IEEE International Conference on Computer Vision (pp. 112–117).

  • Hough, P. V. C. (1962). Method and means for recognizing complex patterns. U.S. Patent 3,069,654.

  • Illingworth, J., & Kittler, J. (1987). The adaptive Hough transform. Transactions on Pattern Analysis and Machine Intelligence, 9(5), 690–698.

  • Isack, H., & Boykov, Y. (2012). Energy-based geometric multi-model fitting. International Journal of Computer Vision, 97(2), 123–147.

  • Knopp, J., Prasad, M., Willems, G., Timofte, R., & Van Gool, L. (2010). Hough transform and 3D SURF for robust three dimensional classification. In: Proceedings of the European Conference on Computer Vision (pp. 589–602).

  • Lamdan, Y., & Wolfson, H. (1988). Geometric hashing: A general and efficient model-based recognition scheme. In: Proceedings of the IEEE International Conference on Computer Vision (pp. 238–249).

  • Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In: ECCV Workshop on Statistical Learning in Computer Vision.

  • Leibe, B., Leonardis, A., & Schiele, B. (2008). Robust object detection with interleaved categorization and segmentation. International Journal of Computer Vision, 77(1–3), 259–289.

  • Li, H., Lavin, M. A., & Le Master, R. J. (1986). Fast Hough transform: A hierarchical approach. Computer Vision, Graphics, and Image Processing, 36(2–3), 139–161.

  • MacKay, D. J. C. (2009). Information theory, inference and learning algorithms. Cambridge: Cambridge University Press.

  • Maji, S., & Malik, J. (2009). Object detection using a max-margin Hough transform. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Mian, A., Bennamoun, M., & Owens, R. (2006). Three-dimensional model-based object recognition and segmentation in cluttered scenes. Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1584–1601.

  • Minka, T. P. (2003). The ‘summation hack’ as an outlier model. Technical note.

  • Okada, R. (2009). Discriminative generalized Hough transform for object detection. In: Proceedings of the IEEE International Conference on Computer Vision (pp. 2000–2005).

  • Pham, M. T., Woodford, O. J., Perbet, F., Maki, A., Stenger, B., & Cipolla, R. (2011). A new distance for scale-invariant 3D shape recognition and registration. In: Proceedings of the IEEE International Conference on Computer Vision.

  • Rosten, E., & Loveland, R. (2009). Camera distortion self-calibration using the plumb-line constraint and minimal Hough entropy. Machine Vision and Applications.

  • Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27, 379–423, 623–656.

  • Sheikh, Y. A., Khan, E. A., & Kanade, T. (2007). Mode-seeking by medoidshifts. In: Proceedings of the IEEE International Conference on Computer Vision.

  • Stephens, R. S. (1991). A probabilistic approach to the Hough transform. Image and Vision Computing, 9(1), 66–71.

  • Toldo, R., & Fusiello, A. (2008). Robust multiple structures estimation with J-linkage. In: Proceedings of the European Conference on Computer Vision.

  • Tombari, F., & Di Stefano, L. (2010). Object recognition in 3D scenes with occlusions and clutter by Hough voting. In: Proceedings of PSIVT (pp. 349–355).

  • Vedaldi, A., & Soatto, S. (2008). Quick shift and kernel methods for mode seeking. In: Proceedings of the European Conference on Computer Vision (pp. 705–718).

  • Vincent, E., & Laganiere, R. (2001). Detecting planar homographies in an image pair. In: Proceedings of the International Symposium on Image and Signal Processing and Analysis (pp. 182–187).

  • Vogiatzis, G., & Hernández, C. (2011). Video-based, real-time multi view stereo. Image and Vision Computing, 29(7), 434–441.

  • Woodford, O. J., Pham, M. T., Maki, A., Gherardi, R., Perbet, F., & Stenger, B. (2012). Contraction moves for geometric model fitting. In: Proceedings of the European Conference on Computer Vision.

  • Xu, L., Oja, E., & Kultanen, P. (1990). A new curve detection method: Randomized Hough transform (RHT). Pattern Recognition Letters, 11(5), 331–338.

  • Zhang, W., & Kosecká, J. (2007). Nonparametric estimation of multiple structures with outliers. In R. Vidal, A. Heyden, & Y. Ma (Eds.), Dynamical Vision, Lecture Notes in Computer Science, vol. 4358 (pp. 60–74). Heidelberg: Springer.

  • Zhang, Y., & Chen, T. (2010). Implicit shape kernel for discriminative learning of the Hough transform detector. In: Proceedings of the British Machine Vision Conference.

  • Zuliani, M., Kenney, C. S., & Manjunath, B. S. (2005). The multiRANSAC algorithm and its application to detect planar homographies. In: Proceedings of the IEEE International Conference on Image Processing.

Acknowledgments

The authors are extremely grateful to Bob Fisher, Andrew Fitzgibbon, Chris Williams, John Illingworth and the anonymous reviewers for providing valuable feedback on this work.

Author information

Corresponding author

Correspondence to Oliver J. Woodford.

Appendix Proof of the Integer Nature of Vote Weights

Theorem 1

Given Eq. (3), there exists an optimal value of \(\varvec{\uptheta }\) whose entries are all integers, i.e. for which \(\theta _{ij}\in \{0,1\}~\forall i,j\).

Proof

Let \(\varvec{\uptheta }^{\prime }\) denote a globally optimal value of \(\varvec{\uptheta }\), i.e. one that minimizes Eq. (3). Consider only the vote weights of the \(i^{\mathrm{th}}\) feature, and assume that the other vote weights are fixed at their optimal values, i.e. \(\varvec{\uptheta }_j = \varvec{\uptheta }_j^{\prime } ~\forall j\ne i\). The objective function can then be written as

$$\begin{aligned} f(\varvec{\uptheta }_i)&= -\int _{\mathcal{H}} p(\mathbf{y}|\varvec{\uptheta }_i) \ln p(\mathbf{y}|\varvec{\uptheta }_i^{\prime }) ~\mathrm{d}\mathbf{y},\end{aligned}$$
(14)
$$\begin{aligned} p(\mathbf{y}|\varvec{\uptheta }_i)&= C(\mathbf{y}) + \omega _i\sum _{j=1}^{J_i}\theta _{ij} K(\mathbf{x}_{ij},\mathbf{y}), \end{aligned}$$
(15)

where \(C(\mathbf{y})\) is a function that is independent of \(\varvec{\uptheta }_i\). Note that we have further assumed that the instance of \(\varvec{\uptheta }_i\) in the \(\ln p(\mathbf{y}|\varvec{\uptheta }_i)\) term in Eq. (14) is also at its optimal value, \(\varvec{\uptheta }_i^{\prime }\). This allows us to rewrite the objective function as follows:

$$\begin{aligned} f(\varvec{\uptheta }_i)&= D - \sum _{j=1}^{J_i}\theta _{ij} a_{ij},\end{aligned}$$
(16)
$$\begin{aligned} a_{ij}&= \int _{\mathcal{H}} \omega _iK(\mathbf{x}_{ij},\mathbf{y}) \ln p(\mathbf{y}|\varvec{\uptheta }_i^{\prime }) ~\mathrm{d}\mathbf{y}, \end{aligned}$$
(17)

where \(D\) is a constant, as are the values \(a_{ij}\). Because Eq. (16) is linear in \(\varvec{\uptheta }_i\), minimizing it subject to the constraints of Eq. (2) can always be achieved by setting \(\theta _{ij} = 1\) for one \(j\) for which \(a_{ij}\) is largest, and setting all other weights to 0. In addition, Gibbs’ inequality (Falk 1970) implies that Eq. (14) is minimized when \(\varvec{\uptheta }_i= \varvec{\uptheta }_i^{\prime }\) (as we require them to be). Therefore the \(i^{\mathrm{th}}\) feature must have an integer set of optimal weights. This argument can be applied to each feature independently. \(\square \)
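
The linear step of the proof can be illustrated with a small numerical sketch. It assumes that the constraints of Eq. (2), which are not reproduced on this page, restrict each feature's weights to be non-negative and to sum to one; under that assumption Eq. (16) is a linear function on a simplex and attains its minimum at a vertex. The coefficient values and function names below are hypothetical and chosen only for illustration.

```python
import random

def one_hot_minimizer(a):
    """Put all weight on one index with the largest coefficient a_ij."""
    j_best = max(range(len(a)), key=lambda j: a[j])
    return [1.0 if j == j_best else 0.0 for j in range(len(a))]

def objective(theta, a, d=0.0):
    """Eq. (16): f(theta_i) = D - sum_j theta_ij * a_ij."""
    return d - sum(t * c for t, c in zip(theta, a))

def random_simplex_point(n, rng):
    """Random weights that are non-negative and sum to one (assumed form of Eq. (2))."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [v / total for v in draws]

rng = random.Random(0)
a = [-3.2, -1.5, -4.0, -2.7]  # hypothetical values of a_ij for one feature
theta_star = one_hot_minimizer(a)
f_star = objective(theta_star, a)

# No randomly sampled fractional weighting should beat the one-hot choice.
no_better = all(
    objective(random_simplex_point(len(a), rng), a) >= f_star - 1e-12
    for _ in range(1000)
)
```

Here `theta_star` places all of the weight on the second vote, the one with the largest coefficient, and none of the sampled fractional weightings attains a lower objective value.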

About this article

Cite this article

Woodford, O.J., Pham, MT., Maki, A. et al. Demisting the Hough Transform for 3D Shape Recognition and Registration. Int J Comput Vis 106, 332–341 (2014). https://doi.org/10.1007/s11263-013-0623-2

  • DOI: https://doi.org/10.1007/s11263-013-0623-2
