Abstract
High-dimensional Gaussian filtering is a popular technique in image processing, geometry processing and computer graphics for smoothing data while preserving important features. For instance, the bilateral filter, cross bilateral filter and non-local means filter fall under the broad umbrella of high-dimensional Gaussian filters. Recent algorithmic advances therein have demonstrated that by relying on a sampled representation of the underlying space, one can obtain speed-ups of orders of magnitude over the naïve approach. The simplest such sampled representation is a lattice, and it has been used successfully in the bilateral grid and the permutohedral lattice algorithms. In this paper, we analyze these lattice-based algorithms, developing a general theory of lattice-based high-dimensional Gaussian filtering. We consider the set of criteria for an optimal lattice for filtering, as it offers a good tradeoff of quality for computational efficiency, and evaluate the existing lattices under the criteria. In particular, we give a rigorous exposition of the properties of the permutohedral lattice and argue that it is the optimal lattice for Gaussian filtering. Lastly, we explore further uses of the permutohedral-lattice-based Gaussian filtering framework, showing that it can be easily adapted to perform mean shift filtering and yield improvement over the traditional approach based on a Cartesian grid.
Similar content being viewed by others
References
Adams, A., Gelfand, N., Dolson, J., Levoy, M.: Gaussian kd-trees for fast high-dimensional filtering. In: ACM Transactions on Graphics (Proc SIGGRAPH ’09), pp. 1–12 (2009)
Adams, A., Baek, J., Davis, A.: High-dimensional filtering with the permutohedral lattice. In: Proceedings of EUROGRAPHICS, pp. 753–762 (2010)
Arbelaez, P., Maire, M., Fowlkes, C.C., Malik, J.: From contours to regions: an empirical evaluation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2294–2301 (2009)
Aurich, V., Weule, J.: Non-linear Gaussian filters performing edge preserving diffusion. In: Mustererkennung 1995, 17. DAGM-Symposium, pp. 538–545 (1995)
Bambah, R.P., Sloane, N.J.A.: On a problem of Ryskov concerning lattice coverings. Acta Arith. 42, 107–109 (1982)
de Boor, C.: B-form basics. In: Geometric Modeling, pp. 131–148 (1987)
de Boor, C., Höllig, J., Riemenschneider, S.: Box splines, vol. 98. Springer, Berlin (1993)
Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65 (2005)
Chen, J., Paris, S., Durand, F.: Real-time edge-aware image processing with the bilateral grid. In: ACM Transactions on Graphics (Proc SIGGRAPH ’07), p. 103 (2007)
Cheng, Y.: Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 17(8), 790–799 (1995)
Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)
Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices and Groups. Springer, Berlin (1998)
Delaunay, B.N., Ryskov, S.S.: Solution of the problem of least dense lattice covering of a four-dimensional space by equal spheres. Sov. Math. Dokl. 4, 1014–1016 (1963)
Dolbilin, N.P.: Minkowski’s theorems on parallelohedra and their generalizations. Russ. Math. Surv. 62, 793–795 (2007)
Dolson, J., Baek, J., Plagemann, C., Thrun, S.: Upsampling range data in dynamic environments. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1141–1148 (2010)
Durand, F., Dorsey, J.: Fast bilateral filtering for the display of high-dynamic-range images. In: Proc. SIGGRAPH ’02, pp. 257–266 (2002)
Eisemann, E., Durand, F.: Flash photography enhancement via intrinsic relighting. In: ACM Transactions on Graphics (Proc. SIGGRAPH 04), pp. 673–678 (2004)
Entezari, A., Dyer, R., Möller, T.: Linear and cubic box splines for the body centered cubic lattice. In: IEEE Visualization, pp. 11–18 (2004)
Entezari, A., Ville, D.V.D., Möller, T.: Practical box splines for reconstruction on the body centered cubic lattice. IEEE Trans. Vis. Comput. Graph. 14, 313–328 (2008)
Greengard, L.F., Strain, J.A.: The fast gauss transform. SIAM J. Sci. Stat. Comput. 12, 79–94 (1991)
Grundmann, M., Kwatra, V., Han, M., Essa, I.: Efficient hierarchical graph-based video segmentation. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 2141–2148 (2010)
Kershner, R.: The number of circles covering a set. Am. J. Math. 61, 665–671 (1939)
Kim, M.: Symmetric box-splines on root lattice. PhD thesis, University of Florida, Gainesville, FL (2008)
Kopf, J., Cohen, M.F., Lischinski, D., Uyttendaele, M.: Joint bilateral upsampling. In: ACM Transactions on Graphics (Proc. SIGGRAPH 07), p. 96 (2007)
Kuhn, H.W.: Some combinatorial lemmas in topology. IBM J. Res. Dev. 45, 518–524 (1960)
Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proc. 8th Int. Conf. Computer Vision, vol. 2, pp. 416–423 (2001)
Mirzargar, M., Entezari, A.: Voronoi splines. IEEE Trans. Signal Process. 58, 4572–4582 (2010)
Mood, R.V., Patera, J.: Voronoi and Delaunay cells of root lattices: classification of their faces and facets by Coxeter-Dynkin diagrams. J. Phys. A, Math. Gen. 25, 5089–5134 (1992)
Paris, S.: Edge-preserving smoothing and mean-shift segmentation of video streams. In: Proc. European Conference on Computer Vision, pp. 460–473 (2008)
Paris, S., Durand, F.: A fast approximation of the bilateral filter using a signal processing approach. In: Proc. European Conference on Computer Vision, pp. 568–580 (2006)
Paris, S., Durand, F.: A topological approach to hierarchical segmentation using mean shift. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
Paris, S., Durand, F.: A fast approximation of the bilateral filter using a signal processing approach. Int. J. Comput. Vis. 81(1), 24–52 (2009)
Perlin, K.: Noise. In: ACM SIGGRAPH ’02 Course Notes (2002)
Petersen, D.P., Middleton, D.: Sampling and reconstruction of wave-number-limited functions in N-dimensional Euclidean spaces. Inf. Control 5, 279–323 (1962)
Petschnigg, G., Szeliski, R., Agrawala, M., Cohen, M., Hoppe, H., Toyama, K.: Digital photography with flash and no-flash image pairs. In: ACM Transactions on Graphics (Proc. SIGGRAPH 04), pp. 664–672 (2004)
Rahimi, A., Recht, B.: Random features for large-scale kernel machines. Adv. Neural Inf. Process. Syst. 20, 1177–1184 (2008)
Rogers, C.A.: Packing and Covering. Cambridge University Press, Cambridge (1964)
Ryshkov, S.S.: The structure of primitive parallelohedra and Voronoi’s last problem. Russ. Math. Surv. 53, 403–405 (1998)
Ryskov, S.S., Baranovskii, E.P.: Solution of the problem of least dense lattice covering of five-dimensional space by equal spheres. Sov. Math. Dokl. 16, 586–590 (1975)
Samelson, H.: Notes on Lie Algebras. Van Nostrand Reinhold Company, Princeton (1969)
Senechal, M.: Quasicrystals and Geometry. Cambridge University Press, Cambridge (1995)
Serre, J.P.: Complex Semisimple Lie Algebras. Springer, Berlin (1987)
Smith, S., Brady, J.M.: SUSAN: a new approach to low level image processing. Int. J. Comput. Vis. 23, 45–78 (1997)
Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Proc. International Conference on Computer Vision, pp. 836–846 (1998)
Voronoi, G.F.: Studies of Primitive Parallelotopes, vol. 2, pp. 239–368. Ukrainian Academy of Sciences Press, Kiev (1952)
Wang, P., Lee, D., Gray, A., Rehg, J.M.: Fast mean shift with accurate and stable convergence. In: Melia, M., Shen, X. (eds.) Artificial Intelligence and Statistics (2007)
Weiss, B.: Fast median and bilateral filtering. In: ACM Transactions on Graphics (Proc. SIGGRAPH ’06), pp. 519–526 (2006)
Winnemöller, H., Olsen, S.C., Gooch, B.: Real-time video abstraction. ACM Trans. Graph. 25(3), 1221–1226 (2006)
Witsenhausen, H.S.: Spiegelungsgruppen und aufzählung halbeinfacher liescher Ringe. Abh. Math. Sem. Univ. Hamburg 14 (1941)
Yang, C., Duraiswami, R., Gumerov, N.A., Davis, L.: Improved fast gauss transform and efficient kernel density estimation. In: Proc. International Conference on Computer Vision, vol. 1, pp. 664–671 (2003)
Yang, Q., Yang, R., Davis, J., Nistér, D.: Spatial-depth super resolution for range images. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
Yaroslavsky, L.P.: Digital Picture Processing. An Introduction. Springer, Berlin (1985)
Acknowledgements
We would like to thank Marc Levoy for his advice and support, as well as Nokia Research.
Jongmin Baek acknowledges support from Nokia Research, as well as Lucent Technologies Stanford Graduate Fellowship; Andrew Adams is supported by a Reed-Hodgson Stanford Graduate Fellowship; Jennifer Dolson acknowledges support from an NDSEG Graduate Fellowship from the United States Department of Defense.
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
1.1 A.1 Useful Lemmas
This section outlines a number of useful lemmas required for computing the SNR terms for K x and Z x. Most of them concern the permutohedral lattice—more machinery is required for the derivations on \(A^{*}_{d}\). They are included here so that Sect. A.2 and A.3 can remain succinct and parallel.
Lemma A.1
The power of a d-dimensional Gaussian kernel with covariance θI is (4πθ)−d/2.
Proof
as desired. □
Lemma A.2
Let \(\mathcal{S}\) be a d-dimensional simplex defined by vertices \(\mathbf {s_{0}},\ldots,\mathbf {s_{d}}\), and let B 0(x),B 1(x),…,B d(x) be the barycentric coordinates of x for the vertices in the given order. Then, ∀k,j∈{0,…,d},
Proof
Let P be the face defined by all vertices except \(\mathbf {s_{k}}\). Then \(\mathcal{S}\) is a pyramid with base P and apex \(\mathbf {s_{k}}\). Let h be the height of the altitude and let y be the foot of the altitude. Consider slicing \(\mathcal{S}\) along a hyperplane parallel to P that contains \((1-t)\cdot \mathbf {s_{k}}+t\cdot \mathbf {y}\), for some t∈[0,1]. The slice will be similar to P with a scale factor of t, and by definition of barycentric coordinates, B k(⋅) will evaluate to a constant, namely 1−t, across the whole slice. Therefore,
Note that the same integral without the weight term (1−t)2 will simply yield the volume of \(\mathcal{S}\):
Dividing (A.1) by (A.2) yields the first of the two cases. When k≠j, we consider applying the same decomposition to P: let P′ be the face of P spanned by all vertices of \(\mathcal{S}\) except \(\mathbf {s_{k}}\) and \(\mathbf {s_{j}}\). Then P is again a pyramid with base P′ and height h′, and y′ the foot of the altitude. Each slice of P parallel to P′ cutting the altitude at \((1-r)\cdot \mathbf {s_{j}}+r\cdot \mathbf {y'}\) is again a copy of P′ scaled by r, and the barycentric weight B j(⋅) is uniformly 1−r. Therefore,
Without the barycentric weights,
Dividing (A.3) by (A.4) yields the second of the two cases, as desired. □
Lemma A.3
Let \(\mathbf {x},\mathbf {y}\in A^{*}_{d}\). Then the number of Delaunay cells containing both x and y is given by
Proof
Since every Delaunay cell is given by permuting the coordinates of the canonical simplex and translating the vertices, if x−y cannot be written as a permutation of a vertex of the canonical simplex, then they must not belong to any common Delaunay cell. So we can assume that x−y is a permutation of some vertex of the canonical simplex.
Without loss of generality, let y=0. Then any simplex containing y has a corresponding permutation in S d+1. Among those, the number of simplices also containing x is the number of permutations that sends the remainder-k vertex of the canonical simplex to x.
Because components of x have two possible values, namely k and k−(d+1), and the number of these two values are d+1−k and k respectively, the number of possible permutations is (d+1−k)!k!. □
The next pair of lemmas illustrate how to integrate arbitrary functions over the canonical simplex with particular weighting schemes.
Lemma A.4
Let X be the union of Delaunay cells containing the origin in \(A^{*}_{d}\), and let f:X→ℝ be a function. Then,
Proof
This is a direct consequence of Proposition 5.4. In short, weighted sampling corresponding to K s can be achieved by uniformly sampling in the unit cube and projecting the sample onto H d. □
Lemma A.5
Let \(f:A^{*}_{d}\rightarrow\mathbb{R}\) be a function on \(A^{*}_{d}\). Let \(b:A^{*}_{d}\rightarrow\mathbb{R}\) and the vectors \(\mathbf {w_{k}}\) be as in Sect. 5.2. Then,
Proof
The blur kernel b(x) is given by the convolution of \(K^{0}_{b},\ldots,K^{d}_{b}\). Therefore, it is the accumulation of the weights corresponding to offsets in the d+1 axes that sum to x. The weight equals \(\frac{1}{2}\) if the offset t k in a given axis is zero, and \(\frac{1}{4}\) if it is t k=±1. Hence we can write it succinctly as \(\frac{1}{2^{1+|t_{k}|}}\). The claim follows. □
1.2 A.2 Evaluation of SNR for ℤd
The kernel Z x repeats periodically because of the translational symmetry of ℤd. In order to evaluate the expected value of an expression involving Z x, it suffices to consider x∈[0,1]d. In other words,
Expanding the quadratic term in the integrand, we obtain \(\eta={\frac{(A.6)}{(A.5)+(A.6)-2(A.7)}}\), containing the following three expressions:
First of all, Lemma A.1 tells us that (A.6) is (10π/3)−d/2.
(A.5) is similarly separable:
Note that B p(x) and B p′(x) are both nonzero only if p=p′ or |p−p′|=1. In the former case, the integral of B p(x)B p′(x) with respect to x over ℝ is 2/3; in the latter case, 1/6. Similarly, B q(y) and B q′(y) are both nonzero over y∈[0,1] iff q,q′=0 or 1. Setting q=q′ yields 1/3 for the value of the integral, and q≠q′ yields 1/6. Summing over all possible values of p′,q,q′ with nonzero summands, we obtain that (A.5) equals
The cross term (A.7) remains. We can similarly expand the expression and collect terms into the product of sums, and exploit the fact that only few values of p,q yield nonzero summands. Unfortunately, because of the Gaussian term, there is no succinct analytic expression.
Corollary A.6
The SNR term for ℤd is
1.3 A.3 Evaluation of SNR for \(A^{*}_{d}\)
The error analysis for \(A^{*}_{d}\) is analogous to that of ℤd, with a few subtle changes. First, K x does not sum to 1 over its domain and should be normalized first before being compared to a Gaussian.
Lemma A.7
Proof
The last line follows as the sum of splatting weights for x over all lattice points is also 1. □
Since K x is translation-invariant (for translations by a lattice point) and symmetric with respect to permutation of the axes, it suffices to iterate x over the canonical simplex C ∗, in order to compute the expected value of the L 2 distance. Recall from Proposition 4.11 that the volume of C ∗ is given by (d+1)d−1/2/d!. Therefore,
η can be written as \({\frac{(A.9)}{(A.8)+(A.9)-2(A.10)}}\), where the components are as follows:
By Lemma A.1, (A.9) equals \((\frac{8(d+1)^{2}\pi}{3})^{-d/2}\).
Let us compute (A.8) without the constant term. The integrand may be expanded as follows:
Our expression is a weighted sum in which the weights are \(b(\mathbf {q_{1}})b(\mathbf {q_{2}})\). By Lemma A.5, it is equivalent to
where \(\mathbf {q_{1}}=\sum t_{k}\cdot \mathbf {w_{k}}\) and \(\mathbf {q_{2}}=\sum s_{k}\cdot \mathbf {w_{k}}\).
Note that the first bracketed term is effectively given by Lemma A.2. The second term is similar, but the integral is over H d, so we must additively consider each simplex whose vertex contains both \(\mathbf {p_{1}}-\mathbf {q_{1}}\) and \(\mathbf {p_{2}}-\mathbf {q_{2}}\). The number of such simplices is given by Lemma A.3.
(A.11) is more attractive than its original form because its summands are now simple numerical quantities. There is no succinct analytic expression for the sum, but its numerical computation is straightforward. The number of summands is (d+1)2⋅32(d+1).
That leaves us with (A.10). Once again, the integral is expanded and manipulated:
Using Lemma A.5, we obtain
where \(\mathbf {q}=\sum t_{k}\cdot \mathbf {w_{k}}\). By symmetry, we can swap out C ∗ with the union of Delaunay cells containing the origin. Then we have an integral weighted by B p(x) and B p+q(y), which can be computed by Monte-Carlo sampling the (d+1)-dimensional unit cube, by Lemma A.4.
In summary, we iterate over t∈{−1,0,1}d+1 and p∈C ∗, and for each pair of vectors, sample x and y via Lemma A.4, and evaluate the Gaussian of y−x, weighted by \(\prod_{k}\frac{1}{2^{1+|t_{k}|}}\). Multiplying the result by an appropriate constant yields (A.11).
Rights and permissions
About this article
Cite this article
Baek, J., Adams, A. & Dolson, J. Lattice-Based High-Dimensional Gaussian Filtering and the Permutohedral Lattice. J Math Imaging Vis 46, 211–237 (2013). https://doi.org/10.1007/s10851-012-0379-2
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10851-012-0379-2