Sparse Bayesian learning for image rectification with transform invariant low-rank textures☆
Introduction
Feature extraction is one of the fundamental problems in computer vision. Most existing image features, e.g., SIFT points [1], Harris corners [2], and Canny edges [3], have been key to many high-level computer vision applications over the past decades, such as 3D reconstruction, target identification, and scene understanding. However, these features, called “low-level local features”, are often inaccurate and not robust. Although many researchers continue to work on improving their performance, they remain a bottleneck in many computer vision applications.
Transform invariant low-rank textures (TILT) were recently proposed in [4] to recover the deformation of user-specified patches in 2D images so that the underlying textures become regular. Instead of extracting low-level local features, TILT can in some sense globally rectify a large class of low-rank textures, e.g., those whose regularity, symmetry, or repetition can be measured by the low-rank property, and thus significantly improves accuracy and robustness. TILT has since been introduced to many computer vision tasks, such as camera calibration [5], 3D shape reconstruction [6], [7], and character rectification [8].
Generally speaking, TILT seeks to extract the intrinsic low-rank textures from deformed and corrupted images. In some sense, this problem can be modeled (approximately) as the robust principal component analysis (RPCA) presented in [9], [10]: the texture I with a proper inverse transformation τ can be expressed as the superposition of two components, I∘τ = I0 + E, where I0 is the low-rank component and E is the sparse component. The only difference between TILT and RPCA is the introduction of the inverse transformation τ in the TILT framework. As a result, problems may arise when practical methods for RPCA are modified to solve the TILT problem.
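To make the low-rank plus sparse decomposition concrete, the following is a minimal Python sketch of plain RPCA, i.e., the special case in which the patch is assumed to be already rectified (τ is the identity). It uses a generic inexact augmented Lagrangian scheme with singular value thresholding, not the TILT solvers of [4] or [12]; the function names, the default weight `lam`, and the penalty schedule (`rho`, the initial `mu`) are illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise shrinkage: proximal operator of tau * l1-norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(m, tau):
    """Shrink singular values: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(d, lam=None, rho=1.5, tol=1e-7, max_iter=200):
    """Split d into low-rank + sparse parts (hypothetical helper, tau = identity)."""
    m, n = d.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # common default weight (assumption)
    spectral = np.linalg.norm(d, 2)
    y = d / max(spectral, np.abs(d).max() / lam)  # Lagrange multiplier
    mu = 1.25 / spectral                          # penalty parameter
    sparse = np.zeros_like(d)
    for _ in range(max_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low = singular_value_threshold(d - sparse + y / mu, 1.0 / mu)
        sparse = soft_threshold(d - low + y / mu, lam / mu)
        residual = d - low - sparse
        y = y + mu * residual
        mu = min(mu * rho, 1e10)
        if np.linalg.norm(residual) <= tol * np.linalg.norm(d):
            break
    return low, sparse
```

Calling `rpca(d)` on a matrix `d` of floats returns the pair `(low, sparse)` whose sum approximates `d`. Handling the transformation τ, as TILT does, requires an additional outer loop that is not shown here.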
There are two existing solvers for TILT. In the seminal work on TILT [4], the task is solved with a modified alternating direction method (ADM) [11]. However, to account for the warping of textures, transformation groups over images are exploited in TILT, which makes the problem nonlinear and non-convex. Zhang et al. [4] used the first-order term of the Taylor expansion to linearize the original nonlinear problem. With this newly added term, convergence is no longer guaranteed: the convergence of ADM is well studied and established for the case of two variables, while the inner loop of the TILT problem has three variables [4]. In response, Ren et al. [12] proposed an improved algorithm based on the linearized alternating direction method with adaptive penalty (LADMAP) [13], with a theoretical guarantee on the convergence of its inner loop.
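The linearization step can be written out as follows. This is a sketch in the notation commonly used for TILT; the symbols ∇I (the Jacobian of the image with respect to the transformation parameters), Δτ (the increment of the transformation), and λ (the trade-off weight) are assumptions introduced here for illustration and do not appear in the text above.

```latex
\begin{aligned}
I \circ (\tau + \Delta\tau) &\approx I \circ \tau + \nabla I \, \Delta\tau, \\
\min_{I_0,\, E,\, \Delta\tau} \;& \|I_0\|_* + \lambda \|E\|_1
\quad \text{s.t.} \quad I \circ \tau + \nabla I \, \Delta\tau = I_0 + E .
\end{aligned}
```

The three unknowns of the linearized inner problem are I0, E, and Δτ, which is why the two-variable convergence results for ADM do not apply directly.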
Unfortunately, there are still some cases, as shown in Fig. 1, where the ADM and the improved LADMAP method may fail [4]: (a) two incompatible dominant low-rank structures (the facade and the shadow) overlap; (b) the combined region contains two adjacent low-rank regions, each distorted differently; (c) too many occlusions; (d) random texture.
In this paper, we focus on the third situation, in which the user-specified patches contain too many corruptions. Although TILT is designed to be robust to corruptions and occlusions, it essentially assumes that the amount of corruption and occlusion is not too large. This is because ADM and LADMAP both try to solve a nuclear-norm and ℓ1-norm constrained optimization problem, which has been shown to have many local minima under heavy noise and perturbation [14], [15]. Moreover, these optimization algorithms require the trade-off parameters to be set in advance, although they are not known a priori. Worse, naively fixing their values discards the uncertainty in the parameters, which induces uncertainty in predictions [16], [17]. These issues may drive the optimization to a wrong solution in such complex situations.
Consequently, we turn to the Bayesian approach, which has been extensively investigated over the past decades [18], [19], [20], [21], [22], [23] and shown to have far fewer local minima [21]. In addition, instead of the point estimation used in existing methods, the nonparametric Bayesian approach treats all parameters as probability distributions over possible values and thus fully retains the uncertainty in the parameters. Hence, a Bayesian method can handle more complex situations in natural scenes, such as cases with corruptions and occlusions.
In this paper, we propose a Bayesian framework for the TILT problem (BF-TILT), in which the low-rank component, the sparse component, and the transformation are considered simultaneously. We assume that each entry in our model is governed by a corresponding hierarchical Bayesian model that induces its prior. Variational Bayesian inference is then carried out within this framework to obtain the estimates.
The rest of this paper is organized as follows. In Section 2 we briefly review the definition of low-rank texture and the TILT problem. The Bayesian framework for solving TILT is introduced in Section 3, including the hierarchical Bayesian modeling and variational Bayesian inference. Empirical results with synthetic and real data are presented in Section 4, and the paper concludes in Section 5.
Section snippets
Transform invariant low-rank textures
In this section, we first briefly review the definition of low-rank texture and the mathematical model of TILT.
Bayesian framework for transform invariant low-rank textures
In this section, hierarchical Bayesian models are employed to impose constraints on the corresponding entries, and all unknown quantities are treated as stochastic variables. We then obtain all expectations via the variational Bayesian approach [25], in which the posteriors are approximated by maximizing a lower bound on the marginal likelihood.
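For orientation, the generic variational Bayesian lower bound can be stated as below. This is the textbook form under a mean-field factorization q(Θ) = ∏_j q_j(Θ_j), not necessarily the exact derivation used in this paper; Y (the observed data) and Θ (the collection of all unknown quantities) are notation assumed here.

```latex
\begin{aligned}
\ln p(Y) &= \mathcal{L}(q) + \mathrm{KL}\big(q(\Theta)\,\|\,p(\Theta \mid Y)\big), \\
\mathcal{L}(q) &= \int q(\Theta)\, \ln \frac{p(Y, \Theta)}{q(\Theta)}\, d\Theta, \\
\ln q_j^{\star}(\Theta_j) &= \mathbb{E}_{q_{i \neq j}}\big[\ln p(Y, \Theta)\big] + \text{const}.
\end{aligned}
```

Because the KL term is non-negative, maximizing the lower bound over q is equivalent to driving q toward the true posterior; the last line gives the coordinate-wise update for each factor with the others held fixed.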
Numerical experiments
In this section, numerical experiments are conducted to demonstrate the advantage of the proposed BF-TILT. The two existing algorithms, ADM and LADMAP, are used for comparison. We conduct numerical experiments with both synthetic and real images to test the performance of the three solvers. All MATLAB codes are
Conclusion
In this paper, we propose a robust algorithm, based on a Bayesian framework, for better solving the TILT problem. Hierarchical Bayesian models are used to impose a low-rankness-inducing prior and a sparsity-inducing prior on the corresponding entries, and variational Bayesian inference is used to compute the estimates of the related parameters. Besides having fewer local minima, the nonparametric Bayesian approach introduces uncertainty in the parameters so that our new algorithm can handle more
References (29)
- et al. Bayesian compressive sensing for cluster structured sparse signals. Sig. Process. (2012)
- et al. Model based bayesian compressive sensing via local beta process. Sig. Process. (2015)
- Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. (2004)
- et al. A combined corner and edge detector. Alvey Vision Conference (1988)
- A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. (1986)
- et al. Tilt: transform invariant low-rank textures. Int. J. Comput. Vis. (2012)
- et al. Camera calibration with lens distortion from low-rank textures. Computer Vision and Pattern Recognition (CVPR) (2011)
- et al. Unwrapping low-rank textures on generalized cylindrical surfaces. International Conference on Computer Vision (ICCV) (2011)
- et al. Holistic 3d reconstruction of urban structures from low-rank textures. International Conference on Computer Vision (ICCV) (2011)
- et al. Rectification of optical characters as transform invariant low-rank textures. International Conference on Document Analysis and Recognition (ICDAR) (2013)
- Robust principal component analysis? J. ACM (JACM)
- Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. Advances in Neural Information Processing Systems (NIPS)
- The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices. Technical Report UILU-ENG-09-2215
- Linearized alternating direction method with adaptive penalty and warm starts for fast solving transform invariant low-rank textures. Int. J. Comput. Vis.
☆ This work is supported by NSFC Grant 61401315, by SRF for ROCS, SEM, under Grant 230303 and by the Chinese Scholarship Council (CSC).