
1 Introduction

Model estimation is a crucial step in many computer vision problems, including structure from motion (SFM), dim target detection, image retrieval, and SLAM. One of the main challenges in model estimation is that the measurements (often points or correspondences) are unavoidably contaminated with outliers, owing to the imperfection of current measurement acquisition algorithms. These contaminated measurements can lead to an arbitrarily bad model, which may have a catastrophic impact on the final result in real-world applications.

To verify and exclude outliers, a variety of robust model fitting techniques [3, 10, 14, 15, 19, 25,26,27] have been studied for years; they can be roughly divided into two categories. Methods of the first kind [4, 17, 27, 28] analyze each measurement through its residuals under different model hypotheses and then distinguish inliers from outliers. These methods are usually applied to multi-model fitting problems and are limited by their requirement for abundant model hypotheses. Algorithms of the second kind generate model hypotheses by sampling the measurements, then find the best model by optimizing a well-designed cost function. The cost function can be solved by random sampling [2, 14, 19, 25, 26] or by a branch-and-bound (BnB) algorithm, which guarantees an optimal solution at a much higher computational cost [3, 15, 29]. In real-world applications, these algorithms are carefully selected according to the usage. Among them, the random sample consensus (RANSAC) algorithm [10] is one of the most popular for model estimation, widely known for its simplicity and effectiveness. The algorithm follows a hypothesize-and-verify paradigm. It repeatedly draws minimal measurement sets by random sampling and calculates model parameters from each set to form a model hypothesis. Every model hypothesis then undergoes a verification step: all measurements are used to compute their residuals under the current model parameters, and measurements whose residuals fall below a certain threshold are grouped into the consensus set of that hypothesis. After a sufficient number of generation and verification trials, RANSAC selects the hypothesis with the largest consensus set as the optimal one, and the final model parameters are computed from the consensus set of that optimal hypothesis.

However, RANSAC still leaves room for efficiency improvements. To obtain at least one outlier-free sample with confidence \(\eta _{0}\), RANSAC must draw at least k trials,

$$\begin{aligned} k \ge \frac{\log (1-\eta _{0})}{\log (1-\epsilon ^{m})}, \end{aligned}$$
(1)

where \(\epsilon \) is the fraction of inliers in the dataset and m denotes the size of the minimal sample set. Clearly, as \(\epsilon \) declines, k grows dramatically, placing a heavy computational burden on the algorithm. This phenomenon results from the uniform sampling in RANSAC, in which every measurement is sampled with equal probability. Consequently, valuable information indicating true inliers, which could be used to cut down the running time of the algorithm, is overlooked.
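As a concrete illustration of Eq. (1), the following minimal sketch computes the required number of trials; the function and variable names are ours, not from the paper's implementation.

```python
import math

def required_trials(epsilon: float, m: int, eta_0: float = 0.99) -> int:
    """Eq. (1): trials needed for one outlier-free minimal sample.

    epsilon: inlier fraction, m: minimal sample size, eta_0: confidence.
    """
    return math.ceil(math.log(1.0 - eta_0) / math.log(1.0 - epsilon ** m))

# Example: for fundamental matrix estimation (m = 7), lowering the inlier
# rate from 0.9 to 0.45 raises k from 8 to more than 1200 trials.
print(required_trials(0.9, 7), required_trials(0.45, 7))
```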

In this paper, we address this problem and present ARSAC, which boosts the efficiency of the algorithm. Our contributions are as follows:

  • We propose a method that adopts non-uniform sampling based on adaptively ranked measurements. It updates the ranking of the measurements at each trial and incorporates the most prominent measurements into the sample, which is experimentally shown to achieve high efficiency.

  • We design a mild geometric constraint that samples measurements with a broad spatial distribution. It largely alleviates degeneracy in epipolar geometry estimation and improves the overall robustness of the algorithm.

  • We develop ARSAC, an efficient algorithm that achieves better robustness than other efficiency-oriented algorithms.

2 Related Work

Due to the inefficiency of RANSAC, a number of schemes have been proposed to improve its performance. They mainly target the efficiency of sampling and of model verification. For efficient sampling, non-uniform sampling is conducted by leveraging prior knowledge of the measurements. NAPSAC [20] selects measurements sampled within a hypersphere of radius r in a high-dimensional space. It effectively handles the problem of poor sampling in higher-dimensional spaces, but may cause degeneracy because the selected measurements can lie too close to each other. GroupSAC [21] divides the measurements into different groups and samples from the most prominent group; however, it relies on a grouping function that is meaningful for the problem at hand. The PROSAC algorithm [7] ranks the measurements by quality and then draws samples non-uniformly, favoring measurements in descending order of quality. However, a fixed quality ordering may not precisely reflect the probability of being a true inlier, especially in scenes with indistinct features. PROSAC also needs safeguards against degeneracy [23] during non-uniform sampling.

For efficient model verification, a subset of all measurements is often chosen for verification. Preemptive methods [1, 6] follow the paradigm that a very small number of randomly selected measurements is verified in order to filter out clearly invalid models. The SPRT test [8, 18] applies Wald's theory of sequential testing [13] to model verification, using an adaptive likelihood ratio as the criterion. However, these methods may misjudge measurements as outliers, since the conclusion is usually drawn before all measurements have been verified.

For the purpose of robust estimation, some methods [5, 24] extend the inlier set by conducting extra RANSAC trials based on the current best consensus set, thereby exploring potential inliers that are not strictly consistent with the current best model. To handle degeneracy, QDEGSAC [11] uses several runs of RANSAC to find the most constraining model, which minimizes the probability of degeneracy. However, these methods require additional runs of RANSAC, which does little to help efficiency.

Our method adopts non-uniform sampling for efficiency by maintaining an ordered set of measurements. Different from the methods above, we adaptively update the ranking of the measurements, combining prior knowledge with current model information. Meanwhile, to alleviate the degeneracy usually caused by non-uniform sampling, we constrain the spatial distribution of the measurements during sampling, which requires no further trials.

Fig. 1. The inlier rate \(\epsilon \) among the top n measurements for different algorithms on a synthetic dataset: (a) RANSAC, which draws uniform samples; (b) PROSAC, which draws non-uniform samples with a fixed ranking; and (c) ARSAC, which adopts adaptively ranked non-uniform sampling. The blue dashed line denotes the inlier rate of the whole dataset, which is 0.45. (Color figure online)

3 Adaptively Ranked Sampling Consensus Algorithm

To improve the efficiency of RANSAC, ARSAC adopts non-uniform sampling in which high-quality measurements are sampled first. Unlike methods with a fixed ranking of measurements [7], ARSAC iteratively updates the ranking, which offers a better description of the high-quality measurements. In the synthetic-data experiment shown in Fig. 1, the inlier rate of RANSAC among the top n measurements fluctuates around that of the whole dataset, since RANSAC does not rank the measurements. Both PROSAC and ARSAC keep a high inlier rate among the top-ranked measurements. However, the inlier rate of PROSAC falls quickly as n grows, and its curve fluctuates, which means the ranking strategy of PROSAC cannot effectively single out real inliers. ARSAC, in contrast, maintains a very high inlier rate over the largest number of top-ranked measurements. This means that within a limited number of sampling trials, ARSAC has the highest probability of drawing all-inlier samples, which strongly contributes to the convergence of the algorithm. In addition, a mild geometric constraint is proposed to constrain the sampling of ARSAC. Let \(\mathcal {X}=\{x_{i}\}^{N}_{i=1}\) be the dataset of N measurements, where i indexes the measurements. \(\mathcal {M}_{j}\) is the minimal sample set of size m, where j indexes the trials. \(\theta _{j}\) is the model hypothesis generated from \(\mathcal {M}_{j}\), and its consensus set is denoted \(I_{j}\). \(\mathcal {U}\) is the ranked measurement set, and \(\mathcal {U}_n\) denotes its top n measurements. The whole procedure of ARSAC is described in Algorithm 1.

Algorithm 1. The ARSAC algorithm.
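Since the algorithm listing is not reproduced here, the sketch below illustrates, under our reading of Sect. 3, the main ARSAC loop on a toy 2D line-fitting instance (m = 2). All function names, the toy model, and the simplified re-ranking rule are ours, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_line(pts):
    """Fit a normalized line a*x + b*y + c = 0 through two points."""
    (x1, y1), (x2, y2) = pts
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    norm = np.hypot(a, b) + 1e-12
    return np.array([a, b, c]) / norm

def residuals(model, pts):
    """Point-to-line distances for an (N, 2) array of points."""
    return np.abs(pts @ model[:2] + model[2])

def arsac_line(pts, quality, thresh=0.05, max_trials=500, m=2):
    order = np.argsort(-quality)          # ranked measurements U
    best_model, best_count = None, 0
    n = m                                 # size of the top subset U_n
    for _ in range(max_trials):
        # the n-th ranked point must be in the sample; rest from U_{n-1}
        idx = np.append(rng.choice(order[:n - 1], m - 1, replace=False),
                        order[n - 1])
        model = fit_line(pts[idx])
        inliers = residuals(model, pts) < thresh
        if inliers.sum() > best_count:
            best_model, best_count = model, int(inliers.sum())
            # simplified adaptive re-ranking: reorder the remaining
            # measurements by their residuals under the new best model
            rest = order[n:]
            if len(rest):
                r = residuals(model, pts[rest])
                order = np.concatenate([order[:n], rest[np.argsort(r)]])
        n = min(n + 1, len(pts))          # progressively expand U_n
    return best_model, best_count
```

A full implementation would additionally draw \(S^{'}_{n}\) samples per subset size (Eq. (3)), apply the \(\beta \)-fraction reliability test of Sect. 3.1, enforce the constraint circle of Eq. (4), and use the stopping criteria of Sect. 3.2 instead of a fixed trial budget.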

3.1 Adaptively Ranked Progressive Sampling

In ARSAC, we adopt non-uniform sampling and improve the scheme by adaptively updating the ranking of the measurements. We first describe the sampling process given a ranked measurement set, then introduce the strategy for updating the ranked measurements with posterior information and with a geometric constraint, respectively.

Non-uniform Sampling with Ranked Measurements. Given a set of ranked measurements \(\mathcal {U}\), it is difficult to decide which measurements should be selected, or how many times each should be selected. We therefore use a progressive scheme inspired by [7]: a subset \(\mathcal {U}_{n}\) containing the n top-ranked measurements is chosen for drawing minimal samples, and as sampling continues, \(\mathcal {U}_{n}\) gradually grows. The scheme is designed so that it eventually draws the same samples as RANSAC does, despite the non-uniform sampling strategy. Let \(S_{N}\) be the total number of samples drawn in standard RANSAC, and let \(S_{n}\) denote the average number of those samples containing only measurements from \(\mathcal {U}_{n}\):

$$\begin{aligned} S_{n}=S_{N}\frac{\binom{n}{m}}{\binom{N}{m}}=S_{N}\prod _{i=0}^{m-1}\frac{n-i}{N-i} \;. \end{aligned}$$
(2)

Since the samples counted in \(S_{n}\) overlap with those in \(S_{n-1}\), the number of new samples \(S^{'}_{n}\) that must be drawn for the current \(\mathcal {U}_{n}\) is

$$\begin{aligned} S^{'}_{n}=\lceil S_{n}-S_{n-1} \rceil , \end{aligned}$$
(3)

where \(\lceil \bullet \rceil \) denotes the ceiling operation. Note that each new sample drawn from \(\mathcal {U}_{n}\) follows the principle that the nth measurement of \(\mathcal {U}_{n}\) must be selected, while the remaining \(m-1\) measurements are randomly chosen from \(\mathcal {U}_{n-1}\). Once \(S^{'}_{n}\) samples have been drawn from the current \(\mathcal {U}_{n}\), n increases by 1, and \(\mathcal {U}_{n}\) expands by including the best inlier from the remaining measurements \(\mathcal {U}\backslash \mathcal {U}_{n}\).
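The following is a small sketch, under our reading of Eqs. (2) and (3), of the resulting growth schedule; the closed-form ratio \(S_{n}/S_{n-1} = n/(n-m)\) follows directly from Eq. (2). The function name and the floor of one sample per step are our choices.

```python
import math

def growth_schedule(N: int, m: int, S_N: int):
    """Yield (n, S'_n): new samples to draw from each top subset U_n."""
    # S_m = S_N / C(N, m), the starting value at n = m (Eq. (2))
    S_prev = S_N * math.prod((m - i) / (N - i) for i in range(m))
    yield m, max(1, math.ceil(S_prev))
    for n in range(m + 1, N + 1):
        S_n = S_prev * n / (n - m)        # ratio S_n / S_{n-1} from Eq. (2)
        yield n, math.ceil(S_n - S_prev)  # Eq. (3)
        S_prev = S_n
```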

Measurements Updating. Measurement updating in ARSAC is coupled with the sampling process. The ranking is initialized by evaluating isolated measurements [16] and is then updated whenever \(\mathcal {U}_{n}\) is about to expand (see Sect. 3.1). We assume that the current best model \({\theta ^{\star }_{j}}\) evaluates the quality of the remaining measurements \(\mathcal {U}\backslash \mathcal {U}_{n}\) more reliably than the initial ranking does. In ARSAC, the best inlier \(x^{\star }\) of \({\theta ^{\star }_{j}}\) is determined from the residuals of the model, and its position in the current quality ranking is used to judge the reliability of that ranking. If \(x^{\star }\) does not lie in the top \(\beta \) fraction of \(\mathcal {U}\backslash \mathcal {U}_{n}\), the existing ranking is regarded as unreliable and \(x^{\star }\) is inserted immediately after \(\mathcal {U}_{n}\), which yields an updated \(\mathcal {U}\) that better describes the latent inliers. The default value of \(\beta \) is 0.05.
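A hedged sketch of this reliability test follows: when \(\mathcal {U}_{n}\) is about to expand, check whether the best inlier of the current best model is already near the top of the remaining ranking; if not, promote it. The function signature and the residual inputs are our illustrative choices.

```python
import numpy as np

def update_ranking(order, n, res_remaining, beta=0.05):
    """order: current ranking (measurement indices); res_remaining:
    residuals of the current best model on the remainder order[n:]."""
    rest = order[n:]
    best_pos = int(np.argmin(res_remaining))   # best inlier among the rest
    if best_pos >= max(1, int(beta * len(rest))):
        # ranking judged unreliable: insert the best inlier right after U_n
        promoted = rest[best_pos]
        rest = np.delete(rest, best_pos)
        order = np.concatenate([order[:n], [promoted], rest])
    return order
```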

We further enhance the sampling strategy with a simple constraint to alleviate degeneracy. Inspired by the situations illustrated in [9], we assume that measurements leading to degeneracy tend to cluster in limited areas of the image. ARSAC therefore maintains a constraint circle \(C_{\mathcal {U}_{n}}\) that contains all measurements in \(\mathcal {U}_{n}\). Its center \(c_{\mathcal {U}_{n}}\) lies at the centroid of the measurements in \(\mathcal {U}_{n}\), and its radius \(r_{\mathcal {U}_{n}}\) is set to

$$\begin{aligned} r_{\mathcal {U}_{n}}=\max _{x_{i} \in \mathcal {U}_{n}} \Vert x_{i}-c_{\mathcal {U}_{n}} \Vert _{2} + \lambda , \end{aligned}$$
(4)

where \(\lambda \) is a step factor controlling the extra expansion of the circle. A new measurement to be included in \(\mathcal {U}_{n}\) must lie outside \(C_{\mathcal {U}_{n}}\) in the image, so as to avoid the degeneracy that results from the clustering nature of degenerate measurements.
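A minimal sketch of the constraint-circle test of Eq. (4); `pts_Un` holds the image positions of the measurements already in \(\mathcal {U}_{n}\), and the default for `lam` (the step factor \(\lambda \)) is our placeholder.

```python
import numpy as np

def outside_constraint_circle(pts_Un, candidate, lam=10.0):
    """True if the candidate point lies outside the circle C_{U_n}."""
    center = pts_Un.mean(axis=0)                                  # c_{U_n}
    radius = np.linalg.norm(pts_Un - center, axis=1).max() + lam  # Eq. (4)
    return np.linalg.norm(candidate - center) > radius
```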

3.2 Stopping Criteria

To obtain an optimal model after a number of trials, the stopping criteria of ARSAC are built on three constraints: a maximality constraint, a non-randomness constraint, and a geometry constraint.

The maximality constraint guarantees that after \(k_{\eta }\) trials, the probability that another model with a larger consensus set exists falls below a certain threshold \(\eta \):

$$\begin{aligned} (1-\epsilon ^{m}_{0})^{k_{\eta }} \le {\eta }, \end{aligned}$$
(5)

where \(\epsilon _{0}\) denotes the inlier rate of the measurements. The number of trials k must exceed \(k_{\eta }\) before the algorithm may terminate.
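Solving Eq. (5) for \(k_{\eta }\) gives a simple check; the default for \(\eta \) below is illustrative, not from the paper.

```python
import math

def maximality_satisfied(k: int, eps0: float, m: int, eta: float = 0.01) -> bool:
    """Eq. (5): stop only once k exceeds k_eta = log(eta)/log(1 - eps0^m)."""
    k_eta = math.log(eta) / math.log(1.0 - eps0 ** m)
    return k >= k_eta
```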

The non-randomness constraint keeps below a certain threshold \(\varPsi \) the probability that a bad model is supported by a consensus set merely by chance. The cardinality g of the consensus set of a wrong model follows the binomial distribution \(B(n, \beta )\):

$$\begin{aligned} P^{n}(g)=\beta ^{g-m}(1-\beta )^{n-g+m}\binom{n-m}{g-m}, \end{aligned}$$
(6)

where \(\beta \) is the probability that a measurement in \(\mathcal {U}_{n}\) is misclassified as an inlier by a wrong model. Thus, for each n, the minimal size of the consensus set \(L^{n}_{min}\) is

$$\begin{aligned} L^{n}_{min}=\min \{j: \sum ^{n}_{g=j}P^{n}(g) < \varPsi \} \; . \end{aligned}$$
(7)

To save computation, we further assume that the subset \(\mathcal {U}_{n}\) is large enough for the distribution to be approximated by a Gaussian, according to the central limit theorem:

$$\begin{aligned} B(n,\beta ) \sim N(\mu ,\sigma ), \end{aligned}$$
(8)

where \(\mu =n\beta \) and \(\sigma =\sqrt{n\beta (1-\beta )}\). Eq. (7) can then be evaluated via the Chi-square distribution, so the minimal consensus size becomes

$$\begin{aligned} L^{n}_{min}=\lceil m+\mu +\sigma \sqrt{\chi ^{2}} \rceil , \end{aligned}$$
(9)

where \(\chi ^{2}\) is determined by the threshold \(\varPsi \). The trials of ARSAC do not stop while the size of the best consensus set is no larger than \(L^{n}_{min}\).
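A sketch of Eq. (9) under the Gaussian approximation. Since the Chi-square quantile with one degree of freedom equals the squared standard-normal quantile, we use the one-sided normal tail quantile from the standard library in place of \(\sqrt{\chi ^{2}}\); this mapping from \(\varPsi \) to the quantile, and the default values, are our reading rather than the paper's code.

```python
import math
from statistics import NormalDist

def min_consensus_size(n: int, m: int, beta: float = 0.05,
                       psi: float = 0.05) -> int:
    """Eq. (9): minimal non-random consensus size for subset size n."""
    mu = n * beta
    sigma = math.sqrt(n * beta * (1.0 - beta))
    z = NormalDist().inv_cdf(1.0 - psi)   # plays the role of sqrt(chi^2)
    return math.ceil(m + mu + sigma * z)
```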

The geometric constraint judges whether the measurements in \(\mathcal {U}_{n}\) cover an adequate range. Let \(D_{\mathcal {U}_{n}}\) be the bounding box of the measurements in \(\mathcal {U}_{n}\), and \(D_{total}\) the bounding box of all measurements in the image. We terminate the trials when the following condition is met:

$$\begin{aligned} \frac{D_{\mathcal {U}_{n}}}{D_{total}}>r_{range}, \end{aligned}$$
(10)

where \(r_{range}\) is the acceptable ratio between the two bounding boxes.
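A sketch of the test in Eq. (10); comparing bounding-box areas is our assumption, as the paper does not spell out the measure, and the default for `r_range` is a placeholder.

```python
import numpy as np

def spread_sufficient(pts_Un, pts_all, r_range=0.5):
    """Eq. (10): compare the bounding box of U_n with that of all points."""
    def area(p):
        w, h = p.max(axis=0) - p.min(axis=0)
        return w * h
    return area(pts_Un) / area(pts_all) > r_range
```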

4 Experiments

In this section, we conduct experiments on real-world data to verify the efficiency and robustness of ARSAC. We evaluate the algorithms on two well-known model estimation problems: fundamental matrix (F) estimation and homography (H) estimation. Both problems require an effective outlier-removal process to guarantee the accuracy of the final solution. The fundamental matrix constrains the 3D spatial relationship between two views, and the homography describes the transformation between two planar objects. Solutions to both problems can be obtained by solving least-squares problems. The minimal sample size is 7 for the fundamental matrix and 4 for the homography. As discussed in Sect. 2, several algorithms improve efficiency during measurement sampling, so besides RANSAC we also compare the proposed ARSAC with the following state-of-the-art algorithms: NAPSAC, PROSAC, and GroupSAC. We implement all algorithms in Matlab, and all evaluations are performed on an Intel i7 CPU with 32 GB RAM.

We tested the performance of ARSAC on real-world images for both model estimation problems. The datasets are provided by [22]; each presents various challenges such as a low inlier rate and degeneracy. In the experiments, datasets A \(\sim \) D are selected for fundamental matrix estimation, while datasets E \(\sim \) H are used for homography estimation. Since the ground-truth inliers are unknown for real-world images, we approximate them by performing \(10^{6}\) trials of random sampling. The baseline comparison is shown in Table 1. For each algorithm, the table lists the number of found inliers (I), the number of trials (k), and the total runtime (time) in milliseconds. The error is the Sampson error [12], averaged over 500 executions of each algorithm. As Table 1 shows, in most cases ARSAC needs fewer trials and less time on average than the other algorithms. Moreover, the average error of ARSAC remains the lowest among the efficiency-driven algorithms. It is worth noting that RANSAC can find more inliers and estimate models with lower error, but this performance builds on a large number of trials, which is exactly what we aim to avoid.
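For reference, the sketch below is a standard implementation of the Sampson error used as the accuracy measure above (for the fundamental matrix case); `x1` and `x2` are assumed to be homogeneous 3×N arrays of corresponding points.

```python
import numpy as np

def sampson_error(F, x1, x2):
    """First-order geometric error of correspondences under F (per pair)."""
    Fx1 = F @ x1                           # epipolar lines in image 2
    Ftx2 = F.T @ x2                        # epipolar lines in image 1
    num = np.sum(x2 * Fx1, axis=0) ** 2    # (x2^T F x1)^2 per correspondence
    den = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return num / den
```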

To compare the efficiency of the methods intuitively, we report the minimum number of trials each method needs for its error to fall below a certain threshold. Since ARSAC, PROSAC, and GroupSAC perform far better in efficiency than NAPSAC, we compare only these three algorithms, as shown in Fig. 2. As the diagrams illustrate, all methods achieve satisfactory efficiency in some cases, but in specific cases, e.g., low inlier rate or degeneracy, ARSAC converges fastest compared with the other methods, which lack the adaptively ranked scheme.

Table 1. Baseline comparison of ARSAC with other algorithms on real-world data.
Fig. 2. Average number of trials needed by each algorithm to reach the predefined error threshold \(\delta \). The labels on the X-axis correspond to the datasets in Table 1; the Y-axis denotes the minimal number of trials each algorithm needs for the model error to fall below \(\delta \), where \(\delta \) is set to 3.0. The plots show average values over 500 runs.

Table 2. Degenerate cases for fundamental matrix estimation.

We further compare the robustness of ARSAC with the other algorithms. Table 2 shows the number of degenerate cases (\(k_{Deg}\)) among the total trials (k) in fundamental matrix estimation on the real-world datasets A \(\sim \) D, with each algorithm executed 500 times. ARSAC performs best, with very few degenerate cases on every dataset, thanks to its geometric constraint that samples measurements with a broad spatial distribution. The true inlier rate of each algorithm's consensus set also demonstrates the robustness of ARSAC, since the optimal consensus set determines the final model. As shown in Fig. 3, ARSAC prevails over all other non-uniform sampling algorithms. At the same time, the true inlier rate of ARSAC reaches the same level as that of RANSAC, with a significant improvement in efficiency.

In summary, in the context of efficiency-driven robust model fitting, algorithms such as PROSAC and NAPSAC improve efficiency considerably, but the error of the estimated model is often too large to be acceptable. GroupSAC performs well in many cases, but it can suffer from degeneracy in fundamental matrix estimation for scenes with dominant planes. Among all the algorithms, ARSAC delivers the best performance in terms of both efficiency and robustness, providing a better solution for robust model fitting problems.

Fig. 3. Fraction of true inliers returned by each algorithm for the fundamental matrix and homography estimation problems. The labels on the X-axis correspond to the datasets in Table 1; the Y-axis denotes the inlier rate of each dataset as estimated by the different algorithms. The plots show average values over 500 runs.

5 Conclusion

We present ARSAC, a novel variant of RANSAC that draws non-uniform samples from adaptively ranked measurements. At each trial, ARSAC selects the highest-quality measurements into the new sample set. We also propose a mild geometric constraint to alleviate degeneracy. Our algorithm is capable of handling measurements with low inlier rates in model estimation and is shown to be more efficient and robust than state-of-the-art algorithms. Though proved effective, our algorithm may be sensitive to user-provided parameter settings. In future work, we plan to explore a parameter-free strategy that retains the current performance while simplifying usage.