Abstract
Constructing a seamless image mosaic from images taken from noticeably different viewpoints, or of scenes containing multiple distinct planes, is an important but challenging problem. Existing image-stitching approaches have two main limitations: (a) the multi-plane nature of scenes is not well considered in the image alignment step, which usually results in obvious misalignments; (b) the alignment errors that are ignored often lead to broken structures in the seam composition step. To overcome these problems, this paper proposes a smoothly planar homography model for image stitching that takes the multi-plane geometry of natural scenes into account. First, we integrate local warps estimated in each plane to achieve smooth plane-based stitching. Then, we introduce a novel alignment-guided seam composition to handle parallax. Experimental results on a series of challenging data demonstrate that our model achieves state-of-the-art stitching performance.
G.-S. Xia—This work was supported by the NSFC Grants under the contract No. 41501462 and No. 61771350, and the Outstanding Youth Project of Hubei Province under the contract No. 2017CFA037.
1 Introduction
Image stitching has been extensively studied recently and applied in many fields, such as scene understanding [31], virtual reality [12], photogrammetry and remote sensing [10]. However, stitching methods often operate under the assumptions [22] that the imaged scene is approximately planar, or that the images are taken under simple camera rotations. These conditions do not always hold in practice, especially for photos taken by smartphones or hand-held cameras, as demonstrated in Fig. 1. The main challenges are:
- Global warps [2] and even local warps [28] have difficulty handling complex scenes with different dominant planes. The former adopt only one transformation, which lacks the flexibility needed for complex scenes. The latter often ignore the different planes in the scene, which causes large alignment errors.
- Existing methods do not work well on images with large parallax, caused by random shooting positions and viewing angles [30]. Thus, they inevitably produce noticeable artifacts or objectionable distortions.
Many approaches have been proposed to solve these problems. The main solution is spatially varying warps, e.g. multiple local warps [7] or a global warp with mesh optimization [29, 30], which provide flexible warps to handle images with moderate parallax. However, these methods depend heavily on the number and distribution of point correspondences. In addition, distortions resulting from non-linear transformations [9], e.g. projective and structural deformations, are often obvious. Many methods have been developed to mitigate distortions, such as similarity transformation constraints [4, 5, 14, 25] or geometric structure cues [24, 26, 30]; however, the reduction is limited for scenes with rich contents and structures. Besides, large parallax remains challenging for these methods [15].
Another solution is seam-assist image stitching, which has the advantage of dealing with large parallax. The common approach is to perform seam cutting after image alignment to hide the inevitable ghosting or artifacts [7, 30]. The seam line is often selected using color or gradient differences, image edges, etc., while the influence of alignment is rarely considered [15, 30]. Seam cutting can also be closely integrated with alignment [8, 15, 29]. The main idea of these methods is that images need to be aligned well only in the local area that the seam line crosses. Seam quality assessment is used to guide the selection of a homography estimated from a set of point correspondences. In effect, these methods rely on the selection of a local homography or correspondence set. In some complex scenes, the optimal selection is difficult to find if the local alignment region contains multiple planes, because these methods use only one homography for the whole scene.
To the best of our knowledge, few works consider how to deal with scenes with strong structural regularities in the form of multiple distinct planes. Because one global or local homography cannot fit a complex scene, dual-homography warping [7] clusters the match points into two groups to estimate two homographies for scenes containing two predominant planes: a distant plane and a ground plane. However, the difference between this rough plane partition and the true planar structure of the scene may cause misalignment and structure deformations, and performance may degrade in complex scenes with more than two distinct planes.
Therefore, this paper proposes a smoothly planar homography model for image stitching. To obtain the plane warps, we propose to automatically detect plane points and segment the scene into piecewise planar regions. Then adaptive plane-based warps are estimated and integrated to perform local alignment. Once the images are geometrically aligned, a misalignment-guided seam is calculated to perform seamless stitching. This model can handle more than two distinct planes with large parallax. Figure 1 gives an example of the proposed method. Thus, the contribution of this paper is twofold:
- We propose a multi-plane homography estimation and integration strategy to handle complex scenes with multiple dominant planes and achieve plausible stitching.
- We propose a novel seam estimation method guided by alignment error to deal with parallax, which provides seamless image stitching.
2 Related Works
Numerous works have been devoted to image stitching. An exhaustive review is given in [22]. Here, we give a brief survey of related works.
Global Parametric Models. Early methods adopt global parametric warps (e.g. affine or projective warps) to align images. Their performance degrades when images are taken from different viewpoints or scenes are not roughly planar. To remedy the deficiency of a single warp, Gao et al. [7] proposed a dual-homography warp to stitch images. However, it only fits simple scenes with two planes, a ground plane and a distant plane.
Spatially Varying Warps. Spatially varying warps have been proposed to handle complex scenes. Followed by composition techniques, these methods work well for images with moderate parallax. They can be roughly classified into two categories: local warps and mesh optimization-based warps. The former estimate multiple local transformations to align images locally, such as smoothly varying affine warps [17], shape-preserving half-projective (SPHP) warps [4], and as-projective-as-possible (APAP) warps [28] and their variants [5, 14, 18]. The latter apply a mesh optimization model with a series of feature constraints after general warps, such as feature alignment [23, 27, 30] and photometric alignment [16]. These methods do not account for the particularity of multi-plane scenes [19], namely that different plane regions undergo different transformations, and thus they may fail to produce satisfactory stitching results.
Seam-Assist Stitching Methods. To stitch images with large parallax, several seam-assist methods have been proposed. Unlike methods that perform seam cutting after image alignment [13], Gao et al. [8] proposed a seam-driven image stitching method, which evaluates the seam-cut quality to guide the selection of the optimal transformation. Building on it, the parallax-tolerant stitching model [29] and the seam-guided local alignment model [15] were proposed to improve stitching performance. However, these methods may align only one local region at a time, and the applied seam may accidentally pass through other regions with large misalignments.
3 Smoothly Planar Stitching
The proposed stitching algorithm is illustrated in Fig. 2. The planar regions are estimated based on the detection of planar points, and the multiple planar homographies are then integrated by the designed weighting strategy for smooth stitching. To handle parallax, alignment errors are used to guide the seam line estimation for seamless composition.
3.1 Planar Region Estimation
For real scenes with multiple planes, we use a robust multi-structure geometric fitting method, called the random cluster model sampler (RCMSA) [20], to detect planes from the point correspondences. RCMSA adopts random cluster models to perform hypothesis generation using subsets larger than minimal. Compared with purely random hypothesis generation, RCMSA provides good hypotheses, which are less affected by the vagaries of fitting on minimal subsets.
For two views of a multi-plane scene, we are given N point matches \(P=\{{p}_i\}_{i=1}^{N}\) across the two images, where each \({p}_i=(\mathbf {x}_i, \mathbf {x}_i^{'})\) denotes a pair of match points in homogeneous coordinates. RCMSA partitions the match points into different planes (structures) and removes false matches. The number of structures is unknown and must also be estimated.
Basically, RCMSA works as follows. A random cluster model is first used as the hypothesis sampler to generate clusters for the hypotheses \(\varTheta = \{\theta _c\}_{c=1}^K\). Next, an annealing method based on graph cuts is employed to optimize the fitting of structures. The graph \( \mathcal {G}=(\mathcal {V}, \mathcal {N})\) is built on the match points, where the vertex set is \(\mathcal {V}=P\) and the edge set \(\mathcal {N}\) is constructed from the Delaunay triangulation of P. The goal is to assign each pair of match points \({p}_i\) to one of the structures in \(\varTheta \), denoted by labels \(L = \{l_i\}_{i=1}^N\). That is, \(l_i = k\), \(k \in \{1,2,...,K\}\), if \({p}_i\) belongs to the \(k\)-th structure, and \(l_i = 0\) if \({p}_i\) is an outlier. The energy function is defined as
where \(D({p}_i, l_i)\) is the data cost and is constructed as
where \(r({p}_i,{\theta _{l_i}})\) is the absolute residual of \({p}_i\) to structure \(\theta _{l_i}\), and \(\eta \) is the penalty if \({p}_i\) is an outlier. The smoothness cost V is defined as
The solution \(L = \{l_i\}\) can be obtained by \(\alpha \)-expansion [1].
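Collecting the definitions above, one form of the labeling energy that is consistent with them can be written as follows. This is a sketch under assumptions rather than the exact published formula: the two costs are summed without a weighting factor, and V is taken to be a Potts penalty, as is common for graph-cut labeling solved by \(\alpha \)-expansion [1].

\[ E(L) = \sum _{i=1}^{N} D({p}_i, l_i) + \sum _{(i,j) \in \mathcal {N}} V(l_i, l_j), \]

\[ D({p}_i, l_i) = \begin{cases} r({p}_i, \theta _{l_i}), & l_i \in \{1, \dots , K\}, \\ \eta , & l_i = 0, \end{cases} \qquad V(l_i, l_j) = \begin{cases} 0, & l_i = l_j, \\ 1, & l_i \ne l_j. \end{cases} \]

Under this form, and ignoring the smoothness term, labeling a match as an outlier is cheaper than assigning it to a structure whenever its residual to that structure exceeds \(\eta \).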
In our implementation, RCMSA is applied iteratively to the remaining outliers, until the number of outliers is small enough or few new plane points are detected. To refine the detection of plane points, the projective distance is used to adjust the plane labels of points. If the projective distance of a point under the \(k\)-th planar homography \(\mathbf {H}_k\) is less than \(\delta \), the point is reassigned to that plane, \(l_i = k\), where \(\mathbf {H}_k\) is estimated from the correspondences in plane \(\theta _k\). In this way, the points are labeled by plane.
One simple way is to warp each plane by its corresponding transformation; however, there may be gaps between the plane regions, or plane regions may overlap. Instead, we partition the images into two kinds of regions: plane regions and transition regions. For plane regions, we adopt the homography estimated from the point correspondences belonging to the current plane. Transition regions are transformed by the local weighted homography, detailed below, so as to keep continuity along the boundary of neighboring plane regions. Here, the neighborhood of each plane point, e.g. pixels at a distance less than \(\varepsilon \), is regarded as part of the plane region, and the rest is the transition region. Figure 3 shows the detection of plane points by RCMSA and the partition of plane regions.
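The label refinement step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, homographies are plain 3x3 nested lists, and when several planes fall below \(\delta \) the sketch assumes the one with the smallest projective distance wins, which the text does not specify.

```python
import math


def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def transfer_error(H, src, dst):
    """Euclidean distance between the warped source point and its match."""
    u, v = apply_homography(H, src)
    return math.hypot(u - dst[0], v - dst[1])


def refine_labels(matches, homographies, labels, delta):
    """Reassign matches to a plane whose projective distance is below delta.

    Labels use 0 for outliers and k = 1..K for the k-th plane.
    Assumption: ties between planes are resolved by the smallest distance.
    Matches that are not within delta of any plane keep their label.
    """
    refined = list(labels)
    for i, (src, dst) in enumerate(matches):
        errors = [transfer_error(H, src, dst) for H in homographies]
        k = min(range(len(errors)), key=errors.__getitem__)
        if errors[k] < delta:
            refined[i] = k + 1
    return refined
```

For example, with an identity homography and a pure translation as the two planes, a match displaced by that translation is moved from the outlier label to the second plane, while a match far from both stays an outlier.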
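The partition into plane and transition regions can be sketched per pixel as below. This is a brute-force illustration under stated assumptions: `classify_pixel` and its arguments are hypothetical names, plane points are given as a dictionary from plane label to a list of 2D points, and a pixel within \(\varepsilon \) of several planes is assigned to the nearest one.

```python
import math


def nearest_point_distance(pixel, points):
    """Distance from a pixel to the closest point in a list of 2D points."""
    return min(math.hypot(pixel[0] - x, pixel[1] - y) for x, y in points)


def classify_pixel(pixel, plane_points, epsilon):
    """Return the plane label of a pixel, or 0 if it lies in a transition region.

    plane_points maps each plane label (1..K) to its detected plane points.
    A pixel belongs to a plane region when its nearest plane point is closer
    than epsilon; otherwise it is treated as a transition pixel.
    """
    best_label, best_dist = 0, float("inf")
    for label, points in plane_points.items():
        d = nearest_point_distance(pixel, points)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist < epsilon else 0
```

A practical implementation would likely replace the brute-force search with a distance transform or a spatial index, since the check is repeated for every pixel.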
3.2 Smoothly Planar Homography
For transition regions, the local weighted homography is employed to maintain continuity and smoothness between neighboring plane regions. Given a pixel p in a transition region, the warp is estimated as
where \(\mathbf {H}_i\) represents the homography of the \(i\)-th plane, K is the number of plane regions, and \(\alpha _i\) denotes the weight that adjusts the contribution of each plane homography. The weight is computed based on spatial proximity with a Gaussian kernel,
where \(d_i\) denotes the distance to the closest pixel in the \(i\)-th planar region, and \(\sigma \) is set between 4 and 8. To mitigate projective distortions, the global similarity constraint proposed in [24] is employed by integrating it with the local homography. The procedure of smoothly planar homography is given in Algorithm 1.
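A weighted combination of plane homographies for one transition pixel can be sketched as follows. The exact formulas are not reproduced here, so this rests on assumptions: the weights are taken as \(\exp (-d_i^2/\sigma ^2)\) normalized to sum to one, the homographies are combined entry-wise, and each homography is assumed to be scaled consistently (for instance with its bottom-right entry equal to 1). The function names are hypothetical, and the similarity constraint of [24] is not included.

```python
import math


def gaussian_weights(distances, sigma):
    """Normalized Gaussian weights from distances to each planar region.

    The smallest squared distance is subtracted in the exponent; this leaves
    the normalized weights unchanged but avoids all weights underflowing to
    zero when the pixel is far from every plane.
    """
    d_min_sq = min(d * d for d in distances)
    raw = [math.exp(-(d * d - d_min_sq) / (sigma * sigma)) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def blend_homographies(homographies, weights):
    """Entry-wise weighted sum of 3x3 homographies given as nested lists.

    Assumption: all homographies share a common normalization, since a
    homography is only defined up to scale.
    """
    return [
        [sum(w * H[r][c] for H, w in zip(homographies, weights)) for c in range(3)]
        for r in range(3)
    ]
```

A pixel equidistant from two planes receives equal weights, so its warp is the average of the two plane homographies; as the pixel approaches one plane, the blended warp tends to that plane's homography, which is what keeps the warp continuous across region boundaries.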
3.3 Alignment-Guided Seamless Composition
After alignment, seam cutting plays an important role in producing a seamless mosaic, especially in cases of large parallax. To search for the optimal seam line between two images, the differences in image color, gradient and edge map [15, 29] in the overlapping region are often used to construct the smoothness terms in the graph cut seam algorithm [11].
In fact, alignment error has a great influence on seam finding [30]. Pixels with large misalignment but similar colors can confuse seam cutting and produce bad seams. A plausible seam should traverse low-texture and inconspicuous regions, and avoid passing through pixels with large alignment errors or distinct structures such as edges. Therefore, we propose to integrate alignment error and edge difference to generate good seams.
For each match point, the alignment error is calculated as
where \((\mathbf {x}_i, \mathbf {x}_i^{'})\) is a pair of match points and \(\mathbf {H}\) is the corresponding plane homography.
From the point alignment errors, we generate a per-pixel error map by interpolation,
where \(w_{p,x}\) is the weight factor calculated from the distance between the pixel p in the overlapping region and the match point x, and \(\rho \) is a scale parameter, set to 8. The interpolation uses the M match points closest to the pixel p. To reduce the influence of large alignment errors, we define the alignment term as
where \(\tau \) is set to 0.003D, and D denotes the length of the image diagonal. The smoothness cost function is
where \(E_c\) is the color difference and \(E_e\) denotes the image edge probability difference computed by the structured edge detector [6]. The smoothness cost is incorporated into the graph cut seam finding algorithm [11] to search for a good seam. Multi-band blending [3] is then applied.
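For reference, a common way to write such a point alignment error is the transfer error below. It is given as an assumption consistent with the definitions above, not necessarily the exact published formula; the symbols \(e_i\) and \(\pi (\cdot )\) are introduced here for illustration, with \(\pi \) converting homogeneous to inhomogeneous coordinates.

\[ e_i = \left\| \pi (\mathbf {x}_i^{'}) - \pi (\mathbf {H}\mathbf {x}_i) \right\| _2, \qquad \pi \big ((u, v, w)^{\top }\big ) = (u/w,\; v/w)^{\top }. \]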
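The interpolation of point alignment errors into a per-pixel error map, described earlier in this section, can be sketched as follows. This is an illustration under assumptions, not the published formula: the weights are taken as a Gaussian of the pixel-to-match distance with scale \(\rho \), normalized over the M nearest matches; the function name and the default value of `m` are hypothetical, while the default \(\rho = 8\) follows the text.

```python
import math


def interpolate_error(pixel, match_points, match_errors, rho=8.0, m=5):
    """Gaussian-weighted average of the errors of the m matches nearest to pixel.

    match_points are 2D locations of match points in the overlapping region
    and match_errors are their alignment errors, in the same order.
    """
    # Pair each match with its distance to the pixel and keep the m closest.
    nearest = sorted(
        (math.hypot(pixel[0] - x, pixel[1] - y), e)
        for (x, y), e in zip(match_points, match_errors)
    )[:m]
    # Subtracting the smallest squared distance keeps the weights from
    # underflowing far from all matches without changing the weighted mean.
    d_min_sq = nearest[0][0] ** 2
    weights = [math.exp(-(d * d - d_min_sq) / (rho * rho)) for d, _ in nearest]
    total = sum(weights)
    return sum(w * e for w, (_, e) in zip(weights, nearest)) / total
```

Evaluating this function at every pixel of the overlapping region yields the error map; the truncation by \(\tau \) and the combination with the color and edge terms are not shown, since their exact form is not reproduced in the text.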
4 Experiments
To verify the effectiveness of the proposed method, we test our algorithm on a series of challenging data and compare it with other stitching methods. The parameters of the compared methods are set as recommended in the respective papers. Given a pair of images, the keypoints are detected and matched by the deep matching algorithm [21] in our implementation.
4.1 Warping Performance
Figure 4 compares the warping performance with two other common warp models, namely global warping and local warping. Here, global homography [2] and APAP warps [28] are selected for comparison. After warping, the proposed alignment-guided seam composition is applied to these stitching results for seamless composition. Figure 4(a) shows the result of the global method, which applies a single global homography to warp images. On the one hand, a scene with multiple distinct planes cannot be represented by only one transformation, which results in severe misalignments. On the other hand, even though seam cutting is applied, it cannot find well-aligned regions to pass through in some areas. Thus, the seam passes through misaligned regions and produces broken structures. APAP adopts multiple local homographies to align as many point matches as possible and improves the stitching performance, e.g. the green region in Fig. 4(b). However, due to the adverse influence of point matches in different planes (blue region) or uneven and insufficient points (green region), it is hard to obtain an accurate warping model for good alignment, which results in stitching errors. Our smoothly planar homography adopts two different warping models to align planar and transition regions, which provides satisfactory local alignment. Together with our alignment-guided seam composition, the estimated seam finds locally well-aligned regions and can avoid regions with large parallax.
4.2 Composition Performance
Figure 5 shows the seam composition performance of Enblend, our method without alignment guidance, and our method with alignment guidance. In the enlarged views, Enblend produces severe seam errors, e.g. the missing buoy and the distortions of the construction. In fact, Enblend only considers the color difference and gradient difference, and may therefore suffer from ghosting or errors. By adding edge or boundary constraints, our method without alignment guidance provides a relatively better result, but the seam error is still obvious, mainly because of the large mismatch on the red concrete columns. With the constraint of alignment error, the proposed seam composition avoids the regions with large alignment errors and provides satisfactory seam composition.
4.3 Comparison with Other Methods
Figure 6 gives the comparison with some spatially varying methods, including Image Composite Editor (ICE), APAP [28], SPHP [4] and our method. Some details are shown in enlarged views for comparison. Although ICE uses a global transformation, it provides a good stitching result because of its advanced image composition. However, alignment errors remain obvious, as shown in the red circle. APAP adopts local homographies to align as many correspondences as possible in the overlapping region. Thanks to rich correspondences, it provides satisfactory alignment, but it suffers from local distortions (shown in the red circle) caused by feature matches lying in multiple planes. SPHP produces obvious stitching errors, because the applied warps cannot represent the multi-plane image transformation well. The estimated seam may accidentally pass through regions with misalignments and thus generate broken structures. Our smoothly planar homography method uses different models to process planar regions and transition regions and thus aligns different planar regions well. Together with the alignment-guided seam composition, which finds locally well-aligned regions for composition, our method provides visually appealing stitching results.
Figure 7 provides the comparison with seam-assist stitching methods, including ICE and the parallax-tolerant stitching method [29]. In the ICE results, the seam cutting does not consider alignment errors and thus causes obvious broken structures. In parallax-tolerant stitching, the best homography is chosen for good local alignment. However, the applied seam may still be disrupted by large misalignments. In comparison, the proposed method provides satisfactory stitching results.
5 Conclusion
In this paper, we present a smoothly planar homography model for stitching images with multiple planes and large parallax. The plane and transition regions are detected based on the multi-plane correspondences and warped with their respective transformations. The multiple plane homographies are integrated to perform smooth stitching in transition regions. In addition, alignment-guided seam composition is adopted to perform seamless stitching. Experiments demonstrate the effectiveness and robustness of the proposed method and confirm its state-of-the-art stitching performance. In the future, more advanced plane detection methods may be beneficial for accurate detection of plane regions.
References
1. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)
2. Brown, M., Lowe, D.G.: Automatic panoramic image stitching using invariant features. Int. J. Comput. Vis. 74(1), 59–73 (2007)
3. Burt, P.J., Adelson, E.H.: A multiresolution spline with application to image mosaics. ACM Trans. Graph. 2(4), 217–236 (1983)
4. Chang, C.H., Sato, Y., Chuang, Y.Y.: Shape-preserving half-projective warps for image stitching. In: CVPR, Columbus, USA, pp. 3254–3261 (2014)
5. Chen, Y.-S., Chuang, Y.-Y.: Natural image stitching with the global similarity prior. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 186–201. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_12
6. Dollár, P., Zitnick, C.L.: Fast edge detection using structured forests. IEEE Trans. Pattern Anal. Mach. Intell. 37(8), 1558–1570 (2015)
7. Gao, J., Kim, S.J., Brown, M.S.: Constructing image panoramas using dual-homography warping. In: CVPR, Colorado Springs, CO, USA, pp. 49–56 (2011)
8. Gao, J., Li, Y., Chin, T.J., Brown, M.S.: Seam-driven image stitching. In: Eurographics, Girona, Spain, pp. 45–48 (2013)
9. Hu, J., Zhang, D.Q., Yu, H., Chen, C.W.: Multi-objective content preserving warping for image stitching. In: ICME, Turin, Italy, pp. 1–6 (2015)
10. Kang, Z., Zhang, L., Zlatanova, S., Li, J.: An automatic mosaicking method for building facade texture mapping using a monocular close-range image sequence. ISPRS J. Photogramm. 65(7), 282–293 (2010)
11. Kwatra, V., Schödl, A., Essa, I.A., Turk, G., Bobick, A.F.: Graphcut textures: image and video synthesis using graph cuts. ACM Trans. Graph. 22(3), 277–286 (2003)
12. Lee, J., Kim, B., Kim, Y., Kim, Y., Noh, J.: Rich360: optimized spherical representation from structured panoramic camera arrays. ACM Trans. Graph. 35(4), 63 (2016)
13. Li, L., Yao, J., Lu, X., Tu, J., Shan, J.: Optimal seamline detection for multiple image mosaicking via graph cuts. ISPRS J. Photogramm. 113, 1–16 (2016)
14. Lin, C.C., Pankanti, S., Ramamurthy, K.N., Aravkin, A.Y.: Adaptive as-natural-as-possible image stitching. In: CVPR, Boston, MA, USA, pp. 1155–1163 (2015)
15. Lin, K., Jiang, N., Cheong, L.-F., Do, M., Lu, J.: SEAGULL: seam-guided local alignment for parallax-tolerant image stitching. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 370–385. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_23
16. Lin, K., Jiang, N., Liu, S., Cheong, L.F., Do, M., Lu, J.: Direct photometric alignment by mesh deformation. In: CVPR, pp. 2405–2413 (2017)
17. Lin, W.Y., Liu, S., Matsushita, Y., Ng, T.T., Cheong, L.F.: Smoothly varying affine stitching. In: CVPR, Colorado Springs, CO, USA, pp. 345–352 (2011)
18. Liu, W.X., Chin, T.: Correspondence insertion for as-projective-as-possible image stitching. CoRR, arXiv:1608.07997 (2016)
19. Lou, Z., Gevers, T.: Image alignment by piecewise planar region matching. IEEE Trans. Multimedia 16(7), 2052–2061 (2014)
20. Pham, T., Chin, T., Yu, J., Suter, D.: The random cluster model for robust geometric fitting. IEEE Trans. Pattern Anal. Mach. Intell. 36(8), 1658–1671 (2014)
21. Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: Deepmatching: hierarchical deformable dense matching. Int. J. Comput. Vis. 120(3), 300–323 (2016)
22. Szeliski, R.: Image alignment and stitching: a tutorial. Found. Trends Comput. Graph. Vis. 2(1), 1–104 (2006)
23. Xia, G., Delon, J., Gousseau, Y.: Accurate junction detection and characterization in natural images. Int. J. Comput. Vis. 106(1), 31–56 (2014)
24. Xiang, T.Z., Xia, G.S., Bai, X., Zhang, L.: Image stitching by line-guided local warping with global similarity constraint. Pattern Recognit. 83, 481–497 (2018)
25. Xiang, T., Xia, G.S., Zhang, L.: Image stitching with perspective-preserving warping. In: XXIII ISPRS Congress, Prague, Czech Republic, pp. 287–294 (2016)
26. Xiang, T., Xia, G.S., Zhang, L., Huang, N.: Locally warping-based image stitching by imposing line constraints. In: ICPR, Cancun, Mexico, pp. 4178–4183 (2016)
27. Xue, N., Xia, G.S., Bai, X., Zhang, L., Shen, W.: Anisotropic-scale junction detection and matching for indoor images. IEEE Trans. Image Process. 27(1), 78–91 (2018)
28. Zaragoza, J., Chin, T.J., Tran, Q.H., Brown, M.S., Suter, D.: As-projective-as-possible image stitching with moving DLT. IEEE Trans. Pattern Anal. Mach. Intell. 36(7), 1285–1298 (2014)
29. Zhang, F., Liu, F.: Parallax-tolerant image stitching. In: CVPR, Columbus, OH, USA, pp. 3262–3269 (2014)
30. Zhang, G., He, Y., Chen, W., Jia, J., Bao, H.: Multi-viewpoint panorama construction with wide-baseline images. IEEE Trans. Image Process. 25(7), 3099–3111 (2016)
31. Zhang, Y., Song, S., Tan, P., Xiao, J.: PanoContext: a whole-room 3D context model for panoramic scene understanding. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 668–686. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10599-4_43
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Xiang, TZ., Xia, GS., Zhang, L. (2018). Image Stitching Using Smoothly Planar Homography. In: Lai, JH., et al. Pattern Recognition and Computer Vision. PRCV 2018. Lecture Notes in Computer Science, vol 11256. Springer, Cham. https://doi.org/10.1007/978-3-030-03398-9_45
DOI: https://doi.org/10.1007/978-3-030-03398-9_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03397-2
Online ISBN: 978-3-030-03398-9
eBook Packages: Computer Science, Computer Science (R0)