
1 Introduction

An object with patterns repeated in balance, such as rotation and bilateral reflection symmetry, is easily recognized against its background. A symmetric pattern on an object acts as a salient visual feature attracting human attention. Various types of symmetry (rotation, reflection, translation, etc.) are mathematically defined and represented by a set of similar patterns arranged under certain repetition rules. Symmetry is omnipresent in real-world objects such as snow crystals, faces, flowers, and butterflies, and in most human-made objects such as buildings, cars, and clothes. Symmetry has been studied in computer vision as a discriminative visual cue in object recognition, shape matching, and scene understanding [1]. Reflection symmetry is the most common and essential type, found almost everywhere in our surroundings. However, detecting reflection symmetry in real-world images is not a trivial task due to image noise, partial occlusion, perspective distortion, and the lack of robust features to support the symmetry. Many researchers have worked toward practical and robust reflection symmetry detection under various challenging conditions, as extensively summarized in [2].

Most symmetry detection methods are based on sparsely detected feature points, each describing its local neighborhood. Marola [3] introduces an algebraic technique for detecting planar bilateral symmetry in Euclidean space, fitting polynomials to the input image and detecting bilateral symmetry on the fitted polynomials. Prasad and Yegnanarayana [4] propose gradient vector flow and a symmetry saliency map for bilateral symmetry detection; they use edge gradients in order to be robust to illumination change. Mitra et al. [5] define general regularity in 3D geometry based on region-based matching. To identify potential regularity, they rigidly transform (rotation, reflection, etc.) each matched key point onto its counterpart, building meaningful transformation clusters. Feature-based methods such as Loy and Eklundh [6] find symmetry matches from sparse key points like SIFT, showing fast and robust symmetry detection on real-world images. However, appearance-based features frequently cannot be detected on low-textured objects, and as a result the symmetry pattern cannot be detected either. In such sparse feature-point-based symmetry detection methods, the quality of the extracted features is critical to the detection performance.

Feature detection and description are essential in many computer vision tasks, including symmetry detection. Symmetry detection methods based on sparse appearance features such as SIFT [7], MSER [8], and Scale and Affine Invariant Interest Points [9] struggle when symmetric objects have a limited amount of texture or show significant changes in intensity across repeated patterns. On the other hand, the symmetry of an object is frequently represented very well by the shape of its edge structure. In many cases, the structure of an object is its most critical and visually salient aspect, attracting human attention even more. The butterfly in Fig. 1 is a good example: the patterns on the wings hardly support the symmetry of the butterfly, yet we easily recognize its reflection symmetry structure from the overall silhouette supported by its boundaries.

Fig. 1. Proposed reflection symmetry detection method using our appearance of structure (AoS) feature

Mikolajczyk et al. [10] propose an edge-based feature for shape recognition. The feature is defined by multiple neighbouring edges and estimated in a scale-invariant manner similar to SIFT. Atadjanov and Lee [11] propose the Scale Invariant Structure Feature, which also finds a set of edges with extremum curvature responses. Unlike [10], they construct a scale space in a 1-dimensional domain describing the edge signature, claiming that isotropic filtering of a 2D image is not suitable for edge-based key points, which are anisotropically localized in the image. Zitnick [12] develops a binary image patch descriptor based on the location, orientation, and length of edges. Edge Foci Interest Points [13] describe the structure at points roughly equidistant from neighbouring edges whose orientations are perpendicular to the point. For the wide-baseline correspondence task, Meltzer and Soatto [14] construct an edge descriptor as a list of histograms of gradient orientations computed at anchor points on the edge, each describing its region at the corresponding scale. They describe the edge by gradient orientations of local regions aligned to the edge; therefore, the descriptor is not invariant to illumination change. Guan et al. [15] propose a 3-dimensional histogram descriptor for image matching, computed from the votes of line segments lying in the local region of a key point. The volume of the region is calculated from the scale at which the feature point is detected. The histogram collects weights calculated for each line segment around the key point with similar orientation and location information. The method is therefore capable of building a structure feature reflecting the local shape around the key point.

We observe that symmetry patterns can be represented and detected better by incorporating a structure description rather than depending on texture and appearance descriptors alone. The overall symmetry shape is described better by its global structure, while local appearance provides the strength of the symmetry supported on it. Structure-based methods detect more true positive symmetry patterns, but they also produce a significant number of false matches because they lack the ability to describe the local neighbourhood. In this paper, we propose a generalized Appearance of Structure (AoS) feature for reflection symmetry detection. Our appearance of structure feature finds key points with extreme curvature responses on edge line segments, extracting structure information, and describes the local appearance represented by edges and contours. We propose a 4-dimensional histogram for the description of the appearance of structure feature: it accumulates neighbouring edge points around the feature point and captures their orientation, location, and curvature, encoding both local structure and appearance. Finally, we apply our appearance of structure feature to reflection symmetry detection. We perform extensive qualitative and quantitative experimental evaluations on two public symmetry detection datasets, as well as an analytical evaluation based on human perception of the true symmetry axes.

2 Edge Extraction

Our appearance of structure feature point is localized on an edge and describes the local structure supported by the edge segments around it. Therefore, the performance of our feature point detection and description depends heavily on the edge extraction result. Extracting clean, clear, and correct edges is not a trivial task. An edge can be observed at multiple scales, but we wish to find an optimal scale at which the edge becomes most salient and is able to describe the local appearance. We adopt the anisotropic diffusion scale space construction suggested by Liu et al. [16]. The authors claim that the scale of an edge can be chosen as the scale where the gradient magnitude becomes a local maximum. We denote the gradient of intensity and its marginals as \(\nabla I\), \(I_x\), and \(I_y\), and find the maximum \(\nabla I\) of edge points over multiple scales. Instead of zero-crossing detection, we find local maxima of the transformed gradient \(\nabla \tilde{I}_t\) defined as follows [16].

$$\begin{aligned} \nabla \tilde{I}_t = \mathrm{sign}(\nabla I_t)\left( 1 - \frac{\nabla I_t}{\max (\mathrm{abs}(\nabla I_t))}\right) \end{aligned}$$
(1)

where \(\nabla I_t\) is the derivative of the spatial gradient with respect to scale. The second-moment matrix of the 3D Harris detector takes the following form.

$$\begin{aligned} M = \begin{pmatrix} I_x^2 & I_xI_y & I_x \nabla \tilde{I}_t \\ I_xI_y & I_y^2 & I_y \nabla \tilde{I}_t \\ I_x\nabla \tilde{I}_t & I_y \nabla \tilde{I}_t & \nabla \tilde{I}_t^2 \end{pmatrix} \end{aligned}$$
(2)

The eigenvalues of the matrix \(M\) measure the changes along the principal directions [17], defining the 3D Harris response \(R\).

$$\begin{aligned} R = \det (M) - k\,\mathrm{trace}^3(M) = \lambda _1 \lambda _2 \lambda _3 - k(\lambda _1+\lambda _2+\lambda _3)^3 \end{aligned}$$
(3)

R takes a negative value at an edge point, since an edge generally has a single large eigenvalue [17]. Therefore, the optimal scale \(s(x,y)\) of an edge point at location \((x,y)\) can be chosen using the following formulation [16].

$$\begin{aligned} s(x,y) = \mathop {\arg \max }\limits _t |R(x,y,t)|, \quad R(x,y,t)<0. \end{aligned}$$
(4)

To let nearby edges have similar scales, we propagate the chosen scale to its neighbours and connect the edges, as illustrated in the edge extraction step in Fig. 1. The left image shows all edges at all scales in a 3-dimensional plot, and the right image shows the edge propagation result.
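
As a concrete illustration of this scale selection, the following sketch scans a scale stack and keeps, per edge pixel, the scale maximizing \(|R|\) under \(R<0\) (Eqs. 1-4). It is a minimal sketch, not the authors' implementation: the smoothed stack `I_stack`, the finite-difference approximation of \(\nabla I_t\), and the Harris constant `k` are our assumptions.

```python
import numpy as np

def edge_scale_selection(I_stack, k=0.04):
    """Per-pixel scale selection: s(x, y) = argmax_t |R(x, y, t)| with R < 0 (Eq. 4).

    I_stack: (T, H, W) array, the image smoothed at T increasing scales.
    Returns an (H, W) map of selected scale indices (-1 where no scale qualifies).
    """
    T, H, W = I_stack.shape
    best_R = np.zeros((H, W))
    best_s = -np.ones((H, W), dtype=int)
    for t in range(T):
        Iy, Ix = np.gradient(I_stack[t])
        if t + 1 < T:  # scale derivative of the gradient magnitude, by finite difference
            Jy, Jx = np.gradient(I_stack[t + 1])
            dI_t = np.hypot(Jx, Jy) - np.hypot(Ix, Iy)
        else:
            dI_t = np.zeros((H, W))
        # transformed gradient of Eq. 1
        It = np.sign(dI_t) * (1.0 - dI_t / (np.abs(dI_t).max() + 1e-12))
        # per-pixel determinant and trace of the symmetric 3x3 matrix M of Eq. 2
        a, b, c = Ix * Ix, Iy * Iy, It * It
        d, e, f = Ix * Iy, Ix * It, Iy * It
        det = a * (b * c - f * f) - d * (d * c - f * e) + e * (d * f - b * e)
        R = det - k * (a + b + c) ** 3                   # 3D Harris response (Eq. 3)
        sel = (R < 0) & (np.abs(R) > best_R)             # Eq. 4: keep strongest negative R
        best_R[sel], best_s[sel] = np.abs(R[sel]), t
    return best_s
```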

3 AoS Feature Point Detection

For each edge point \(e_i\) (Fig. 2(a)), we define its neighborhood by the edge segment \(\{e_i', e_i''\}\) inside the circle whose radius equals the scale \(s_i\) of the edge point. We calculate the curvature \(\gamma _i\) and orientation \(\phi _i\) of each edge point as follows. The curvature \(\gamma _i\) of an edge point is calculated from the orientation change between the vectors \(\overrightarrow{e_i'e_i}\) and \(\overrightarrow{e_ie_i''}\).

$$\begin{aligned} \gamma _i = \gamma (e_i) = \arctan \frac{y_{e_i''} - y_{e_i}}{x_{e_i''} - x_{e_i}} - \arctan \frac{y_{e_i} - y_{e_i'}}{x_{e_i} - x_{e_i'}} \end{aligned}$$
(5)

where \((x_{e_i}, y_{e_i})\), \((x_{e_i'}, y_{e_i'})\), and \((x_{e_i''}, y_{e_i''})\) are the coordinates of the \(i^{th}\) edge point and of its edge segment's endpoints \(e_i'\) and \(e_i''\), respectively. \(|\gamma _i|\) approximates the curvature at edge point \(e_i\).

The orientation \(\phi _i\) of an edge point is the orientation of the vector \(\overrightarrow{z_ie_i}\) starting at the center point \(z_i\) of the line segment \(\overline{e_i'e_i''}\) and going through the edge point \(e_i\).

$$\begin{aligned} \phi _i = \phi (e_i) = \arctan \frac{y_{e_i''} - y_{e_i'}}{x_{e_i''} - x_{e_i'}} - \frac{\gamma _i}{|\gamma _i|} \cdot \frac{\pi }{2} \end{aligned}$$
(6)
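
A minimal sketch of Eqs. (5) and (6), using `arctan2` to keep the angle arithmetic well defined for vertical segments; the sample coordinates are hypothetical.

```python
import numpy as np

def curvature_and_orientation(e_prev, e, e_next):
    """Curvature gamma_i (Eq. 5) and orientation phi_i (Eq. 6) of edge point e,
    given the endpoints e_prev (e_i') and e_next (e_i'') of its edge segment."""
    (x0, y0), (x1, y1), (x2, y2) = e_prev, e, e_next
    # orientation change between the vectors e_i'->e_i and e_i->e_i'' (Eq. 5)
    gamma = np.arctan2(y2 - y1, x2 - x1) - np.arctan2(y1 - y0, x1 - x0)
    gamma = (gamma + np.pi) % (2 * np.pi) - np.pi   # wrap so the sign is the bend direction
    # chord orientation of e_i'->e_i'', rotated by 90 degrees toward e_i (Eq. 6)
    phi = np.arctan2(y2 - y0, x2 - x0) - np.sign(gamma) * np.pi / 2
    return gamma, phi

# hypothetical right-angle bend: the vector z_i -> e_i points at -45 degrees
g, p = curvature_and_orientation((0, 0), (1, 0), (1, 1))
print(np.degrees(g), np.degrees(p))   # 90.0 -45.0
```
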
Fig. 2. (a) The orientation and curvature of an edge point based on its neighborhood edge segment. (b) Calculation of the 4D histogram bins for each key point. \(\rho _i\) - geometrical distance between the locations of key point k and edge point \(e_i\). \(\alpha _i\) - angle between the orientation of key point k and the vector starting at the key point and going through edge point \(e_i\) \((\overline{k,e_i})\). \(\beta _i\) - angle between the orientations of key point k and edge point \(e_i\). \(\gamma _i\) - curvature of edge point \(e_i\).

Now, we select as key points the edge points whose curvature is a local maximum above a threshold value. These outstanding points are robustly detected under geometrical distortions; however, edges with a circular shape show similar curvature values over a series of consecutive points, yielding multiple nearby key points in a small local region. Considering both the key point density and the number of key points needed to conduct the symmetry detection task with maximum performance, we keep all such key points for our symmetry detection. Let \(e_{i}^{p}\), \(e_i\), and \(e_{i}^{n}\) be three consecutive points on an edge segment. We choose \(e_i\) as a key point if it satisfies the following two conditions.

$$\begin{aligned} Cond(e_i) = \{\, |\gamma _i| + \epsilon > |\gamma _{i}^{p}| \,,\; |\gamma _i| + \epsilon > |\gamma _{i}^{n}| \,\} \end{aligned}$$
(7)

where \(\epsilon \) is the error tolerance for key point selection. This allows us to select an edge point whose curvature is slightly lower than that of its two neighbours. In other words, we choose points whose curvature is a local maximum up to the precision \(\epsilon \), making the selection robust to errors in the curvature calculation caused by discrete pixel locations in an image. Note that the orientation calculated at this step belongs to each selected edge point. In the following section, we develop our appearance of structure key point description based on these edge points and their orientations.
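
The selection rule of Eq. (7) can be sketched as below; the tolerance and curvature threshold values are illustrative, not the paper's settings.

```python
import numpy as np

def select_key_points(gammas, eps=0.05, thresh=0.1):
    """Indices of edge points whose |curvature| is a local maximum up to the
    tolerance eps over both neighbours on the edge segment (Eq. 7)."""
    g = np.abs(np.asarray(gammas, float))
    keys = []
    for i in range(1, len(g) - 1):
        if g[i] > thresh and g[i] + eps > g[i - 1] and g[i] + eps > g[i + 1]:
            keys.append(i)
    return keys

# a circular arc has near-equal curvatures, so all of its points are kept
print(select_key_points([0.1, 0.5, 0.49, 0.5, 0.1]))   # [1, 2, 3]
```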

4 AoS Feature Point Description

The local regional neighborhood of a key point is proportional to its scale. Therefore, we describe the key point based on the shape structure within the corresponding scale; in other words, the edge points detected in the previous step within the circular boundary describe the key point. The radius of the circle is proportional to the scale of the key point. For this purpose, we collect the edge points within the circular boundary whose scale is not less than the scale of the key point. We obtain the orientation \(\phi (k)\) of key point k using a method similar to that of [7]; in our case, we collect the orientations of edge points rather than of all neighbourhood pixels as in SIFT.

Our key point descriptor is built from the responses of a 4-dimensional histogram of weighted votes from all edge points inside the key point's neighborhood region, encoding the appearance of edges and contours. For each edge point, 4 parameter values (\(\varvec{\rho }\), \(\varvec{\measuredangle \alpha }\), \(\varvec{\measuredangle \beta }\), \(\varvec{\gamma }\)) are defined as shown in Fig. 2(b). \(\varvec{\rho }\) is the distance between key point k and edge point \(e_i\). \(\varvec{\measuredangle \alpha }\) is the angle between the key point orientation \(\phi \) and the line \(\overline{k,e_i}\) connecting the key point k with edge point \(e_i\). \(\varvec{\measuredangle \beta }\) is the angle between the key point orientation \(\phi \) and the edge point orientation \(\phi _i\). \(\varvec{\gamma }\) is the angle representing the curvature of the sampled edge point \(e_i\). Collectively, these 4 dimensions describe the structure, such as the relative location and shape of each edge point inside the regional neighborhood. The four parameter values from an edge point make one vote in the 4-dimensional histogram. To make our descriptor invariant to scale and rotation in the voting step, the distance \(\varvec{\rho }\) is normalized by the scale radius r of the key point, and the two angle values (\(\varvec{\measuredangle \alpha }\) and \(\varvec{\measuredangle \beta }\)) are measured relative to the orientation \(\phi \) of the key point. The two angle dimensions have equal bin step sizes, assigning no priority within their range. However, for the distance \(\varvec{\rho }\), a closer edge point is more important than a farther one, so the bin step sizes are assigned in log scale.
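
A minimal sketch of the parameter extraction and the unweighted voting, assuming angles in radians; the vote weights of Sect. 5 are omitted here, and the log-spaced \(\rho \) bin edges are our assumption, not the paper's exact values.

```python
import numpy as np

def aos_histogram(k, phi_k, r, edge_pts, edge_phis, edge_gammas,
                  bins=(8, 11, 11, 8)):
    """Unweighted 4D AoS histogram over (rho, alpha, beta, gamma) per Fig. 2(b)."""
    d = np.asarray(edge_pts, float) - np.asarray(k, float)
    rho = np.hypot(d[:, 0], d[:, 1]) / r                # scale-normalized distance
    wrap = lambda a: (a + np.pi) % (2 * np.pi) - np.pi  # wrap angles to (-pi, pi]
    alpha = wrap(np.arctan2(d[:, 1], d[:, 0]) - phi_k)  # relative to phi_k: rotation invariant
    beta = wrap(np.asarray(edge_phis) - phi_k)
    gamma = np.abs(np.asarray(edge_gammas))
    votes = np.stack([rho, alpha, beta, gamma], axis=1)
    edges = [np.logspace(-2, 0, bins[0] + 1),           # log-scale rho bins (closer = finer)
             np.linspace(-np.pi, np.pi, bins[1] + 1),   # signed, uniform angle bins
             np.linspace(-np.pi, np.pi, bins[2] + 1),
             np.linspace(0, np.pi, bins[3] + 1)]
    H, _ = np.histogramdd(votes, bins=edges)
    return H
```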

5 Reflection Symmetry Detection

In our symmetry detection, we divide \(\rho \) and \(\gamma \) into 8 segments (8 bins in the histogram description), taking only positive values. The two angles \(\alpha \) and \(\beta \) are divided into 11 segments (11 bins) with signed values (for each, the 6th bin corresponds to an angle of 0). Considering the significance of each dimension and its bins in the symmetry detection task, we assign different weight distributions to the bins of the 4-dimensional histogram description. First, \(\alpha \) gets uniform weight over its bins because we have no preference in this angle. \(\rho \) gets a zero-mean normally distributed weight to give lower weight to farther edge points, assuming that a closer edge point describes the key point, located at the origin of the 4-dimensional descriptor, better. \(\gamma \) gets a normally distributed weight with its mean at the maximum curvature value, giving higher weight to edge points with higher curvature. \(\beta \) is the angle difference between the key point and edge point orientations; it gets a zero-mean normally distributed weight, giving higher weight to smaller angle differences. Finally, we obtain the following histogram weight function for an edge point.

$$\begin{aligned} w(e_i) =\frac{s_i}{S} \cdot \exp (-\frac{b_{\rho }^2(\rho _i)}{2 \sigma _{\rho }^2}) \exp (-\frac{b_{\beta }^2(\beta _i)}{2 \sigma _{\beta }^2}) \exp (-\frac{(b_{\gamma }(\gamma _i)-b_{\gamma }(\gamma _{max}))^2}{2 \sigma _{\gamma }^2}) \end{aligned}$$
(8)

where \(e_i\) is the \(i^{th}\) edge point inside the local regional neighborhood of the key point; \(b_{\rho }\), \(b_{\beta }\), and \(b_{\gamma }\) are the binning functions of each dimension for edge point \(e_i\); S is a scaling factor based on all \(\sigma \) values and the radius of the local regional neighborhood of the key point; \(s_i\) is the scale of the \(i^{th}\) edge point; and \(\rho _i\), \(\beta _i\), and \(\gamma _i\) are the three parameter values. \(\sigma _{\rho }\), \(\sigma _{\beta }\), and \(\sigma _{\gamma }\) can be set proportional to the maximum allowed range of each dimension.
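
A sketch of Eq. (8), assuming the binning functions return bin coordinates; the default \(\sigma \) values shown are placeholders proportional to each dimension's bin range, as the text suggests, not the paper's settings.

```python
import numpy as np

def vote_weight(s_i, S, b_rho, b_beta, b_gamma, b_gamma_max,
                sigma_rho=4.0, sigma_beta=5.5, sigma_gamma=4.0):
    """Histogram vote weight of one edge point (Eq. 8).

    b_rho, b_beta: bin coordinates of rho_i and beta_i (zero-mean Gaussians);
    b_gamma, b_gamma_max: bin coordinates of gamma_i and of the maximum curvature.
    """
    w = s_i / S
    w *= np.exp(-b_rho ** 2 / (2 * sigma_rho ** 2))     # favour close edge points
    w *= np.exp(-b_beta ** 2 / (2 * sigma_beta ** 2))   # favour aligned orientations
    w *= np.exp(-(b_gamma - b_gamma_max) ** 2 / (2 * sigma_gamma ** 2))  # favour high curvature
    return w
```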

Fig. 3. Key point mirroring example. (a) Original key point. (b) Mirrored key point

Fig. 4. Histogram description mirroring example with two 3-dimensional parameter spaces: three edge points (D1, D2, D3) are accumulated in the histogram. The mirrored histogram description is created by flipping the \(\beta \) and \(\alpha \) values.

Based on the extracted structure key points and their 4-dimensional descriptions, we perform reflection symmetry detection in an image. A reflection symmetry pattern can be found by matching a key point descriptor against mirrored key points. To find mirrored matches, we create a set of all flipped key points. Figure 3(b) shows the mirrored version of the original key point in Fig. 3(a). For each mirrored key point, we have to update its description to reflect the flipped structure of the pattern. \(\varvec{\rho }\) does not change in the mirroring step. The two angles (\(\varvec{\measuredangle \alpha }\) and \(\varvec{\measuredangle \beta }\)) get values of opposite sign. For \(\varvec{\gamma }\) we use its absolute value in the description, so it does not change either. Figure 4 gives an example of a mirrored histogram description in the two 3-dimensional parameter spaces (\(\alpha \), \(\beta \), \(\rho \)) and (\(\alpha \), \(\beta \), \(\gamma \)). In this example, we assume that the key point collects three edge points (D1, D2, D3) accumulated in the histogram. Note that the mirrored histogram description is obtained by flipping the \(\beta \) and \(\alpha \) values.
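
Since the signed angle bins are laid out symmetrically around zero (the 6th of 11 bins holds angle 0), mirroring a descriptor amounts to reversing the two angle axes of the histogram. A one-line sketch under that layout assumption:

```python
def mirror_descriptor(H):
    """Mirror a 4D AoS histogram laid out as (rho, alpha, beta, gamma).

    rho and |gamma| are unchanged by reflection; reversing the symmetric,
    signed alpha and beta bin axes negates both angles (bin i -> bin 10 - i,
    leaving the central zero-angle bin 5 fixed).
    """
    return H[:, ::-1, ::-1, :]
```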

Based on the two key point groups (original and mirrored), we detect the top K best matches of each original key point among the mirrored key points. Each match votes for one potential symmetry axis, and we accumulate the votes per axis to find the strongest symmetry axes. Each match (between the \(i^{th}\) original key point and the \(j^{th}\) mirrored key point) is weighted by the following function consisting of four constraints.

$$\begin{aligned} W_{ij} = {\left\{ \begin{array}{ll} F_{ij} \varPhi _{ij}S_{ij}D_{ij}, & \quad \text{ if } \varPhi _{ij} > 0 \\ 0, & \quad \text{ otherwise } \end{array}\right. } \end{aligned}$$
(9)

where \(F_{ij} = 1-fd_{ij}\) is the similarity measure between the matched descriptors and \(fd_{ij}\) is the sum of absolute differences over all descriptor elements. \(\varPhi _{ij}\) is the phase weighting function used in [3].

$$\begin{aligned} \varPhi _{ij} = 1 - \cos ( \alpha _i + \alpha _j - 2\theta _{ij}) \end{aligned}$$
(10)

where \(\alpha _i\) and \(\alpha _j\) are the angles between the horizontal line and the orientations of the \(i^{th}\) and \(j^{th}\) key points, respectively, and \(\theta _{ij}\) is the angle between the horizontal line and the line connecting the \(i^{th}\) and \(j^{th}\) key points. \(S_{ij}\) is the scale constraint.

$$\begin{aligned} S_{ij} = \exp \left( \frac{-|s_i - s_j|}{\sigma (s_i + s_j)}\right) ^2 \end{aligned}$$
(11)

where \(s_i\) and \(s_j\) are the scales of the \(i^{th}\) and \(j^{th}\) key points. Finally, \(D_{ij}\) is the distance constraint.

$$\begin{aligned} D_{ij} = \exp (\frac{-d^2_{ij}}{2 \sigma ^2_d}) \end{aligned}$$
(12)

where \(d_{ij}\) is the geometric distance between the \(i^{th}\) and \(j^{th}\) key points. \(F_{ij}\), \(S_{ij}\), and \(D_{ij}\) are adopted from [6]. To accumulate the votes and detect the final reflection axes, we transform all potential symmetry axes into Hough space together with the calculated symmetry weights of the matches supporting each axis. In the Hough space, the point (axis) with the maximum accumulated weight is chosen as a candidate reflection symmetry axis.
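
The vote accumulation can be sketched as follows: each match contributes \(W_{ij}\) (Eqs. 9-12) to the \((\theta, r)\) Hough cell of the axis through the pair's midpoint, perpendicular to the line joining the pair. The grid resolution and the \(\sigma \) values below are our assumptions, not the paper's settings.

```python
import numpy as np

def accumulate_axis_votes(matches, n_theta=180, n_r=200, r_max=500.0,
                          sigma_s=0.5, sigma_d=100.0):
    """Weighted Hough voting over candidate reflection axes.

    matches: iterable of dicts with keys p_i, p_j (xy points), a_i, a_j
    (orientations, radians), s_i, s_j (scales), fd (descriptor distance in [0, 1]).
    """
    H = np.zeros((n_theta, n_r))
    for m in matches:
        p_i, p_j = np.asarray(m["p_i"], float), np.asarray(m["p_j"], float)
        dx, dy = p_j - p_i
        theta_ij = np.arctan2(dy, dx)                    # direction of the pair line
        phi = 1.0 - np.cos(m["a_i"] + m["a_j"] - 2 * theta_ij)          # Eq. 10
        if phi <= 0:
            continue                                     # Eq. 9: rejected match
        F = 1.0 - m["fd"]                                # descriptor similarity
        S = np.exp(-abs(m["s_i"] - m["s_j"]) / (sigma_s * (m["s_i"] + m["s_j"]))) ** 2  # Eq. 11
        D = np.exp(-(dx * dx + dy * dy) / (2 * sigma_d ** 2))           # Eq. 12
        # the axis passes through the midpoint; its normal lies along the pair line
        mid = (p_i + p_j) / 2.0
        theta = theta_ij % np.pi
        r = mid[0] * np.cos(theta) + mid[1] * np.sin(theta)
        ti = min(int(theta / np.pi * n_theta), n_theta - 1)
        ri = int((r + r_max) / (2 * r_max) * n_r)
        if 0 <= ri < n_r:
            H[ti, ri] += F * phi * S * D                 # Eq. 9 weight
    return H   # the argmax over H gives the strongest candidate axis
```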

6 Experimental Results

We evaluate our reflection symmetry detection method in three ways: (1) quantitative evaluation on the two public symmetry detection datasets distributed in the symmetry detection competitions at the CVPR 2011 [18] and CVPR 2013 [19] workshops, (2) qualitative comparison to the most recent results in [11], and (3) analytic evaluation based on human perception. In our experiments, we use 8, 11, 11, and 8 bins for \(\rho \), \(\alpha \), \(\beta \), and \(\gamma \), respectively. The radius of the neighbourhood at each key point is set to 5 times the key point scale, selected empirically. In symmetry pattern matching, we choose the top 10 best matches for each key point. We limit the number of detected axes to at most 5 for single symmetry detection and 10 for multiple symmetry detection. For the first and second evaluations, we count the numbers of true and false positives of each method following the decision rule suggested in [19]: a detection is a true positive if the angle deviation of the detected axis is smaller than 10 degrees and the center point of the axis is located within one fifth of the length of the ground truth axis. In our third analytic evaluation, the symmetry detection rate is re-evaluated by human evaluators.
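
A minimal sketch of that decision rule as we read it from [19]; how a detected axis is matched to a ground truth axis beforehand is left out.

```python
import numpy as np

def is_true_positive(det_center, det_angle, gt_center, gt_angle, gt_length):
    """Decision rule of [19]: angle deviation below 10 degrees, and the detected
    axis center within one fifth of the ground-truth axis length of the GT center.

    Centers are (x, y); angles in degrees; axes are undirected lines.
    """
    d_angle = abs(det_angle - gt_angle) % 180.0
    d_angle = min(d_angle, 180.0 - d_angle)          # undirected axes
    d_center = np.hypot(det_center[0] - gt_center[0], det_center[1] - gt_center[1])
    return d_angle < 10.0 and d_center < gt_length / 5.0
```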

Fig. 5. Experimental results on the CVPR'11 workshop dataset [18]. The proposed method is compared to Loy and Eklundh [6], the best performing prior method reported on the dataset. The second column shows our dominant appearance of structure features.

6.1 Quantitative Evaluation on Two Public Datasets

Figure 5 shows reflection symmetry detection results on the CVPR 2011 workshop dataset [18], which contains various real and synthetic images with single and multiple symmetry axes. Loy and Eklundh [6] is the best performing method reported on the dataset, so we compare our method with it quantitatively and qualitatively. The dataset contains 258 images in 4 categories: (1) 79 real images with single symmetry, (2) 85 real images with multiple symmetries, (3) 55 synthetic images with single symmetry, and (4) 39 synthetic images with multiple symmetries. A1 in Fig. 5 contains one global symmetry and two local symmetries. The proposed method successfully detects all three symmetry axes, while [6] detects only the global symmetry axis. A1 has almost no texture but very clear boundaries, which helps the proposed method, using the appearance of structure feature, extract enough descriptors even for small local symmetry objects. A2 has two clear symmetric faces supported mostly by face contours, which are correctly detected by the proposed method. However, [6] fails to collect a sufficient number of supporting appearance features, and other cluttered axes are detected as stronger symmetries than those faces. A4 is a very interesting example in which several of our detections look better than the given ground truth: the diagonal symmetry axes in the ground truth do not reflect the concentric ellipses, while our detected axes find those unexpected but correct symmetry axes. In fact, this observation (correct detections that are not listed in the given ground truth, or mistakenly presented there) motivated our further performance evaluation and analysis based on human perception, presented in the next subsection. A5 has almost no texture to support the symmetric object except for its contour, which is why only our method detects the symmetry axis. A6 has symmetry axes on skewed patterns, where our method detects more correct axes than [6]. However, the skewed shape of the objects in A6 makes it difficult for our method to find complete axes: as our feature groups edge segments over a local regional area, skewed local contour pairs become weak supports for the symmetry axes. On less textured synthetic images such as A7, A8, and A9, [6] frequently fails, detecting only a few feature points. In particular, A7 has fewer than 10 SIFT features and [6] gives no result image. Figure 6 shows the corresponding precision and recall results calculated with all default parameters and settings. Except for the multiple-symmetry synthetic category, the proposed method performs better in both precision and recall.

Fig. 6. Precision and recall rates of the proposed method and Loy and Eklundh [6] on the CVPR'11 workshop dataset [18]

The CVPR 2013 workshop dataset [19] contains 121 real world images in single (75 images) and multiple (46 images) symmetry subgroups. Figure 7 shows selected detection results compared to [6], and Fig. 8 shows precision-recall curves compared to four previous methods [6, 20–22] that appeared in the competition. In B1 in Fig. 7, the proposed method detects a longer and more complete symmetry axis because contours and edges support the symmetry of the object more than local appearances do; the handles of the bag in B1, correctly detected only by our method, clearly show this. B2 is a nature image with many random edge segments on the tree. In such nature images it is very difficult to extract clean symmetry edges, and the proposed method detects only a partial symmetry axis of the tree. In B3, B4, and B5, the proposed method detects more complete symmetry axes supported by the contours of the objects, while [6] fails to find complete axes due to the lack of a sufficient number of supporting SIFT feature points. B6 and B8 are nature images with occlusions and background clutter. Background clutter always causes problems in grouping truly supporting feature point pairs; as a result, our method detects more false symmetry axes on these images than on others. Clutter usually consists of short edges with random scale factors, so it is very improbable that they build a long connected edge in our scale propagation step; we can exclude such edges, whose length is smaller than their scale, by putting a threshold on the connected edge length. B7 and B9 are handled well by both methods, as they have clean and sufficient texture and structure features. B10 has two symmetric objects, and the proposed method detects one of them correctly.

Fig. 7. Experimental results on the CVPR'13 workshop dataset [19]. The second column shows our dominant feature points.

Figure 8 illustrates precision-recall curves for the single and multiple symmetry subsets of the CVPR 2013 workshop dataset [19], compared to four prior methods (Loy and Eklundh [6], Michaelsen et al. [20], Patraucean et al. [21], and Kondra et al. [22]). For both single and multiple symmetry datasets, the proposed method outperforms all previous methods over almost the entire curves. Especially on the single symmetry subset, the proposed method shows very high recall values, indicating that it gives very few false negative detections; in other words, we detect most of the expected ground truth symmetry axes.

Fig. 8. Precision-recall curves on the CVPR'13 workshop dataset [19] compared to four prior methods (Loy and Eklundh [6], Michaelsen et al. [20], Patraucean et al. [21], and Kondra et al. [22]). \(Precision=\frac{true\;positives}{true\;positives + false\;positives}\), \(Recall=\frac{true\;positives}{true\;positives + false\;negatives}\)

Fig. 9. Comparison of the structure-only [11] and appearance-only [6] reflection symmetry detection methods with our appearance of structure (AoS) reflection symmetry detection

We also compare the proposed method with the most recent reflection symmetry detection results reported in [11], which uses a structure-only feature based on edge segments (Fig. 9). In C1, [11] detects the most complete axis; however, the proposed method finds a slightly better axis location at the detected angle, based on the support of local appearance. In C5, C7, and C8, the proposed method finds more complete axes (C5, C7) or a larger number of correct axes (C7, C8).

Fig. 10. Failure cases with natural background clutter or small foreground objects

Figure 10 illustrates sample images where our method fails. These are mostly nature images with background clutter or small foreground objects. F1 and F4 have both a cluttered background and a small symmetric object; in F1, [6] successfully detects the symmetry axis. F2 and F5 have symmetric objects too small or unclear to yield enough features to support the true symmetry axes. Note that the detected axis in F2 is not listed in the ground truth; however, it is a meaningful axis given the similar shapes of the two airplanes.

6.2 Evaluation Based on Human Perception

As we have already observed in several experimental results (Fig. 5 A4, Fig. 10 F2), complete labeling of all true reflection symmetry axes is hardly possible. Figure 11 shows two such examples where potential true symmetry axes are detected by the proposed method but are not listed in the ground truth. The ground truth of the two public datasets [18, 19] does not contain all potential symmetry axes, and one can easily find missing but outstanding reflection symmetry axes. Our observation is that quantitative evaluation against a predetermined ground truth can be unfair to every method.

Fig. 11. Examples of potential symmetry axes that are missed in the ground truth but detected by the proposed method

Therefore, we perform a new evaluation based on human perception on the single symmetry subset of [19]. First, we run the reflection symmetry detection methods on the dataset, finding the top 10 potential symmetry axes for each image. Then we let 20 human evaluators decide whether each detected axis is a true symmetry axis, without prior knowledge of the given ground truth. If more than half of the evaluators vote for an axis, we conclude that humans see it as a true symmetry axis. If both the ground truth and the human evaluation say an axis is a true symmetry axis, we count it as a real true positive. If the human evaluation says an axis is a true symmetry axis but it is not listed in the given ground truth, we count it as neither a true positive nor a false positive. If multiple detected axes are judged to be true symmetry axes for a single corresponding ground truth axis, we count them only once as a true positive. In Fig. 12, we show the new precision and recall curves for [6] and the proposed method based on the human perception evaluation. Compared to the original curves in Fig. 8, the human-perception-based evaluation of the proposed method (dashed line in Fig. 12) shows an almost identical curve. As already mentioned, our method has very few false positives, resulting in limited improvement under this evaluation. On the other hand, the curve for [6] (solid line in Fig. 12) shifts to the right after the human perception based evaluation, indicating that this method has false positive detections that human evaluators count as true symmetry axes. We believe this human perception based evaluation is more meaningful if we expect to use detected symmetry patterns as visually salient features for object characterization.
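
The counting rules above can be summarized in a short sketch. It is our reading of the protocol: how a detection is matched to a ground truth axis, and how a human-rejected but ground-truth-listed detection is handled, are assumptions.

```python
def count_detections(axes, gt_match, human_votes, n_evaluators=20):
    """Per-image counting under the human perception evaluation.

    axes        : detected axis ids (top 10 per image)
    gt_match    : axis id -> matched ground-truth axis id, or None
    human_votes : axis id -> number of evaluators calling it a true axis
    Human-approved detections absent from the ground truth are excluded
    from both counts; at most one true positive per ground-truth axis.
    """
    seen_gt, tp, fp = set(), 0, 0
    for a in axes:
        human_true = human_votes.get(a, 0) > n_evaluators // 2
        gt = gt_match.get(a)
        if human_true and gt is not None:
            if gt not in seen_gt:       # count one TP per ground-truth axis
                seen_gt.add(gt)
                tp += 1
        elif human_true:                # true for humans, missing from ground truth
            continue                    # counted as neither TP nor FP
        else:                           # assumed: rejected by humans -> false positive
            fp += 1
    return tp, fp
```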

Fig. 12. New precision and recall curves based on the human perception evaluation, overlaid on the original curves from Fig. 8

Alternatively, we could count all symmetry axes approved by human evaluators as true positives, ignoring the given ground truth entirely. In this case, however, counting the number of true positive axes becomes problematic, because multiple detected axes can arise from one real symmetry axis due to noise or other challenging conditions. This would require another human perception based decision on whether two nearby detected axes actually come from a single true symmetry axis.

7 Conclusion

In this paper, we propose a new appearance of structure (AoS) feature based reflection symmetry detection method. The proposed method finds robust and outstanding edge feature points and builds an appearance of structure descriptor capturing the local appearance of edges and contours. Extensive evaluation on two public datasets shows promising results. We have observed that synthetic images and man-made objects collect stronger support for reflection symmetry due to their clean and precise shapes, while nature images with cluttered backgrounds contain many random edge segments that distract from capturing the underlying reflection symmetry pattern. We have also seen that small symmetric objects can be found well with our method.