
1 Introduction

The identification of human individuals by their faces, fingerprints, or irises is a problem that has interested researchers for many years. The identification of wild animals in a non-intrusive fashion, however, is of rather recent interest, and it is also a more challenging problem, since the images are more deformed by position, scale, rotation, and skin folding; after all, one cannot ask a wild animal to stand still for a picture.

In a previous work [4] we managed to automatically detect and segment the ocelot, identifying the region of interest in the image. Those images were taken by trap-cameras, triggered by infrared sensors when an animal crossed in front of them. In this work we propose a method to characterize the spots of the ocelot in a way that is convenient for the next step: the identification of the individual ocelot in the image.

2 Previous Work

This section reviews some wildlife recognition systems. These works either use intrusive techniques (i.e. marks, chips, etc.), require the images to meet certain conditions, or are not fully automatic, since they need an operator to process each image manually.

Jurgen den Hartog et al. [5] implemented a software package known as \(I^{3}S\) (Interactive Individual Identification System) for pattern recognition over the bodies of animals. However, it is a supervised technique that needs the work of a human operator, which makes it of little use when thousands of pictures have to be processed. The supervisor must mark where the animal is inside the image, and \(I^{3}S\) builds a model for each image; this model is then searched for sequentially in the database. The main disadvantage is that all images must satisfy certain conditions (they must be taken at 30 degrees), which means discarding most of the photos, or the photos have to be taken intrusively, which in many cases is not a valid solution.

Charre-Mendel et al. [1] showed, with 6 photographs captured by trap-cameras, the presence of jaguars (Panthera onca) in Michoacan, Mexico in 2010. The recognition was performed manually by trained employees. All these records belong to a semi-tropical forest located between the south of the Sierra Madre and the Pacific Coast; they also have around 300 records of ocelots that have been identified manually.

In [7, 11] the authors presented photo-identification systems for wildlife based on natural marks. In particular, in [7] the authors worked with cheetahs from the Serengeti National Park of Tanzania. However, these cheetahs were recognized with a semi-automatic approach: the authors manually extract several areas (the shoulder blade, hip joint, abdomen line, and vertebral column) and then a sample of the fur pattern. The drawback is, of course, the need for a human operator to process each image.

A pattern recognition system needs to solve three problems at least:

  1. Identify the region of interest; this problem is known as segmentation;

  2. Extract features from the pattern;

  3. Decide to which class the pattern belongs; this problem is known as classification.

Segmentation: In [4] the authors proposed a new method to segment the animal from the image. The method consists of computing the difference of two images taken by the same trap-camera from the same position (trap-cameras are mounted on non-moving objects such as trees or stones).
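As an illustration of this idea (a minimal sketch under our reading of [4], not the exact method; the function name and threshold are ours), a difference-based segmentation could look as follows, assuming a reference shot of the empty scene is available:

```python
import cv2

def segment_by_difference(frame, background, thresh=30):
    """Difference-based segmentation sketch: subtract a reference shot of the
    empty scene from the triggered shot and threshold the absolute difference
    to isolate the animal. The threshold value is purely illustrative."""
    g1 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(background, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(g1, g2)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask
```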

Feature extraction: There are several ways to model closed shapes; a convenient one is to represent them as ellipses. In [8] the authors used elliptical properties of the Fourier coefficients for characterization purposes. In [10] the authors identified diatoms and extracted features of leaflets using geometric descriptors to approximate the shapes as ellipses. In [3] an algorithm called the tangent rope method detects ellipses in an image.

A photo-recognition system for wildlife using non-intrusive techniques is still an open problem. Images taken by trap-cameras are a big challenge because the animal can appear in a different position or showing a different side (the distribution of patterns on an animal is not symmetric).

This paper is organized as follows: in Sect. 3 our proposal for feature extraction is presented, in Sect. 4 results obtained with the proposed method are shown, in Sect. 5 these results are discussed, and finally in Sect. 6 we present some conclusions and future work.

3 Our Proposal

Our goal is to extract features from the images to model the spots in the animal's fur, in particular in pictures captured with trap-cameras. First, an edge detection algorithm is used to emphasize edges; this makes it easier to identify the main feature we use to identify the animal, namely the closed shapes or spots in the fur. We then replace these shapes with ellipses.

3.1 Filters

Digital filters are normally used to enhance contrast or reduce noise in pictures; this is a standard preprocessing phase in image recognition systems. We used two filters, Sobel and Canny (details are not given because both filters are widely known in the literature).
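For reference, a minimal OpenCV sketch of the two filters follows. Note that the paper uses MATLAB's implementation with normalized thresholds of 0.07–0.08, while OpenCV's Canny expects hysteresis thresholds on the 0–255 intensity scale, so the values below are only illustrative:

```python
import cv2

# Canny and Sobel edge detection (illustrative OpenCV settings, not the
# MATLAB configuration used in the paper).
gray = cv2.imread("ocelot.png", cv2.IMREAD_GRAYSCALE)

canny = cv2.Canny(gray, 50, 120)                    # hysteresis thresholds

sx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)     # horizontal gradient
sy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)     # vertical gradient
sobel = cv2.convertScaleAbs(cv2.magnitude(sx, sy))  # gradient magnitude
```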

3.2 Feature Extraction Using Ellipses

Firstly, we present two methods to obtain a set of ellipses from the animal's spots: the first is a geometric construction and the second uses image moments. An ellipse is the trajectory of a point moving around two foci such that the sum of its distances to the two foci is constant for every location of the point on the curve [2]. Fig. 1(a) shows the two fixed points (F and \(F'\)) called the foci of the ellipse. The straight line l through the foci is called the focal axis. The ellipse intersects the focal axis at two points V and \(V'\) called the vertices. The portion of the focal axis between the vertices, the segment \(VV'\), is called the major axis, the longest diameter of the ellipse. The point C of the major axis, the midpoint of the line segment joining the two foci, is called the center. The line \(l'\) passes through C, is perpendicular to the focal axis l, and is called the normal axis. The normal axis cuts the ellipse at two points A and \(A'\); the segment \(AA'\) is called the minor axis, the shortest diameter. A segment such as \(BB'\), joining any two different points of the ellipse, is called a chord. In particular, a chord passing through a focus, such as \(EE'\), is called a focal chord. Two diameters are conjugate if and only if each bisects all the chords parallel to the other.

In order to find the center of a given ellipse, we draw two parallel chords \(EE'\) and \(BB'\); by definition, the line passing through the midpoints of \(EE'\) and \(BB'\) is a diameter of the ellipse and therefore passes through the center C.

Fig. 1. Ellipse properties and construction.

Geometric Construction. It is possible to construct an ellipse knowing a pair of conjugate diameters in position and magnitude. Rytz's construction determines both the major axis and the minor axis of the ellipse in location and magnitude; Fig. 1(b) shows an example. We start from two conjugate diameters s and t with center M. Segment BM is rotated \(90^{\circ }\), giving segment \(B'M\). We then draw the line u (\(L_1\) in the figure) through A and \(B'\), and the Thales circle \(C_{1}\) centered at T, the midpoint of segment \(AB'\), passing through the center M. The circle \(C_1\) cuts u at two points X and Y (\(p_1, p_2\)); the directions of the axes are given by YM and XM, and their magnitudes by the distances from the intersection points to A and \(B'\).
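To make the construction concrete, the following is a minimal NumPy sketch of Rytz's construction under our reading of the steps above; the function name `rytz_axes` is ours, and the inputs are the center M and the endpoints of two conjugate half-diameters:

```python
import numpy as np

def rytz_axes(m, p, q):
    """Rytz's construction (sketch): given the center m and the endpoints p, q
    of two conjugate half-diameters of an ellipse, return (length, direction)
    pairs for the two semi-axes. Degenerates if the ellipse is a circle."""
    m, p, q = (np.asarray(v, dtype=float) for v in (m, p, q))
    vq = q - m
    q_rot = m + np.array([-vq[1], vq[0]])  # rotate Q about M by 90 degrees
    t = (p + q_rot) / 2.0                  # midpoint T of segment P Q'
    r = np.linalg.norm(t - m)              # Thales circle centered at T through M
    u = q_rot - p
    u /= np.linalg.norm(u)                 # unit vector along the line P Q'
    x, y = t + r * u, t - r * u            # intersections X, Y with the circle
    d_x = (x - m) / np.linalg.norm(x - m)  # axis direction M X
    d_y = (y - m) / np.linalg.norm(y - m)  # axis direction M Y
    # Cross pairing: the semi-axis of length |P X| lies along M Y, and the
    # semi-axis of length |P Y| lies along M X.
    return (np.linalg.norm(p - x), d_y), (np.linalg.norm(p - y), d_x)
```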

Using Image Moments. Another way to describe shapes is based on the use of statistical properties called moments. The first three moments of a discrete one-dimensional function f(x) are related to its mean (\(\mu \)), its variance (\(\sigma ^2\)), and its skewness, which is a measure of the asymmetry of the function and is defined as:

$$\begin{aligned} skew = \frac{\sum _{x=1}^{N}(x-\mu )^3 f(x)}{\sum _{x=1}^{N} f(x)} \end{aligned}$$
(1)

Since we are interested in computing descriptors that are invariant under translation, what we really need to compute are the central moments of an image. The general two-dimensional (\(p+q\))th order moments of a grey-level image f(x, y) are defined as [6]:

$$\begin{aligned} m_{pq}= \int _{-\infty }^{\infty }\int _{-\infty }^{\infty } x^{p}y^{q}f(x,y)dxdy \end{aligned}$$
(2)
$$\begin{aligned} p,q =0,1,2\dots \end{aligned}$$
(3)

\(m_{00}\) is the moment of degree zero and represents the area of a grey-level image. The centroid (or center of mass) can be determined by combining \(m_{00}\) with the image moments of the first degree (\(m_{01}\) and \(m_{10}\)). Using the moments of second degree (\(m_{11}\), \(m_{02}\) and \(m_{20}\)), the orientation of the object in the image can be estimated.

$$\begin{aligned} m_{00}= & {} \sum _{x=1}^{N}\sum _{y=1}^{N}f(x,y)\\ \nonumber m_{10}= & {} \sum _{x=1}^{N}\sum _{y=1}^{N}xf(x,y)\\ \nonumber m_{01}= & {} \sum _{x=1}^{N}\sum _{y=1}^{N}yf(x,y) \nonumber \end{aligned}$$
(4)

The centroid is computed as follows:

$$\begin{aligned} \bar{x}=\frac{m_{10}}{m_{00}} \end{aligned}$$
(5)
$$\begin{aligned} \bar{y}=\frac{m_{01}}{m_{00}} \end{aligned}$$
(6)

In order to make descriptors invariant to translation, we compute the central moments which are the moments about the image centroid (see Eq. 7).

$$\begin{aligned} \mu _{p,q}=\sum _{x=1}^{N}\sum _{y=1}^{N}(x-\bar{x})^{p}(y-\bar{y})^{q}f(x,y) \end{aligned}$$
(7)

Other descriptors are the lengths of the principal axes, which can be computed from the eigenvalues of the inertia matrix I (the covariance matrix [9]); it is composed of the central moments \(\mu _{11}\), \(\mu _{20}\) and \(\mu _{02}\).

$$I= \left( \begin{array}{cc} \mu _{20} &{} \mu _{11} \\ \mu _{11} &{} \mu _{02} \\ \end{array} \right) $$

These eigenvalues are the principal moments and are given by

$$\begin{aligned} \lambda _{1,2}=\frac{(\mu _{20}+\mu _{02})\pm \sqrt{\mu _{20}^{2}+\mu _{02}^{2}-2\mu _{20}\mu _{02}+4\mu _{11}^{2}}}{2} \end{aligned}$$
(8)
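As an illustration, the following NumPy sketch computes the centroid, the second-order central moments, and the eigenvalues of Eq. 8 for a binary spot, then converts them into ellipse parameters. The conversion of eigenvalues to semi-axis lengths (the factor \(2\sqrt{\lambda /m_{00}}\), which gives an ellipse with the same second moments as the region) and the orientation formula are standard choices of ours, not taken from the paper:

```python
import numpy as np

def ellipse_from_moments(mask):
    """Fit an ellipse to a binary spot via image moments (sketch).
    Returns the centroid, semi-axis lengths, and orientation angle."""
    ys, xs = np.nonzero(mask)            # pixel coordinates of the spot
    m00 = xs.size                        # zeroth-order moment = area
    xbar, ybar = xs.mean(), ys.mean()    # centroid from first-order moments
    # Second-order central moments (Eq. 7)
    mu20 = ((xs - xbar) ** 2).sum()
    mu02 = ((ys - ybar) ** 2).sum()
    mu11 = ((xs - xbar) * (ys - ybar)).sum()
    # Principal moments: eigenvalues of the inertia matrix (Eq. 8)
    root = np.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    lam1 = (mu20 + mu02 + root) / 2
    lam2 = (mu20 + mu02 - root) / 2
    # Ellipse with the same second moments as the region
    a = 2 * np.sqrt(lam1 / m00)          # semi-major axis
    b = 2 * np.sqrt(lam2 / m00)          # semi-minor axis
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)  # major-axis orientation
    return (xbar, ybar), (a, b), theta
```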

4 Experiments

We tested our proposal with real images taken in Michoacan, Mexico using a trap-camera; Fig. 2 shows an original image (Fig. 2(a)) and the segmented one (Fig. 2(b)). These images were partially segmented using the technique described in [4]; afterwards, they were delimited by hand, since segmentation is not the aim of this paper.

Fig. 2. Ocelot used for our tests.

Table 1 shows both photographs of the ocelots used (the first row corresponds to ocelot 1 and the second row to ocelot 2); each column shows these images after a filter was applied. The first column is for the Canny filter, and the second column is for the Sobel filter. For both filters we used MATLAB's implementation with a threshold value between 0.07 and 0.08. Notice that the images filtered with Canny have more detected edges than those filtered with Sobel. For the next step we use the Canny-filtered images.

Table 1. Comparison between the use of Canny filters and Sobel filters. MATLAB functions with threshold between 0.07 and 0.08.
Table 2. Spots selected for testing our proposal. The spot enclosed in yellow appears in both images, spots in red appear in only one of the ocelots, and spots in blue seem to be closed but are not.
Table 3. Spot 1 for ocelots 1 (top) and 2 (bottom) in column 1; spot 2 for ocelots 1 (top) and 2 (bottom) in column 3.

4.1 Selected Patterns

We designed a controlled experiment to verify that the proposal introduced here works properly. In this experiment we used some spots from both ocelots; the spots were automatically chosen on the basis that their contours were completely closed by the edge detection algorithms we used (we tried both the Canny and Sobel filters for this purpose). See Table 2: these images were processed with MATLAB (function imfill); the first row corresponds to ocelot 1, and the second row to ocelot 2. The spot enclosed in yellow appears in both images, spots in red appear in only one of the ocelots, and spots in blue seem to be closed but are not. Finally, the second column shows the spots selected to test our proposal.
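A hedged Python sketch of this selection step follows, using SciPy's binary_fill_holes as an analogue of MATLAB's imfill (the function name and thresholds are ours): only completely closed contours enclose interior pixels, so filling holes in the edge map and removing the edge pixels themselves leaves exactly the closed spots.

```python
import cv2
import numpy as np
from scipy import ndimage

def closed_spots(gray, t1=50, t2=120):
    """Keep only the spots whose contour is completely closed by edge
    detection. Filling holes changes nothing for open contours, so the
    interior pixels that appear identify the closed spots."""
    edges = cv2.Canny(gray, t1, t2) > 0
    filled = ndimage.binary_fill_holes(edges)
    interior = filled & ~edges            # pixels enclosed by a closed contour
    labels, n = ndimage.label(interior)   # one label per closed spot
    return labels, n
```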

Some examples of feature extraction for the selected spots can be seen in the figures of Table 3. Notice how well the ellipses fit the closed shapes.

5 Discussion

A comparison between the two methods we used for finding the ellipses is shown in the figures of Table 4; the first row corresponds to the Geometric construction method and the second row to the Image moments method. The ellipses from the spots of the ocelot in Photo 1 should match the corresponding ellipses of the ocelot in Photo 2, since both photographs show the same ocelot; in this controlled experiment the spots of Photo 1 were manually paired with the corresponding spots in Photo 2. The ellipses should therefore be very similar both in size (i.e. major and minor axis lengths) and in orientation angle. We conclude that the Image moments method outperforms the Geometric construction method; in particular, the orientation angle is almost identical when using the Image moments method (second row of Table 4).

Table 4. Comparison of some details of spots as ellipses, Geometric method (first row), Moments method (second row).

6 Conclusions

In this paper, we proposed a feature extraction method for the fur patterns of animals using ellipses. We used two methods to determine the ellipses that represent closed shapes: one is Rytz's construction and the other uses image moments. We found that the method based on image moments produces the best results. We applied the image moments method to two ocelot images; the results are shown in Table 5. The first column shows the ocelot images; in the second column each closed shape or spot was automatically replaced by an ellipse; finally, in the third column each ellipse was replaced by a unit vector located at the center of the ellipse. These vectors will be used in our future work, where we will compare one set of vectors against another to decide whether both sets correspond to the same ocelot.

Table 5. Feature extraction process.
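As a sketch of the intended vector representation, under our own assumption that the unit vector encodes the orientation of the ellipse's major axis (the paper does not specify the encoding):

```python
import numpy as np

def ellipse_to_unit_vector(center, theta):
    """Replace an ellipse by a unit vector anchored at its center. We assume,
    purely as an illustration, that the vector points along the major axis
    (angle theta); the exact encoding is left to the paper's future work."""
    direction = np.array([np.cos(theta), np.sin(theta)])
    return np.asarray(center, dtype=float), direction
```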