Texture representation and retrieval using the causal autoregressive model

doi:10.1016/j.jvcir.2010.04.004

Journal of Visual Communication and Image Representation

Volume 21, Issue 7, October 2010, Pages 651-664

https://doi.org/10.1016/j.jvcir.2010.04.004

Abstract

In this paper, we revisit the well-known autoregressive (AR) model as a texture representation model, considering causal neighborhoods. We first define the AR model and briefly discuss the parameter estimation process. We then present the synthesis algorithm and show some experimental results. Next, we propose and discuss a perceptual interpretation of the estimated AR parameters; in particular, we propose a computational measure of the degree of randomness/regularity of textures. Finally, the set of estimated parameters is applied in content-based image retrieval (CBIR) to model texture content, and experimental results are shown. Benchmarking on the well-known Brodatz database, using precision/recall measures, shows interesting results.

Introduction

Among the statistical models that have been used to model texture, random field models such as the autoregressive (AR) model have been among the most successful. Such models are generally characterized by a set of parameters that must be estimated, using methods such as least squares error (LSE) or maximum likelihood (ML). These estimated parameters are considered to represent the content of the texture. Another important property of the autoregressive model is its forecasting ability: with this model, the grey-level value of a pixel is predicted from the grey-level values of the pixels in its neighborhood.

The autoregressive (AR) model has been used in various works on texture classification and segmentation [11], [14], [17], [19], on texture synthesis [18] and, more rarely, on texture retrieval [12]. Sondge [18] studied different models, among them the AR model, in order to synthesize textures. He considered the simultaneous autoregressive (SAR) model as well as the separable model, which is a particular case of the SAR model. Kashyap and Chellappa [11] proposed a circular autoregressive (CAR) model in order to take rotation invariance into account. Mao and Jain [14] proposed a multiresolution SAR model (MRSAR), which uses multiple resolutions in order to capture different scales of the texture. They also proposed a rotation-invariant simultaneous autoregressive model (RISAR). In the framework of image retrieval, Liu and Picard [12] used the MRSAR model proposed by Mao and Jain for texture retrieval. They benchmarked the MRSAR model against the Wold model; their results show that the MRSAR model performs well, although the Wold model gives slightly better results.

Given their nature as random fields, such models generally work better for random textures than for regular textures. A common problem with these models is the choice of the neighborhood within which pixels are considered to be related to each other. A priori, it is better to choose a small neighborhood for a fine texture and a large neighborhood for a coarse texture. In practice, however, large neighborhoods raise another problem: due to the averaging effect [14], the discrimination power of the estimated parameters tends to decrease as the neighborhood grows. This phenomenon is related to the curse of dimensionality, as reported by Mao and Jain [14]. Such models therefore work better for fine textures. This is also linked to the randomness of textures: coarse textures are generally perceived as regular while fine textures are perceived as random and, as already mentioned, the autoregressive model works better, by its very nature, with random textures.

Another problem associated with such models is that their efficiency degrades with large neighborhoods. The set of estimated parameters grows as the neighborhood becomes larger, and the larger the set of estimated parameters, the higher the computational cost of extracting them.

In this paper, we propose to reconsider the SAR model in its causal version, without considering multiple resolutions. Causality is not a natural constraint in a two-dimensional (2D) space. However, it simplifies both the model and the synthesis process. Causal neighborhoods also reduce the number of parameters, which makes parameter extraction more efficient. We consider both non-symmetric half-plane (NSHP) and quarter-plane (QP) neighborhoods (Fig. 1). In addition, we consider small neighborhoods, which reduces both the number of parameters and the computational cost of extracting them. We show that, when such a model (causal SAR with a small neighborhood) is applied in content-based image retrieval, the loss in search effectiveness (relevance) is acceptable given the substantial gain in search efficiency. Furthermore, we propose a new perceptual interpretation of the estimated parameters. Such an interpretation is easily understood by users and allows them to better formulate their queries and to better understand the results returned by the model.
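
To make the two causal supports concrete, the following Python sketch enumerates the neighbor offsets of a QP and an NSHP neighborhood of order (p, q). The function names and the exact offset conventions are assumptions made for illustration, since Fig. 1 is not reproduced here; offsets are (row, column) displacements relative to the current pixel.

```python
def qp_offsets(p, q):
    """Quarter-plane support (assumed convention): the p rows above and
    q columns to the left of the pixel, excluding the pixel itself."""
    return [(-di, -dj)
            for di in range(p + 1)
            for dj in range(q + 1)
            if (di, dj) != (0, 0)]


def nshp_offsets(p, q):
    """Non-symmetric half-plane support (assumed convention): the q previous
    pixels on the current row, plus columns -q..q on each of the p rows above."""
    same_row = [(0, -dj) for dj in range(1, q + 1)]
    rows_above = [(-di, dj)
                  for di in range(1, p + 1)
                  for dj in range(-q, q + 1)]
    return same_row + rows_above


if __name__ == "__main__":
    for order in [(1, 1), (2, 2)]:
        print(order, len(qp_offsets(*order)), len(nshp_offsets(*order)))
```

Under these conventions, order (1, 1) gives 3 neighbors for QP and 4 for NSHP, while order (2, 2) already gives 8 and 12, which illustrates how quickly the parameter set grows with the neighborhood.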

The rest of the paper is organized as follows. In Section 2, we define the model, briefly discuss the parameter estimation scheme and present the main steps of the synthesis process. In Section 3, we present an evaluation methodology and experimental results that show the ability of the causal SAR model to capture texture content, as well as its efficiency. In Section 4, we propose a perceptual interpretation of the estimated parameters and, based on these parameters, a computational measure of the randomness of textures. In Section 5, the causal model, with both QP and NSHP neighborhoods, is applied in content-based image retrieval [1], [2], [6] and compared to the MRSAR version used by Liu and Picard [12] in terms of search relevance as well as efficiency. Finally, in Section 6, we conclude and briefly describe future investigations related to this work.

Section snippets

Definition

The simultaneous (2D) autoregressive (SAR) model is defined as follows: $(X_{s} - \mu) = a_{s} W_{s} + \sum_{r \in \Omega^{+}} a_{r} (X_{s + r} - \mu)$ where $s$ corresponds to position $(i, j)$ on rows and columns, $X_{s}$ is the grey-level at position $s$, $\Omega^{+}$ is the neighborhood on rows and columns of $X_{s}$ (excluding $X_{s}$ itself), $\Omega = \Omega^{+} \cup \{s\}$, $\mu$ is the local grey-level average in the neighborhood $\Omega$ and $[a_{s}, a_{r}, r \in \Omega^{+}]$ are the parameters of the model to be estimated.
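
As an illustration of how the parameters of this model might be obtained, the sketch below fits the a_r by ordinary least squares on the centered grey-levels, using only the Python standard library. It is a minimal sketch under stated assumptions, not the estimation scheme of the paper: the global image mean stands in for the local average μ, a_s is taken as the standard deviation of the prediction residual, and the names `solve_linear` and `estimate_causal_ar` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def estimate_causal_ar(image, offsets):
    """Least-squares estimate of the a_r for the given causal offsets.

    Returns (coeffs, noise_std, mu). Assumption: mu is the global mean,
    and noise_std (playing the role of a_s) is the residual std."""
    rows, cols = len(image), len(image[0])
    mu = sum(map(sum, image)) / (rows * cols)
    k = len(offsets)
    A = [[0.0] * k for _ in range(k)]   # normal-equation matrix Z^T Z
    b = [0.0] * k                       # right-hand side Z^T y
    samples = []
    for i in range(rows):
        for j in range(cols):
            # Use only pixels whose whole neighborhood lies inside the image.
            if all(0 <= i + di < rows and 0 <= j + dj < cols for di, dj in offsets):
                z = [image[i + di][j + dj] - mu for di, dj in offsets]
                y = image[i][j] - mu
                samples.append((z, y))
                for u in range(k):
                    b[u] += z[u] * y
                    for v in range(k):
                        A[u][v] += z[u] * z[v]
    coeffs = solve_linear(A, b)
    sse = sum((y - sum(a * zu for a, zu in zip(coeffs, z))) ** 2 for z, y in samples)
    noise_std = (sse / len(samples)) ** 0.5
    return coeffs, noise_std, mu
```

With the QP offsets [(0, -1), (-1, 0), (-1, -1)], the call `estimate_causal_ar(image, offsets)` returns three coefficients, so each texture is summarized by a very short feature vector.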

W_s is a Gaussian white noise, a stationary signal made of non-correlated random variables,

Evaluation criteria

We evaluated both the causal NSHP AR and the causal QP AR models for several orders. This evaluation concerns two main aspects: 1. the ability of the model to capture the texture content of images; 2. the efficiency of the model. The ability of the model to capture texture content was measured by a qualitative criterion and a quantitative criterion:

• The qualitative criterion used was visual comparison between the synthesized images and the corresponding original images.
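
The synthesized images used in such a visual comparison can be generated by running a causal AR model forward in raster-scan order, since every causal neighbor has already been computed when a pixel is visited. The sketch below is only an illustration under stated assumptions: the function name, the zero treatment of neighbors outside the image, and the clipping to the 0..255 grey-level range are choices made here, not details taken from the paper.

```python
import random


def synthesize_causal_ar(rows, cols, offsets, coeffs, noise_std, mu, seed=0):
    """Generate a grey-level image from a causal AR model in raster order.

    offsets : causal (row, col) displacements, e.g. [(0, -1), (-1, 0), (-1, -1)]
    coeffs  : one coefficient a_r per offset
    noise_std : scale of the Gaussian white noise term (role of a_s)
    mu      : mean grey-level added back after synthesis
    """
    rng = random.Random(seed)
    centered = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            value = noise_std * rng.gauss(0.0, 1.0)
            for (di, dj), a in zip(offsets, coeffs):
                ni, nj = i + di, j + dj
                # Assumption: neighbors outside the image contribute zero.
                if 0 <= ni < rows and 0 <= nj < cols:
                    value += a * centered[ni][nj]
            centered[i][j] = value
    # Add the mean back and clip to the usual 8-bit grey-level range.
    return [[min(255, max(0, round(mu + v))) for v in row] for row in centered]


if __name__ == "__main__":
    texture = synthesize_causal_ar(
        64, 64, [(0, -1), (-1, 0), (-1, -1)], [0.4, 0.3, -0.1], 20.0, 128.0)
    print(len(texture), len(texture[0]))
```

Feeding this function the parameters estimated from an original texture yields a synthetic image that can be placed next to the original for the visual comparison.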

Perceptual interpretation

Recall that the parameters [a_r, r ∈ Ω⁺] are estimated from the covariance matrix computed in the considered neighborhood Ω⁺. Each of the estimated parameters can then be seen as the correlation between the pixel corresponding to this parameter and the pixel of interest (0, 0): parameter a(0, −1) represents the correlation of pixel (0, 0) with pixel (0, −1), parameter a(−1, 0) represents the correlation of pixel (0, 0) with pixel (−1, 0), parameter a(−1, −1) represents the correlation of pixel (0, 0) with

Application to content-based image retrieval

In the rest of this paper we use the following notations:

– QP, QP-V: QP denotes the autoregressive model with a quarter-plane neighborhood and with order (1, 1) while QP-V denotes the same model except that each parameter was weighted by the inverse of its variance.
– NSHP, NSHP-V: NSHP denotes the autoregressive model with a non-symmetric half-plane neighborhood and with order (1, 1) while NSHP-V denotes the same model except that each parameter was weighted by the inverse of its variance.
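
The difference between the plain and the "-V" variants can be illustrated with a small retrieval sketch in Python. It is only a sketch under assumptions: the squared-difference distance, the use of the per-parameter variance computed over the whole database, and all function names are hypothetical choices made here; the paper's actual similarity measure is not shown in this excerpt. Precision and recall are computed in the usual way at a given cutoff.

```python
def parameter_variances(features):
    """Per-parameter variance over all feature vectors in the database."""
    n, k = len(features), len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(k)]
    return [sum((f[d] - means[d]) ** 2 for f in features) / n for d in range(k)]


def weighted_distance(query, candidate, variances=None, eps=1e-12):
    """Sum of squared parameter differences, each divided by its variance.

    With variances=None all weights are 1 (plain QP / NSHP variant);
    passing the variances gives the inverse-variance ('-V') variant."""
    if variances is None:
        variances = [1.0] * len(query)
    return sum((q - c) ** 2 / max(v, eps)
               for q, c, v in zip(query, candidate, variances))


def rank(query, database, variances=None):
    """Return database keys sorted from most to least similar to the query."""
    return sorted(database,
                  key=lambda name: weighted_distance(query, database[name], variances))


def precision_recall(ranked, relevant, cutoff):
    """Precision and recall over the first `cutoff` retrieved items."""
    hits = sum(1 for name in ranked[:cutoff] if name in relevant)
    return hits / cutoff, hits / len(relevant)
```

Dividing by the variance keeps a parameter with a wide range across the database from dominating the distance, which is the usual motivation for this kind of weighting.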

In the next

Summary and conclusions

In this paper, we have considered the causal AR model with QP and NSHP neighborhoods. We have briefly defined the model, the parameter estimation scheme and the synthesis process before showing experimental results and an evaluation in terms of the ability of the model to capture texture content and also in terms of efficiency.

We have also proposed a perceptual interpretation of the estimated parameters and shown experimental results that support this interpretation: the sum (or mean) of the

References (19)

J. Mao et al., Texture classification and segmentation using multiresolution simultaneous autoregressive models, Pattern Recognition (1992)
R. Datta et al., Image retrieval: ideas, influences, and trends of the new age, ACM Computing Surveys (2008)
M. Lew et al., Content-based multimedia information retrieval: state of the art and challenges, ACM Transactions on Multimedia Computing, Communications, and Applications (2006)
N. Abbadeni, Perceptual interpretation of the estimated parameters of the AR model, IEEE ICIP, Genoa, Italy,...
N. Abbadeni, A new similarity matching measure: application to texture-based image retrieval, in: Proceedings of the...
N. Abbadeni, Recherche d’images basée sur leur contenu, Représentation de la texture par le modèle autorégressif,...
A. Del Bimbo, Visual Information Retrieval (1999)
P. Brodatz, Textures: A Photographic Album for Artists and Designers (1966)
R.T. Frankot et al., Lognormal random-field models and their applications to radar image synthesis, IEEE Transactions on Geoscience and Remote Sensing (1987)

There are more references available in the full text version of this article.

^☆: Part of this work was done while the author was with University of Sherbrooke – Canada and with Al-Ain University of Science and Technology – UAE.
