Elsevier

Pattern Recognition

Volume 40, Issue 9, September 2007, Pages 2521-2529

Use of power law models in detecting region of interest

https://doi.org/10.1016/j.patcog.2007.01.004

Abstract

In this paper, we address the semantic extraction of regions of interest. The proposed approach is based on statistical methods and models inspired by linguistic analysis, namely Zipf law and inverse Zipf law. These models describe the frequency of appearance of the patterns contained in images as power law distributions, which makes it possible to characterize the structural complexity of image textures. This complexity measure indicates a perceptually salient region in the image. The image is first partitioned into sub-images that are to be compared. Zipf law or inverse Zipf law is applied to each sub-image, and the sub-images are classified according to the characteristics of the resulting power law models. The classification method represents the characteristics of the Zipf and inverse Zipf models of each sub-image as a point in a representation space in which a clustering process is performed. Our method detects regions of interest that are consistent with human perception; inverse Zipf law is particularly significant. The method performs well compared with more classical detection methods. Alternatively, a neural network can be used for the classification phase.

Introduction

Automatic detection of regions of interest in images is one of the most critical problems in computer vision. For a human observer, detecting a perceptually important region in an image is a natural task which is done instantaneously, but for a machine it is far more difficult, as the machine lacks the cultural references and knowledge needed to identify the content of the scene. One cause of this difficulty is the subjective nature of the notion of region of interest (ROI). In the most general sense, a ROI, as its name suggests, is a part of the image in which the observer shows interest. Of course, this interest is determined not only by the image itself, but also by the observer's own sensitivity. For a given image, different people could find different regions of interest. However, in most cases, regions of interest have features that are visually and structurally distinct from the rest of the image. Some structural characteristics can therefore be used to detect the ROI of an image without making hypotheses about the semantic content of the picture. Detecting the ROI consists in finding a region of the image which appears different from the background with respect to low-level features such as contrast, colour, region size and shape, distribution of contours or texture pattern. Different methods have been proposed to detect regions of interest in an image. Some are based on models of low-level human vision, such as the method proposed by Osberger and Maeder [1], which detects perceptually important regions in the image by building importance maps based on various visual characteristics. The method proposed by Itti et al. [2] is based on multiscale centre-surround contrast, and the method presented by Syeda-Mahmood [3] uses a segmentation into homogeneous colour regions.
Other methods are based on different structural characteristics of the image without explicit reference to human vision. The method proposed by Di Gesu et al. [4] uses symmetry transforms, Stentiford [5] uses dissimilarities in local neighbourhoods, Kadir and Brady [6] use a local measure of entropy to detect salient features in an image, Wang et al. [7] use a wavelet transform and Carlotto and Stein [8] use a fractal model.

In this paper, we use a new approach based on a statistical model which originates in linguistic analysis: Zipf law. This model has proved effective for representing the distribution of word frequencies in the form of a power law. It was discovered by Zipf [9] in 1949 in written English texts, and Zipf distributions have been found in every natural language. Similar power laws have been observed in different domains of human and natural sciences, such as income distribution by Pareto [10] and Gibrat [11], population of cities by Zipf [12] and Gabaix [13], distribution of biological species by Yule [14], distribution of galactic voids by Gaite and Manrubia [15], Internet traffic by Breslau et al. [16] and Huberman et al. [17], the structure of music by Manaris et al. [18] and natural noises by Dellandrea et al. [19]. In the domain of digital images, Zipf law has been used by Vincent et al. [20] to evaluate the distortion of compressed images and by Caron et al. [21] to detect artificial objects in natural environments. The method described in this paper aims to detect the regions of interest in various digital images. A potential application is automatic determination of the ROI for JPEG2000 compression. Some restrictive constraints have been assumed to make processing possible. We assume that the ROI to be detected is a single connected region in the image. The ROI must both have a significant size and appear different from the background with respect to its structural complexity. Most often, the ROI belongs to the foreground of the scene and contains more details than the background, but our method also handles the case where the ROI is more uniform than the background.

The power law model will first be recalled, and then we will present the adaptation that is needed to apply it to image analysis. Then, the detection methods based on Zipf and inverse Zipf laws will be presented, and the influence of initial image segmentation will be discussed.

Section snippets

Power law models

The power law distribution model known as Zipf law, or rank-frequency law, was determined empirically by Zipf. According to Zipf law, in a topologically ordered set of symbols such as a text, the frequency of occurrence of the different symbol patterns, for example the words of the text, follows a power law. To be more precise, the patterns that occur in the signal are n-tuples of symbols. Let us denote these patterns by $P_1, P_2, \ldots, P_R$. The frequencies of those patterns depend on the signal,
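As an illustration of the rank-frequency law recalled in this section, the following Python sketch sorts the frequencies of the patterns in decreasing order and estimates the exponent of the power law by a least-squares fit in log-log coordinates. It assumes the usual form of Zipf law, in which the frequency of the pattern of rank r is proportional to r^(-a). The function names (`rank_frequency`, `loglog_slope`, `zipf_exponent`) and the choice of an ordinary least-squares fit are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def rank_frequency(symbols):
    """Return the pattern frequencies sorted in decreasing order (rank 1 first)."""
    return sorted(Counter(symbols).values(), reverse=True)


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x); all values must be > 0."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx


def zipf_exponent(symbols):
    """Estimate a in frequency(rank) ~ rank**(-a) (illustrative estimator)."""
    freqs = rank_frequency(symbols)
    ranks = range(1, len(freqs) + 1)
    return -loglog_slope(ranks, freqs)


# Synthetic signal whose r-th most frequent word occurs 120 / r times.
words = [f"w{r}" for r in range(1, 6) for _ in range(120 // r)]
print(rank_frequency(words))
print(zipf_exponent(words))
```

On real data the log-log plot is rarely a perfect line, so the fitted exponent should be read as a summary of the overall trend rather than an exact parameter.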

Application to images

Some years ago, Lindsey and Strömberg [28] used the frequencies of simple features. In this study, the approach is founded on a law that has been verified in different fields. We show that both Zipf law and inverse Zipf law can be used for image analysis. To adapt the model to image analysis, it is necessary to define the equivalent of the notion of word in the case of images. In order to respect the usual topology that structures the plane of the image, we will use image patterns defined as

Detection using Zipf law

The ROI of the image can be defined as a part of the image whose structural content differs from the background of the picture. Power law models can provide an indication of the structural content of the picture; however, they provide no spatial information by themselves. Therefore, we introduce a more local study followed by a global analysis of the local results. In order to reveal some local information, it is necessary to divide the image into sub-images as shown in Fig. 5. In each of
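The partition into sub-images described above can be sketched as follows. This is only an illustration under stated assumptions: the image is a grey-level image stored as a list of rows, the grid is regular, and the patterns are overlapping 3×3 neighbourhoods of quantised grey levels. The pattern definition, the parameters `levels`, `size` and `max_value`, and all function names are hypothetical choices made for the example, since the snippet above does not give the paper's exact definitions.

```python
from collections import Counter


def split_into_blocks(image, rows, cols):
    """Yield ((r, c), block) for a rows x cols grid over a 2-D list of pixels."""
    h, w = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            top, bottom = r * h // rows, (r + 1) * h // rows
            left, right = c * w // cols, (c + 1) * w // cols
            yield (r, c), [row[left:right] for row in image[top:bottom]]


def pattern_counts(block, levels=4, size=3, max_value=255):
    """Count size x size neighbourhoods of quantised grey levels (assumed pattern)."""
    counts = Counter()
    if not block or len(block[0]) < size:
        return counts
    # Quantise grey levels so that similar neighbourhoods map to the same pattern.
    q = [[min(p * levels // (max_value + 1), levels - 1) for p in row]
         for row in block]
    for y in range(len(q) - size + 1):
        for x in range(len(q[0]) - size + 1):
            key = tuple(tuple(q[y + dy][x:x + size]) for dy in range(size))
            counts[key] += 1
    return counts


def block_rank_frequencies(image, rows, cols):
    """Map each grid cell to its pattern frequencies in decreasing order."""
    return {
        cell: sorted(pattern_counts(block).values(), reverse=True)
        for cell, block in split_into_blocks(image, rows, cols)
    }


# Left half uniform, right half made of rows with changing grey levels.
demo = [[0, 0, 0] + [(i % 4) * 64] * 3 for i in range(6)]
print(block_rank_frequencies(demo, rows=1, cols=2))
```

In this toy image the uniform half yields a single repeated pattern while the other half yields only distinct patterns, which is the kind of difference in pattern distribution that a per-sub-image analysis is meant to expose.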

Detection using inverse Zipf law

Like Zipf law, inverse Zipf law can be used to detect regions of interest in an image. The image is again partitioned into sub-images, and the characteristics of the inverse Zipf plot corresponding to each sub-image are represented graphically by a point in a graph. The significant characteristics of an inverse Zipf plot that we use are its slope and the number of patterns which appear only once in the picture. On the graphical representation, the horizontal coordinate represents the slope and
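The two characteristics named above, the slope of the inverse Zipf plot and the number of patterns that appear only once, can be computed from a table of pattern counts as in the sketch below. It assumes that the inverse Zipf plot gives, for each frequency f, the number I(f) of distinct patterns occurring exactly f times, and that the slope is estimated by least squares in log-log coordinates. The function names and the fitting procedure are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def inverse_zipf_plot(pattern_counts):
    """Map each frequency f to the number of distinct patterns seen exactly f times."""
    return dict(sorted(Counter(pattern_counts.values()).items()))


def inverse_zipf_features(pattern_counts):
    """Return (slope, singletons) for one sub-image.

    slope is the least-squares slope of log I(f) against log f, or None when
    fewer than two distinct frequencies are present.
    """
    plot = inverse_zipf_plot(pattern_counts)
    singletons = plot.get(1, 0)  # patterns that appear only once
    if len(plot) < 2:
        return None, singletons
    lx = [math.log(f) for f in plot]
    ly = [math.log(n) for n in plot.values()]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx, singletons


# Synthetic counts: 8 patterns seen once, 4 seen twice, 2 seen 4 times, 1 seen 8 times.
counts = {f"p{f}_{i}": f for f, n in [(1, 8), (2, 4), (4, 2), (8, 1)] for i in range(n)}
print(inverse_zipf_plot(counts))
print(inverse_zipf_features(counts))
```

Each sub-image then contributes one (slope, singletons) pair, which can be plotted as a point in the representation space used for the clustering step.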

Influence of initial segmentation

The size of the ROI depends on the size of the whole image. Here we are going to study the influence of the image size on the characteristics of its pattern distribution. Indeed, the size of the sub-images must be chosen properly. They must be large enough to present a statistically significant pattern distribution but also be small enough to allow a precise determination of the ROI. As an image always aims to show something, and to bring some information to the observer, we have chosen to

Experimental results

The method has been tested on a database of 100 digital photographic images, each containing a distinct ROI which has been determined by a human observer. The images have been gathered from different public image databases. The images are uncompressed or mildly compressed and their size varies from 256×256 to 2000×1500. The images have been segmented into the optimal number of sub-images according to their size. The detection is considered to be successful if the subjective ROI is included in the region

Conclusion

Power law models such as Zipf and inverse Zipf laws can be used to characterize the structural complexity of an image. Our former nonconstrainable hypotheses show that a method based on those models can automatically detect the ROI of an image, provided that this ROI appears sufficiently distinct from the background. Both Zipf law and inverse Zipf law can be used to detect the ROI; however, the method based on inverse Zipf law produces better results for our application. The method

About the Author: YVES CARON received his Ph.D. in Computer Science in 2004 from the University of Tours and now works in a private company in the field of image analysis. He has worked on different applications of Zipf law in image analysis.

References (28)

  • V. Di Gesu et al., Local operators to detect regions of interest, Pattern Recognition Lett. (1997)
  • C.S. Lindsey et al., Image classification using the frequencies of simple features, Pattern Recognition Lett. (2000)
  • W. Osberger, A.J. Maeder, Automatic identification of perceptually important regions in an image, in: 11th...
  • L. Itti et al., A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
  • T.F. Syeda-Mahmood, Data and model-driven selection using colour regions, Int. J. Comput. Vision (1997)
  • F. Stentiford, An estimator for visual attention through competitive novelty with application to image compression, in:...
  • T. Kadir et al., Saliency, scale and image description, Int. J. Comput. Vision (2001)
  • J.Z. Wang et al., Unsupervised multiresolution segmentation for images with low depth of field, IEEE Trans. Pattern Anal. Mach. Intell. (2001)
  • M.J. Carlotto et al., A method for searching artificial objects on planetary surfaces, J. Br. Interplanet. Soc. (1990)
  • G.K. Zipf, Human Behavior and the Principle of Least Effort (1949)
  • V. Pareto, Cours d’économie politique (1897)
  • R. Gibrat, Les Inégalités Économiques (1931)
  • B.M. Hill, Zipf's law and prior distributions for the composition of a population, J. Am. Statist. Assoc. (1970)
  • X. Gabaix, Zipf's law for cities: an explanation, Q. J. Econ. (1999)

Cited by (22)

    • A review on the characterization of signals and systems by power law distributions

      2015, Signal Processing
      Citation Excerpt :

      Researchers plotted a log–log chart of the decreasing frequencies of the patterns vs. their rank, and concluded that Zipf law was a good fit, despite the fact that two linear zones appeared in the graph. Caron et al. [10] concluded that both Zipf and the inverse Zipf laws could be used to detect ROI in images, since the differences observed in the log–log graphs of these distributions helped to understand better the details in an image. In Fig. 2, we observe that the cluster {EQ,FF,TO} corresponds to natural phenomena, and the cluster {IP,ID,AC,LC,RP,TE,TA} reflects cases involving human activities.

    • Visual search: A large-scale perspective

      2013, Handbook of Statistics
      Citation Excerpt :

      In this section, we will motivate why we need large data for many interesting problems in computer vision. When dealing with large amounts of categorical data (such as words in dictionary) or distribution of integrated quantities (such as an area of connected components, count of patterns of pixels (Caron et al., 2007)), it has been empirically observed that they follow Zipf’s law (Newman, 2005). Zipf’s law is a discrete power law probability distribution, that is long tailed.

    • A saliency map based on sampling an image into random rectangular regions of interest

      2012, Pattern Recognition
      Citation Excerpt :

      Several approaches to saliency which employ power laws have thus been proposed. A Zipf's law based saliency map has been proposed in [23]. This work is similar to those of [21,22], except that it operates directly on the pixels rather than on the spectral space.

    • A review of power laws in real life phenomena

      2012, Communications in Nonlinear Science and Numerical Simulation
      Citation Excerpt :

      They obtained a linear trace, concluding that this distribution fitted the number of distinct patterns, with respect to their frequencies. Caron et al. [20] concluded that both Zipf and the inverse Zipf laws could be used to detect ROI in image, since the differences observed in the log–log graphs of these distributions helped to understand better the details in an image. This section investigates the cumulative distributions of several random variables of different kinds.

    • Transition thresholds and transition operators for binarization and edge detection

      2010, Pattern Recognition
      Citation Excerpt :

      They divide a gray-intensity image into several regions and decide how to binarize each region. (2) Caron et al. [3] detect region of interest characterizing each pixel with a gray-intensity template of 3×3, the frequency of which appears to obey a power law distribution. (3) Milewski et al. [4] presented a methodology for separating handwritten foreground pixels, from background in carbon copied medical forms.

    About the Author: PASCAL MAKRIS is currently working as Associate Professor in the Laboratoire d’Informatique at the University of Tours, where he received his Ph.D. He is a member of the Reconnaissance de Formes et Analyse d’Image team (RFIA). He is interested in the use of power laws in the study of different phenomena, in particular sound and image. He works particularly in the field of medical images.

    About the Author: NICOLE VINCENT has been a full Professor since 1996. She presently heads the Center of Research in Computer Science (CRIP5) and the team Systèmes Intelligents de Perception (SIP) at the university Paris Descartes–Paris 5. After studying at the Ecole Normale Supérieure and graduating in Mathematics, Nicole Vincent received a Ph.D. in Computer Science in 1988 from Lyon Insa. She has been involved in several projects in pattern recognition, signal and image processing and video analysis. Her research interests concern environmental perception by vision in control, security and medical applications.
