Statistics and category systems for the shape index descriptor of local 2nd order natural image structure
Introduction
The notions of local image structure and image features are as old as the field of computer vision itself. Both topics have always attracted much attention, and the literature is vast. Although much progress has been made, complete lists of local structure types or features in, say, natural images still seem elusive, and most feature-based approaches are constrained by the availability of feature detectors more or less suited to the task at hand. Our aim with this paper is to investigate the feature types governed by the 2nd order structure of natural images. We continue this introduction with a short review of the ideas that have dominated the way features are conceived, describe our overall approach, and summarise how we choose to measure image structure quantitatively.
Local image structure in its simplest form can be described through the pixels in a neighbourhood around a point, typically in the form of a patch of image pixels. Although pure patch-based descriptions have recently had a revival [52], [8], [58], [1], a next step has been to apply a bank of filters to the image patches to facilitate further analysis; often this is realised without explicit patch extraction by convolving the filters with the entire image. Many collections of filters have been proposed over time: Gabor filters, wavelets, and derivatives of Gaussians [24], [40], [57], [25]. The output of such a filter bank is typically a response vector in an N-dimensional response space. Many vision algorithms use these quantitative descriptions of local image structure directly and with great success [4], [61], [11], [7]. Nevertheless, it remains an intriguing question whether local image structure can be described through a limited alphabet of qualitatively distinct letters or feature types. Filter responses have been used as the basis for a number of feature detectors [41], [3], [20], [38], [39]; all of these are, however, results of conscious design choices and do not bring us much closer to answering the posed question.
The main aim of this work is to investigate whether such an alphabet naturally arises from geometrical and statistical studies of a simple model of early visual processing (a filter bank) exposed to natural image stimuli. We adopt the general notion that an alphabet can be found through a partitioning [29], [26], [31], [35], [54], [60], [15], [17] of response space into categories of qualitatively similar structures. We will, however, refine this and instead discuss the partitioning of a factored response space: a description that factors out extrinsic factors such as translation, orientation, and absolute grey value, and that generally provides a lower-dimensional and more invariant starting point than the full response space. As mentioned above, we focus on 2nd order structure in this work, and thus another aim of the paper is to describe and discuss a suitable factored response space and its properties for natural images (see Section 2).
In terms of categorising or partitioning the factored response space, we will investigate three separate approaches. The first is based on the idea of categories being defined as Voronoi cells around prototypes [12]. The second approach is inspired by Koenderink [29], [26], [31] and refined through Geometric Texton Theory [15], [17]. Here the category structure is induced through qualitative similarity of canonical representatives across response space. In the third and final approach, we consider stability, a desirable quality of any category system, and investigate the most stable category systems under simple image perturbations such as additive Gaussian noise. These ideas are presented in Section 3.
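To illustrate the first approach: categories defined as Voronoi cells around prototypes amount to assigning each response to its nearest prototype. The following is a minimal sketch, assuming Euclidean distance in the (factored) response space. The function name `voronoi_labels` and the example prototypes are hypothetical and are not taken from the paper.

```python
import numpy as np

def voronoi_labels(points, prototypes):
    """Assign each point to the index of its nearest prototype.

    points:     array of shape (n_points, n_dims)
    prototypes: array of shape (n_prototypes, n_dims)
    Euclidean distance is an assumption made for this sketch.
    """
    points = np.asarray(points, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # Squared distances between every point and every prototype.
    diff = points[:, None, :] - prototypes[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # The Voronoi cell of a prototype is the set of points closest to it.
    return d2.argmin(axis=1)

# Two hypothetical prototypes in a 2-D response space.
protos = [[0.0, 0.0], [1.0, 1.0]]
pts = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.4]]
print(voronoi_labels(pts, protos))  # [0 1 0]
```

Choosing a different metric changes the cell boundaries, so the resulting category system depends on the metric as well as on the prototypes.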
The output of V1 simple cells is modelled as the inner product of receptive field weighting functions (RFs) and the retinal irradiance [22]. Several models of V1 simple cell ensembles, such as Gabor filters, have been reported in the literature [6], [24].
In this work we prefer, however, the Gaussian Derivatives (DtGs) as an uncommitted model to measure local image structure. Several arguments favour the DtGs: in terms of biological relevance, DtGs up to 4th or 5th order provide a well-fitting set of models (a simple linear combination brings the DtGs into the ‘wave-train’ basis, which is very similar to Gabors [59], [30]). It was recently demonstrated [14] that DtGs provide as good a fit to reported Gabors as, e.g., filters obtained from ICA [2], [53], [23] or sparse coding [44] based approaches. Furthermore, the DtGs are well studied and have many attractive properties [25], [28], [10], [32], [14]. The real strength of DtGs is that the measured quantities are much easier to interpret than those of the corresponding Gabors. The set of local measurements of an image with DtGs up to some order n is referred to as the n-jet. The n-jet can be understood as the coefficients of a truncated Taylor series of the blurred image.
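The Taylor-series reading of the n-jet can be written out explicitly. In the sketch below, $I_\sigma$ denotes the image blurred at scale $\sigma$ and the expansion point is taken as the origin; both the symbol and the choice of origin are introduced here for illustration only.

```latex
I_\sigma(x, y) \;\approx\; \sum_{i + j \le n} \frac{x^i\, y^j}{i!\, j!}
  \left. \frac{\partial^{\,i+j} I_\sigma}{\partial x^i\, \partial y^j} \right|_{(0,0)}

% For n = 2, with all derivatives evaluated at the origin:
I_\sigma(x, y) \;\approx\; I_\sigma
  + x\, \partial_x I_\sigma + y\, \partial_y I_\sigma
  + \tfrac{1}{2} x^2\, \partial_{xx} I_\sigma
  + x y\, \partial_{xy} I_\sigma
  + \tfrac{1}{2} y^2\, \partial_{yy} I_\sigma
```

Up to the sign convention of the measurement, the six derivatives in the second expression are the entries of the 2-jet.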
The one-dimensional Gaussian kernel is given by $G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/(2\sigma^2)}$, where $\sigma$ is the standard deviation and controls the scale or blur at which the jet is calculated. Due to separability, the two-dimensional Gaussian kernel is easily constructed as $G_\sigma(x, y) = G_\sigma(x)\, G_\sigma(y)$. We will use $G_x$, $G_{xy}$, etc. to denote partial derivatives of the two-dimensional kernel and $c_{ij}$ to denote the inner product between an image patch $L$ and the suitable derivative of the Gaussian, i.e. $c_{ij} = \langle G_{x^i y^j}, L \rangle$. The n-jet is the vector of inner products up to total order of derivation n; the 2-jet is thus given by $(c_{00}, c_{10}, c_{01}, c_{20}, c_{11}, c_{02})$.
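The measurement described above can be sketched numerically as inner products between a patch and sampled Gaussian derivative kernels. This is a minimal sketch under stated assumptions: the kernels are sampled on the pixel grid, the patch is square with odd side length, and the names `gaussian_derivative_kernels`, `two_jet`, and the keys `c00` ... `c02` are chosen here for illustration, not taken from the paper.

```python
import numpy as np

def gaussian_derivative_kernels(sigma, radius):
    """Sampled 2-D Gaussian and its partial derivatives up to 2nd order."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    # x varies along columns, y along rows (patch[row, col] = patch[y, x]).
    x, y = np.meshgrid(ax, ax, indexing="xy")
    s2 = sigma ** 2
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * s2)) / (2.0 * np.pi * s2)
    return {
        "c00": g,                           # G
        "c10": -x / s2 * g,                 # G_x
        "c01": -y / s2 * g,                 # G_y
        "c20": (x ** 2 / s2 - 1) / s2 * g,  # G_xx
        "c11": x * y / s2 ** 2 * g,         # G_xy
        "c02": (y ** 2 / s2 - 1) / s2 * g,  # G_yy
    }

def two_jet(patch, sigma):
    """Inner products of a square, odd-sized patch with the kernels above."""
    patch = np.asarray(patch, dtype=float)
    rows, cols = patch.shape
    if rows != cols or rows % 2 == 0:
        raise ValueError("patch must be square with odd side length")
    kernels = gaussian_derivative_kernels(sigma, radius=rows // 2)
    # Plain inner product: for odd total order this is minus the derivative
    # of the blurred image at the patch centre, because the kernel is odd.
    return {name: float(np.sum(k * patch)) for name, k in kernels.items()}
```

The patch should span several standard deviations in each direction; otherwise the truncated kernels no longer approximate the Gaussian derivatives well.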
Intrinsic 2nd order structure and the shape index
The first focus of the paper is an investigation of ‘pure’ 2nd order structure in natural images. In the following, we argue that this is best studied through the shape index and present properties and novel statistics of this descriptor.
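This excerpt does not give the formulas for the descriptor, so the following is only a sketch based on the standard Koenderink and van Doorn definitions of shape index and curvedness, written in terms of the three second-order jet components. The argument names `c20`, `c11`, `c02` (the x-x, x-y, and y-y components), the sign convention (positive values for bright blob-like and ridge-like structure), and the absence of any scale normalisation are assumptions made here.

```python
import numpy as np

def shape_index(c20, c11, c02):
    """Shape index in [-1, 1]; sign convention is an assumption."""
    num = -(c20 + c02)
    den = np.sqrt(4.0 * c11 ** 2 + (c20 - c02) ** 2)
    # arctan2 handles den == 0 (umbilic points); at a perfectly flat point
    # (num == den == 0) the shape index is undefined and 0 is returned.
    return (2.0 / np.pi) * np.arctan2(num, den)

def curvedness(c20, c11, c02):
    """Root mean square of the two principal curvatures of the 2nd order part."""
    return np.sqrt(0.5 * (c20 ** 2 + 2.0 * c11 ** 2 + c02 ** 2))

def principal_direction(c20, c11, c02):
    """Angle (radians) of the eigenvector with the larger eigenvalue."""
    return 0.5 * np.arctan2(2.0 * c11, c20 - c02)

print(shape_index(-1.0, 0.0, -1.0))  # 1.0  (bright blob)
print(shape_index(1.0, 0.0, -1.0))   # 0.0  (saddle)
print(shape_index(-1.0, 0.0, 0.0))   # 0.5  (bright ridge)
```

The shape index is unchanged when all three components are multiplied by a positive constant or when the coordinate frame is rotated, while the curvedness carries the magnitude and the principal direction carries the orientation.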
Shape index category systems
The shape index constitutes a natural re-parameterisation of the factored jet-space and we now address the second main topic of this paper – an investigation of category systems (or partitionings of the factored response space) that naturally arise from the geometry and natural image statistics of the shape index descriptor. As described in the introduction, three different avenues will be investigated and the results related.
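One way to make the stability criterion concrete is to measure how often a patch keeps its category label after additive Gaussian noise. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: `label_stability`, `describe`, and `categorise` are hypothetical names, with `describe` standing for any map from a patch to a descriptor and `categorise` for any map from a descriptor to a label.

```python
import numpy as np

def label_stability(patches, describe, categorise, noise_sd, seed=0):
    """Fraction of patches whose label survives additive Gaussian noise.

    patches:    iterable of 2-D arrays
    describe:   callable, patch -> descriptor (e.g. a shape index value)
    categorise: callable, descriptor -> hashable label
    noise_sd:   standard deviation of the i.i.d. Gaussian pixel noise
    """
    rng = np.random.default_rng(seed)
    same = []
    for p in patches:
        p = np.asarray(p, dtype=float)
        clean_label = categorise(describe(p))
        noisy = p + rng.normal(0.0, noise_sd, size=p.shape)
        same.append(categorise(describe(noisy)) == clean_label)
    return float(np.mean(same))

# Toy usage: descriptor = mean grey value, two categories split at zero.
toy = [np.full((5, 5), 10.0), np.full((5, 5), -10.0)]
score = label_stability(toy, np.mean, lambda v: int(v > 0), noise_sd=0.01)
print(score)  # 1.0
```

Comparing this score across candidate category systems at a fixed noise level gives a simple ranking by stability.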
Discussion
The novel contributions of this paper are:
- Statistics for the shape index, curvedness, and principal direction for a large database of natural images, yielding results clearly distinct from those of Gaussian random noise controls. Furthermore, we have re-argued Koenderink’s point that the shape index is a natural, invariant, and decoupled re-parameterisation of 2nd order (image) fundamental structure. In combination with the curvedness and principal direction, we have a very intuitive triple
Acknowledgement
EPSRC-funded project ‘Basic Image Features’ EP/D030978/1.
References (61)
- et al., The “independent components” of natural scenes are edge filters, Vision Research (1997)
- et al., Natural image profiles are most likely to be step edges, Vision Research (2004)
- et al., A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images, Vision Research (2001)
- et al., Receptive field assembly specificity, Journal of Visual Communication and Image Representation (1992)
- et al., Nonlinear total variation based noise removal algorithms, Physica D (1992)
- Characteristics and models of human symmetry detection, Trends in Cognitive Sciences (1997)
- J. Aghajanian, Patch-based face recognition using latent identity variables, in: The Inaugural BMVA Students Paper...
- A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence (1986)
- V. Caselles, R. Kimmel, G. Sapiro, Geodesic active contours, in: ICCV95, 1995, pp....
- F. Casorati, Nuova definitione della curvatura delle superficie e suo confronto con quella di gauss, Rend. Maem. Accd....
- Uncertainty relations for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters, Journal of the Optical Society of America A
- Shape particle filtering for image segmentation
- Relations between the statistics of natural images and the response properties of cortical cells, Journal of the Optical Society of America
- Cartesian differential invariants in scale-space, Journal of Mathematical Imaging and Vision
- Locating articular cartilage in MR images, Proceedings of SPIE
- Conceptual Spaces: The Geometry of Thought
- Image features and the 1-d, 2nd order gaussian derivative jet, Proceedings of Scale Space
- The 2nd order local-image-structure solid, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Hypotheses for image features, icons and textons, International Journal of Computer Vision
- Feature category systems for 2nd order local image structure induced by natural image statistics and otherwise, Proceedings of SPIE
- Receptive fields and functional architecture of monkey striate cortex, Journal of Physiology
- The two-dimensional spatial structure of simple receptive fields in cat striate cortex, Journal of Neurophysiology
- The structure of images, Biological Cybernetics
- What is a feature?, Journal of Intelligent Systems