Relaxation by Hopfield network in stereo image matching

doi:10.1016/S0031-3203(97)00069-1

Pattern Recognition

Volume 31, Issue 5, 1 March 1998, Pages 561-574

https://doi.org/10.1016/S0031-3203(97)00069-1

Abstract

This paper outlines a relaxation approach using the Hopfield neural network for solving the global stereovision matching problem. The primitives used are edge segments. The similarity, smoothness and uniqueness constraints are transformed into the form of an energy function whose minimum value corresponds to the best solution of the problem. We combine two methods, (a) optimization/relaxation [1] and (b) relaxation merit [2], with the above three constraints mapped into an energy function. The main contribution is made (1) by applying a learning strategy in the similarity constraint and (2) by introducing specific conditions to overcome the violation of the smoothness constraint and to avoid the serious problem arising from the required fixation of a disparity limit. In this way the stereovision matching process is improved. The better performance of the proposed method is illustrated by a comparative analysis against a classical relaxation method.

Introduction

A major portion of the research efforts of the computer vision community has been directed towards the study of the three-dimensional (3-D) structure of objects using machine analysis of images [3, 4, 5]. Analysis of video images in stereo has emerged as an important passive method for extracting the 3-D structure of a scene.

Following the terminology of Barnard and Fischler [6], we can view the problem of stereo analysis as consisting of the following steps: image acquisition, camera modeling, feature acquisition, image matching, depth determination and interpolation. The key step is that of image matching, that is, the process of identifying the corresponding points in two images that are cast by the same physical point in 3-D space. This paper is devoted solely to this problem.

The basic principle involved in the recovery of depth using passive imaging is triangulation. In stereopsis the triangulation needs to be achieved with the help of only the existing environmental illumination. Hence, a correspondence needs to be established between features from two images that correspond to some physical feature in space. Then, provided the positions of the centers of projection, the effective focal length, the orientation of the optical axis, and the sampling interval of each camera are known, the depth can be established using triangulation [7].
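As a minimal sketch of the triangulation step, assume the simplest geometry: two identical cameras with parallel optical axes separated by a horizontal baseline. Under that assumption the depth of a matched feature follows directly from its disparity. The function and parameter names below are illustrative and do not come from the paper.

```python
def depth_from_disparity(disparity_px, focal_length_mm, pixel_size_mm, baseline_mm):
    """Depth (in mm) of a matched feature for two parallel, identical cameras.

    Assumed geometry (not taken from the paper): parallel optical axes,
    equal focal lengths, horizontal baseline between the centers of projection.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    # Express the focal length in pixels using the sampling interval.
    focal_length_px = focal_length_mm / pixel_size_mm
    # Similar triangles: depth = focal length * baseline / disparity.
    return focal_length_px * baseline_mm / disparity_px


# Example with made-up camera parameters.
depth_mm = depth_from_disparity(disparity_px=20, focal_length_mm=8.0,
                                pixel_size_mm=0.01, baseline_mm=100.0)
```

The relation shows why matching is the key step: once the disparity of a pair is known, the remaining quantities are fixed camera parameters.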

A review of the state of the art of stereovision matching allows us to distinguish two sorts of techniques broadly used in this discipline: area-based and feature-based [2, 4, 8]. Area-based stereo techniques use correlation between brightness (intensity) patterns in the local neighborhood of a pixel in one image with brightness patterns in the local neighborhood of the other image [9, 10, 11, 12, 13], where the number of possible matches becomes high, while feature-based methods use sets of pixels with similar attributes, normally either pixels belonging to edges [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25] or the corresponding edges themselves [2, 5, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]. These latter methods lead to a sparse depth map only, leaving the rest of the surface to be reconstructed by interpolation; but they are faster than area-based methods, because there are many fewer points (features) to be considered [4]. We select a feature-based method as justified below.

Due to the nature of the stereovision system, there are extrinsic and intrinsic factors affecting it: (a) extrinsic: in a practical stereovision system, the left and right images are obtained at different positions/angles; (b) intrinsic: the stereovision system is equipped with two different physical cameras (i.e. with different components), which are always placed at the same relative position (left and right), so a systematic noise appears for each one.

As a result of the above-mentioned factors, the corresponding features in both images may display different values. This may lead to incorrect matches. Thus, it is very important to find features in both images which are unique or independent of possible variation in the images [37]. This paper uses a feature-based method in which edge segments are matched, because our experiments have been carried out in an artificial environment where edge segments are abundant. Such features have been studied in terms of reliability [8, 38] and robustness [37] and, as mentioned before, have also been used in previous stereovision matching works. This justifies our choice of features, although they may be too local. Four average attribute values (gradient magnitude, gradient direction, variance and Laplacian) are computed for each edge segment, as we will see later.

Our stereo correspondence problem can be defined in terms of finding pairs of true matches, namely, pairs of edge segments in two images that are generated by the same physical edge segment in space. These true matches satisfy some competing constraints, generally three, proposed by Marr and Poggio [21, 22]: (1) uniqueness, each edge segment in an image should be matched to a unique edge segment in the other image; (2) similarity, matched edge segments have similar local properties or attributes; (3) smoothness, disparity values in a given neighborhood change smoothly, except at a few depth discontinuities.

We now introduce the important concept of disparity. Assume a stereo pair of edge segments, where one segment belongs to the left image and the other to the right one. If we superpose the right image over the left one, for example, the two edge segments of the stereo pair appear horizontally displaced. Following horizontal (epipolar) lines from top to bottom in the superimposed images, we determine the common length of the two edge segments from the parallelogram formed by the set of epipolar lines that contain both a point of the left edge segment and a point of the right one. We compute the disparity value for the pair of edge segments as the average displacement between points belonging to both edge segments along the common length.
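The disparity definition above can be sketched as follows, under simplifying assumptions that are ours rather than the paper's: each edge segment is a straight line given by two (x, y) endpoints, epipolar lines are the integer image rows, and the sign convention is left x minus right x. The function names are hypothetical.

```python
import math


def x_at_row(segment, y):
    """x coordinate where a straight segment crosses the horizontal line at row y."""
    (x0, y0), (x1, y1) = segment
    if y0 == y1:
        raise ValueError("a horizontal segment has no unique x on an epipolar line")
    t = (y - y0) / (y1 - y0)
    return x0 + t * (x1 - x0)


def segment_disparity(left, right):
    """Average horizontal displacement over the rows shared by both segments.

    Returns None when the two segments share no epipolar line.
    """
    # Rows covered by both segments (the "common length").
    top = max(min(left[0][1], left[1][1]), min(right[0][1], right[1][1]))
    bottom = min(max(left[0][1], left[1][1]), max(right[0][1], right[1][1]))
    rows = range(math.ceil(top), math.floor(bottom) + 1)
    if len(rows) == 0:
        return None
    # Assumed sign convention: left x minus right x on each shared row.
    shifts = [x_at_row(left, y) - x_at_row(right, y) for y in rows]
    return sum(shifts) / len(shifts)


# Two made-up segments that overlap on rows 15 to 30.
left_segment = ((100.0, 10.0), (110.0, 30.0))
right_segment = ((90.0, 15.0), (95.0, 35.0))
disparity = segment_disparity(left_segment, right_segment)
```

Averaging over the shared rows only means that the parts of a segment with no counterpart on the same epipolar line do not influence the disparity value.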

Given the constraints introduced above, the similarity and smoothness constraints are associated with local and global matching processes, respectively. The major difficulty of stereo processing arises from the need to make global correspondences. A local edge segment in one image may match equally well with a number of edge segments in the other image (this problem is compounded by the fact that the local matches are not perfect due to the above-mentioned extrinsic and intrinsic factors). These ambiguities in local matches can only be resolved by considering sets of local matches globally. Hence, to make global correspondences given a pair of edge segments, we consider a set of neighboring edge segments, where a bound on the disparity range defines the neighborhood. Relaxation is a technique commonly used to find the best matches globally. It refers to any computational mechanism that employs a set of locally interacting parallel processes, one associated with each image unit, that iteratively update each unit's current labeling in order to achieve a globally consistent interpretation of the image data [39, 40].

We next review relaxation approaches in stereovision matching. Relaxation labeling is a technique (proposed by Rosenfeld et al. [4, 5, 11]) developed to deal with uncertainty in sensory data interpretation systems and to find the best matches. It uses contextual information as an aid in classifying a set of interdependent objects by allowing interactions among the possible classifications of related objects. In the stereo paradigm the problem involves assigning unique labels (or matches) to a set of features in an image from a given list of possible matches. So, the goal is to assign each feature (edge segment) a value corresponding to disparity in a manner consistent with certain predefined constraints.

Two main relaxation processes can be distinguished in stereo correspondence: optimization based and probabilistic/merit based. In the optimization-based processes, stereo correspondence is carried out by minimizing an energy function which is formulated from the applicable constraints. It represents a mechanism for the propagation of constraints among neighboring match features for the removal of ambiguity of multiple stereo matches in an iterative manner. The optimal solution is the ground state, that is, the state (or states) of the lowest energy. In the probabilistic/merit-based processes, the initial probabilities/merits, established from a local stereo correspondence process and computed from similarity in the feature values, are updated iteratively depending on the matching probabilities/merits of neighboring features and also on the applicable constraints. The following papers use a relaxation technique: (a) probabilistic/merit [5, 13, 16, 17, 20, 21, 24, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56]; (b) optimization through a Hopfield neural network [11, 13, 23, 28, 57]. We use (b).
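For orientation, the energy minimized by a discrete Hopfield network is conventionally written in the general form below. The symbols T_ij (symmetric connection weight between neurons i and j), v_i (state of neuron i) and I_i (external input to neuron i) follow the usual Hopfield notation and are assumptions here; the specific weights that encode the similarity, smoothness and uniqueness constraints in this paper are not reproduced in this excerpt.

```latex
E = -\frac{1}{2} \sum_{i} \sum_{j \neq i} T_{ij}\, v_i\, v_j \;-\; \sum_{i} I_i\, v_i
```

In a matching formulation of this kind, each neuron would typically stand for one candidate pair of features, so that lowering E corresponds to switching on mutually consistent matches and switching off conflicting ones.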

These systems impose global constraints, i.e. similarity, uniqueness and smoothness. The smoothness constraint requires, as stated, fixing a bound on the disparity range for any feature. As a result, they give good performance on the images shown. In our experiments, we have found complex images with highly repetitive structures (features), or with objects very close to the cameras and occlusions, that violate the smoothness constraint. Hence, setting a disparity limit is a difficult task, because the global correspondence for a given pair of features is based on how well their neighboring pairs of features match. The neighborhood is a concept strongly related to the disparity limit. Therefore, global correspondence for complex images is the difficult issue with which this paper is mainly concerned. We propose an approach for solving the edge-segment stereovision matching problem in which we make use of existing global matching strategies, where similarity, uniqueness and continuity are applied as a result of the combination of two methods, following a very important conclusion of Pavlidis [58]: (1) the Yu and Tsai (YT) [1] optimization relaxation process, implemented by a Hopfield neural network; (2) the Minimum Differential Disparity (MDD) algorithm of Medioni and Nevatia [2]. The main contribution of this paper is the introduction of specific conditions during the relaxation process (1) to overcome the violation of the smoothness constraint and (2) to avoid the problem arising from the fixation of the disparity limit. We have also applied a learning strategy in the similarity constraint. So, our method works properly even if the images are complex.
We have chosen the Hopfield neural network because the convergence of the relaxation process is guaranteed and because it has been used with success in previous stereovision matching works [11, 13, 23, 25, 28, 57]. Additionally, this computational model has a massively parallel execution capability. As we are only concerned with the effectiveness of the method (from the viewpoint of the percentage of matching successes), this possibility is not implemented in our work.

The paper is organized as follows: in Section 2 the stereovision optimization relaxation matching scheme and its relationship to Hopfield neural networks are described. In Section 3 we summarize the proposed matching procedure. The performance of the method is illustrated in Section 4, where a comparative study against an existing global relaxation matching method is carried out. The necessity of a relaxation matching strategy against a local matching technique is also made clear. Finally, in Section 5, there is a discussion of some related topics.

Section snippets

Stereovision relaxation matching by the Hopfield neural network

As mentioned before, this paper proposes an optimization relaxation technique which is a synthesis of two existing methods, YT and MDD, where the presence of repetitive structures is taken into account and discontinuities in the disparity values are allowed (violation of the smoothness constraint). This last idea is based upon the work of Dhond and Aggarwal [59], which deals with narrow near objects that produce occlusion problems.

From YT we take the optimization relaxation scheme and the Hopfield

Summary of the matching procedure

After mapping the energy function onto the Hopfield neural network, the correspondence process is achieved by letting the network evolve until it reaches a stable state, i.e. when no change occurs in the states of its neurons during the updating procedure.
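The stopping criterion just described can be sketched with a generic binary Hopfield network updated one neuron at a time. This is an illustration under our own assumptions (states in {0, 1}, a hard threshold at zero, a fixed update order, and hypothetical toy weights), not the paper's network or its energy mapping.

```python
def run_hopfield(weights, bias, state, max_sweeps=100):
    """Asynchronously update a binary Hopfield network until no neuron changes.

    weights: symmetric matrix (list of lists) with zero diagonal.
    bias:    external input per neuron.
    Returns (final_state, reached_stable_state).
    """
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Net input from all other neurons plus the external input.
            net = sum(weights[i][j] * state[j] for j in range(n) if j != i) + bias[i]
            new_value = 1 if net > 0 else 0
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            # A full sweep without any change: the network is stable.
            return state, True
    return state, False


# Toy example: neurons 0 and 1 are competing candidate matches (inhibitory link),
# while neurons 0 and 2 are mutually supporting matches (excitatory link).
toy_weights = [[0.0, -2.0, 1.0],
               [-2.0, 0.0, 0.0],
               [1.0, 0.0, 0.0]]
toy_bias = [0.6, 0.5, 0.4]
final_state, stable = run_hopfield(toy_weights, toy_bias, [0, 0, 0])
```

With symmetric weights and no self-connections, each single-neuron update of this kind cannot increase the network energy, which is why the loop settles into a stable state. In the toy run, neuron 0 switches on first, suppresses its competitor and reinforces its supporter.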

The whole matching procedure can be summarized as follows (according to the steps provided in Ruichek and Postaire [25]):

(A) Neural network topology
(1) The network is organized with a set of neurons representing pairs of edge segments (from

Design of a test strategy

In order to assess the validity and performance of the proposed method (i.e. how well our approach works on images with and without additional complexity), we have selected 30 stereo pairs of realistic stereo images from an indoor environment (each pair consists of the original left and right images and the corresponding left and right images of labeled edge segments). All tested images are 512×512 pixels in size, with 256 gray levels. The two cameras have equal focal lengths and are situated in a parallel

Concluding remarks

An approach to stereo correspondence using the Hopfield neural network is presented. The stereo correspondence problem is formulated as an optimization task in which an energy function, representing the mapping of three constraints (similarity, smoothness and uniqueness) on the solution, is minimized by a Hopfield neural network. The similarity constraint is doubly applied: (a) during the computation of the local matching probability (see Section 2.2); and (b) during the mapping of the

About the Author—G. PAJARES received the B.S. and Ph.D. degrees in Physics from UNED (Distance University of Spain) in 1987 and 1995, respectively, with a thesis on the application of pattern recognition techniques to stereovision. From 1990 he worked at ENOSA in critical software development. He joined the Complutense University in 1995 as an Associate Professor in Robotics. His current research interests include robotics vision systems and applications of automatic control to robotics.

References (71)

S.S. Yu et al., Relaxation by the Hopfield neural network, Pattern Recognition (1992)
G. Medioni et al., Segment based stereo matching, Comput. Vision Graphics Image Process. (1985)
D.H. Kim et al., Analysis of quantization error in line-based stereo matching, Pattern Recognition (1994)
J.J. Lee et al., Stereo correspondence using the Hopfield neural network of a new energy function, Pattern Recognition (1994)
J.M. Cruz et al., A neural network approach to the stereovision correspondence problem by unsupervised learning, Neural Networks (1995)
J.M. Cruz et al., Stereo matching technique based on the perceptron criterion function, Pattern Recognition Lett. (1995)
D.H. Kim et al., Stereo matching technique based on the theory of possibility, Pattern Recognition Lett. (1992)
T.M. Breuel, Finding lines under bounded error, Pattern Recognition (1996)
S. Lloyd et al., A parallel binocular stereo algorithm utilizing dynamic programming and relaxation labelling, Comput. Vision Graphics Image Process. (1987)
S. Ranade et al., Point pattern matching by relaxation, Pattern Recognition (1980)
C.Y. Wang et al., Some experiments in relaxation image matching using corner features, Pattern Recognition (1983)
T. Pavlidis, Why progress in machine vision is so slow, Pattern Recognition Lett. (1992)
J.G. Leu et al., Detecting the dislocations in metal crystals from microscopic images, Pattern Recognition (1991)
R. Nevatia et al., Linear feature extraction and description, Comput. Vision Graphics Image Process. (1980)
A.R. Dhond et al., Structure from stereo - a review, IEEE Trans. Systems Man Cybernet. (1989)
T. Ozanian, Approaches for stereo matching - A review, Modeling Identification Control (1995)
G. Pajares, Estrategia de solucion al problema de la correspondencia en vision estereoscopica por la jerarquía metodologica y la integracion de criterios, Ph.D. thesis (1995)
S. Barnard et al., Computational stereo, ACM Comput. Surveys (1982)
K.S. Fu et al., Robótica: Control, detección, visión e inteligencia (1988)
H. H. Baker, Building and using scene representations in image understanding, AGARD-LS-185 Machine Perception 3.1–3.11...
P. Fua, A parallel algorithm that produces dense depth maps and preserves image features, Machine Vision Appl. (1993)
Y. Shirai, Three-Dimensional Computer Vision (1983)
Y. Zhou et al., Artificial Neural Networks for Computer Vision (1992)
W.E.L. Grimson, Computational experiments with a feature-based stereo algorithm, IEEE Trans. Pattern Anal. Mach. Intell. (1985)
A. Khotanzad et al., Stereopsis by constraint learning feed-forward neural networks, IEEE Trans. Neural Networks (1993)
Y.C. Kim et al., Positioning three-dimensional objects using stereo images, IEEE J. Robotics Automat. (1987)
V.R. Lutsiv et al., On the use of a neurocomputer for stereoimage processing, Pattern Recognition Image Anal. (1992)
D. Maravall and E. Fernandez, Contribution to the matching problem in stereovision, Proc. 11th IAPR: Internat. Conf....
D. Marr, La Vision (1985)
D. Marr, Vision (1982)
D. Marr et al., A computational theory of human stereovision, Proc. Roy. Society of London (1979)
D. Marr et al., Cooperative computation of stereo disparity, Science (1976)
M.S. Mousavi et al., ANN implementation of stereo vision using a multi-layer feedback architecture, IEEE Trans. Systems Man Cybernet. (1994)
P. Rubio, RP: un algoritmo eficiente para la búsqueda de correspondencias en visión estereoscópica, Informática y Automática (1993)
Y. Ruichek et al., A neural network algorithm for 3-D reconstruction from stereo pairs of linear images, Pattern Recognition Lett. (1996)


About the Author—J.M. CRUZ received the M.Sc. degree in Physics and the Ph.D. degree from the Complutense University in 1979 and 1984, respectively. From 1985 to 1990 he was with the Department of Automatic Control, UNED (Distance University of Spain), and from October 1990 to 1992 with the Department of Electronics, University of Santander. In October 1992, he joined the Department of Computer Science and Automatic Control of the Complutense University, where he is a Professor. His current research interests include robotics vision systems, fusion sensors and applications of automatic control to robotics and flight control.

About the Author—J. ARANDA received the B.S. and M.S. degrees in Physics from the Complutense University of Madrid (1983) and the Ph.D. degree from the UNED (Distance University of Spain) in 1989. From 1985 to 1987 he was a Teaching Assistant in the Department of Automatic Control and Computer Science at the Complutense University. He joined the UNED in 1987 as an Assistant Professor and became an Associate Professor in 1991. His current research interests include robotics vision systems, fusion sensors and applications of automatic control to robotics and flight control.
