Elsevier

Information Fusion

Volume 4, Issue 4, December 2003, Pages 247-258

Fusion of FLIR automatic target recognition algorithms

https://doi.org/10.1016/S1566-2535(03)00043-5

Abstract

In this paper, we investigate several fusion techniques for designing a composite classifier to improve the performance (probability of correct classification) of forward-looking infrared (FLIR) automatic target recognition (ATR). The motivation behind the fusion of ATR algorithms is that if each contributing technique in a fusion algorithm (composite classifier) emphasizes learning at least some features of the targets that are not learned by the other contributing techniques, a fusion of ATR algorithms may improve the overall probability of correct classification of the composite classifier. In this research, we propose to use four ATR algorithms for fusion. The individual probability of correct classification of the four contributing algorithms ranges from 73.5% to about 77% on the testing set. The set of targets correctly classified by each contributing algorithm usually has a substantial overlap with the sets correctly identified by the other algorithms (over 50% for the four algorithms used in this research). There is also a significant portion of the correctly identified targets that is not shared by all contributing algorithms; the size of this subset generally determines the extent of the improvement that may result from fusing the ATR algorithms. We propose to use averaged Bayes classifier, committee of experts, stacked-generalization, winner-takes-all, and ranking-based fusion techniques for designing the composite classifiers. The experimental results show an improvement of more than 6.5% over the best individual performance.

Introduction

Automatic target recognition (ATR) systems generally consist of three stages, as shown in Fig. 1 [1]: (1) a preprocessing (target detection) stage that operates on the entire image and extracts regions containing potential targets, (2) a clutter rejection stage that uses a sophisticated classification technique to identify true targets by discarding the clutter images (false alarms) from the potential target images provided by the detection stage, and (3) a classification stage that determines the type of the target.
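As a rough illustration of this three-stage flow, a minimal sketch follows. Every detail here is a hypothetical stand-in (a brightness threshold for detection, a local-energy test for clutter rejection, a nearest-template classifier), not the paper's actual algorithms:

```python
import numpy as np

def detect_regions(image, threshold=0.8):
    """Stage 1 (detection): return pixel coordinates of candidate targets,
    here simply all pixels brighter than a threshold."""
    ys, xs = np.where(image > threshold)
    return list(zip(ys.tolist(), xs.tolist()))

def reject_clutter(image, regions, min_energy=2.0):
    """Stage 2 (clutter rejection): keep candidates whose local 3x3
    neighborhood has enough total energy; the rest are treated as clutter."""
    padded = np.pad(image, 1)
    return [(r, c) for r, c in regions
            if padded[r:r + 3, c:c + 3].sum() >= min_energy]

def classify_target(chip, templates):
    """Stage 3 (classification): nearest-template label in L2 distance."""
    return min(templates, key=lambda lab: np.linalg.norm(chip - templates[lab]))

# Toy run: a single bright two-pixel "target" in a dark frame.
frame = np.zeros((5, 5))
frame[2, 2] = frame[2, 3] = 1.0
candidates = detect_regions(frame)           # [(2, 2), (2, 3)]
targets = reject_clutter(frame, candidates)  # both survive: local energy 2.0
label = classify_target(frame, {"bright": frame, "dark": np.zeros((5, 5))})
```

Each stage narrows the data passed to the next, which is why the expensive classifier only ever sees small chips rather than the full image.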

ATR using forward-looking infrared (FLIR) imagery is an integral part of the ongoing research at the US Army Research Laboratory (ARL) for digitization of the battlefield. Real-life FLIR imagery (for example the database available at ARL, see Fig. 2, Fig. 3) exhibits a high level of variability in target thermal signatures, caused by several factors, including meteorological conditions, time of day, location, and range. This highly unpredictable nature of thermal signatures makes FLIR ATR a very challenging problem. In recent years a number of FLIR ATR algorithms have been developed by scientists at ARL as well as by researchers in academia and industry working under ARL-sponsored research. These research activities have used a common development set of FLIR data (17,318 target images). The performance of these independently developed algorithms is measured in terms of the probability of correct classification on a common testing FLIR data set collected under relatively unfavorable conditions; the testing data is not used during algorithm development. The performance of these algorithms appears to have plateaued at around 77% probability of correct classification.

In this paper, we investigate several fusion techniques for designing a composite classifier to improve the performance (probability of correct classification) of FLIR ATR. The motivation behind the fusion of ATR algorithms is that if each contributing technique in a fusion algorithm (composite classifier) emphasizes learning at least some features of the targets that are not learned by the other contributing techniques, a fusion of ATR algorithms may improve the overall probability of correct classification of the composite classifier. In this research, we propose to use four ATR algorithms for fusion, with the individual probability of correct classification of the four contributing algorithms ranging from 73.5% to about 77% on the testing set (ROI data set with 3456 target images).
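This headroom argument can be made concrete with a small simulation (synthetic correctness masks, not the paper's data): the "oracle" rate, i.e. the fraction of targets that at least one contributing classifier gets right, bounds what any fusion rule could achieve.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
# Four simulated classifiers, each independently correct ~75% of the time.
correct = rng.random((4, n)) < 0.75

individual = correct.mean(axis=1)    # per-classifier accuracy, each near 0.75
oracle = correct.any(axis=0).mean()  # correct for at least one classifier
# With fully independent errors the oracle is near 1 - 0.25**4 ~ 0.996; in
# practice errors overlap heavily (>50% per the paper), so real headroom is
# smaller, but still well above the best individual accuracy.
```

The gap between `oracle` and `max(individual)` is exactly the subset of targets, described above, that is not shared by all contributing algorithms.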

The first algorithm uses a multi-layer convolution neural network (MLCNN) [2] for designing the ATR system [3], [4]. An MLCNN is a variation of the time-delay neural network (TDNN), which has been applied to speech and phoneme recognition [5]; MLCNNs have also been used for handwritten word recognition [6]. An MLCNN is a multi-layer feed-forward structure with three different kinds of layers: a convolution layer, a subsampling layer, and a summing layer that partitions the results of the convolution operations and sums the results within each block. A nonlinear squashing function is usually applied to the outputs of each kind of layer. For the first convolution layer, each convolution operation can be interpreted as extracting the same feature in different parts of the input. The final output of the MLCNN gives scores for the classification of the tested target chip (image).
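The layer types just described can be sketched as plain array operations. This is a single feature map with made-up shapes and a trivial kernel, not the paper's trained network:

```python
import numpy as np

def conv2d_valid(x, k):
    """Convolution layer: slide one kernel over the input, extracting the
    same feature at every spatial location (one feature map, 'valid' mode)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def subsample(x, s=2):
    """Subsampling layer: partition into non-overlapping s x s blocks and
    average them (a summing layer would instead sum within each block)."""
    h, w = (x.shape[0] // s) * s, (x.shape[1] // s) * s
    return x[:h, :w].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def squash(x):
    """Nonlinear squashing function applied to each layer's outputs."""
    return np.tanh(x)

chip = np.ones((8, 8))  # toy 8x8 target chip
fmap = squash(subsample(conv2d_valid(chip, np.ones((3, 3)) / 9.0)))
```

Stacking several such convolution/subsampling pairs and ending with class scores gives the feed-forward structure the paragraph describes.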

The next classification algorithm is based on the learning vector quantization (LVQ) algorithm. The LVQ-based classifier consists of four stages [7]: a set of aspect windows of different sizes, a stage in which the extracted area is enlarged to a fixed size, a stage for wavelet decomposition of the enlarged extraction, and a dedicated VQ for each subband within each aspect window. In the first stage, each aspect window is a background-clipping rectangle whose size is determined by the type of target and the range of aspects it operates on. After the background removal in the first stage, each extracted target area is enlarged in the second stage to a fixed dimension common to all the aspect windows. In the third stage, the enlarged extraction is decomposed into four subbands by a wavelet decomposition process. After the wavelet decomposition, the final stage uses a set of VQ codebooks for feature matching and target recognition. In this stage, the target recognition problem is treated as a template-matching task through a similarity-metric-based approach. A set of subband templates, or code vectors, is constructed for each subband of a particular target at a specific range of aspects. Each set of code vectors forms a codebook, representing the target signatures for a given subband of a particular target at a specific range of aspects.
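The final template-matching stage can be illustrated with toy codebooks. The labels and 2-D features below are made up; the paper's code vectors live in wavelet-subband space, one codebook per subband and aspect range:

```python
import numpy as np

def nearest_code(feature, codebook):
    """Distance from a feature vector to its closest code vector."""
    return float(np.linalg.norm(codebook - feature, axis=1).min())

def vq_classify(feature, codebooks):
    """codebooks: {target_label: (n_codes, dim) array}. Choose the label
    whose codebook contains the best-matching code vector."""
    return min(codebooks, key=lambda lab: nearest_code(feature, codebooks[lab]))

codebooks = {
    "tank":  np.array([[0.0, 0.0], [1.0, 1.0]]),  # hypothetical signatures
    "truck": np.array([[5.0, 5.0]]),
}
label = vq_classify(np.array([0.9, 1.1]), codebooks)  # nearest: tank's (1, 1)
```

Each codebook thus acts as a bank of templates, and classification reduces to finding the codebook with the best-matching template under the similarity metric.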

The third classification algorithm used in the fusion is based on a modular neural network (MNN) approach [8] in which the classification of targets is realized hierarchically [9]. The local structure of a target image is captured by extracting directional variance features at different resolutions in small image blocks. These image blocks are organized, by location, into several larger image regions called receptive fields. An individual neural network (expert network) in the modular network is designed to classify targets using only the local features in a single receptive field. At the final level, the classification results of the expert networks for the individual receptive fields are combined to produce the final classification decision.
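A minimal sketch of this modular structure follows. The 2x2 grid of receptive fields, the linear-plus-softmax "experts", and the averaging combiner are all illustrative stand-ins; the paper's experts are trained networks operating on directional-variance features:

```python
import numpy as np

def receptive_fields(chip, n=2):
    """Partition a chip into an n x n grid of larger regions."""
    h, w = chip.shape[0] // n, chip.shape[1] // n
    return [chip[i*h:(i+1)*h, j*w:(j+1)*w]
            for i in range(n) for j in range(n)]

def expert(field, weights):
    """One expert network: class posteriors from one receptive field
    (a linear map + softmax stands in for a trained neural network)."""
    z = weights @ field.ravel()
    e = np.exp(z - z.max())
    return e / e.sum()

def modular_classify(chip, weights_per_field):
    """Final level: combine the experts' posteriors (simple average here)."""
    fields = receptive_fields(chip, n=2)
    probs = np.mean([expert(f, w) for f, w in zip(fields, weights_per_field)],
                    axis=0)
    return int(np.argmax(probs))

chip = np.ones((4, 4))
w = np.zeros((3, 4)); w[0] = 1.0  # toy weights: every expert favors class 0
decision = modular_classify(chip, [w] * 4)
```

The point of the hierarchy is that each expert only ever sees its own field's features, and disagreement among experts is resolved only at the final combination level.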

The fourth technique uses expansion matching (EXM) filtering and the Karhunen–Loeve transform (KLT) for feature extraction [10], [11]. The extracted features (feature vectors) are then used to train support vector machines (SVMs), which finally identify the class of the input target image. In this research, we propose to use averaged Bayes classifier, committee of experts, stacked-generalization, winner-takes-all, and ranking-based fusion techniques for designing the composite classifiers. The experimental results show an improvement of more than 6.5% over the best individual performance.
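The KLT feature-extraction step amounts to projecting vectorized chips onto the top eigenvectors of their sample covariance. A sketch on random data (the EXM filtering and SVM training stages are omitted, and the dimensions are arbitrary):

```python
import numpy as np

def klt_basis(X, k):
    """Karhunen-Loeve transform: eigenvectors of the sample covariance,
    ordered by decreasing eigenvalue; keep the top k as the feature basis."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(Xc.T @ Xc / (len(X) - 1))
    order = np.argsort(vals)[::-1]
    return vecs[:, order[:k]]

def klt_features(X, basis, mean):
    """Project mean-centered data onto the KLT basis; these low-dimensional
    feature vectors are what would then train the SVMs."""
    return (X - mean) @ basis

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X[:, 0] *= 10.0                 # dominant variance along the first axis
basis = klt_basis(X, k=2)       # top eigenvector aligns with that axis
features = klt_features(X, basis, X.mean(axis=0))
```

Keeping only the top-k components discards low-variance directions, which both compresses the chips and suppresses noise before classification.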

This paper is organized as follows. Section 2 presents the several fusion techniques proposed in this paper. Section 3 presents the experimental results, and Section 4 concludes the paper with a discussion of future research directions.

Section snippets

Composite classifiers

Several approaches have been used in previous research for improving the generalization performance of a classification system. For example, the boosting and committee-of-experts techniques have been used successfully in character recognition applications for improving generalization performance [12]. These approaches generally require that a number of experts be trained on subsets of the training data, where these subsets could be disjoint as well as overlapping. These approaches may be
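Given each contributing classifier's class-posterior scores, several of the fusion rules named in this paper reduce to a few lines. The 3-class scores below are toy values; stacked generalization, which trains a second-level learner on the contributing classifiers' outputs, is omitted:

```python
import numpy as np

def averaged_bayes(posteriors):
    """Averaged Bayes: mean the classifiers' posteriors, then argmax."""
    return int(np.argmax(posteriors.mean(axis=0)))

def winner_takes_all(posteriors):
    """Trust only the single most confident classifier."""
    winner = int(np.argmax(posteriors.max(axis=1)))
    return int(np.argmax(posteriors[winner]))

def ranking_fusion(posteriors):
    """Ranking-based (Borda-style): classifiers vote with class ranks."""
    ranks = np.argsort(np.argsort(posteriors, axis=1), axis=1)  # 0 = worst
    return int(np.argmax(ranks.sum(axis=0)))

# Three classifiers x three classes; they disagree on the top class.
p = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.2, 0.5, 0.3]])
```

On these scores the averaged and ranking rules pick class 1 while winner-takes-all picks class 0; such disagreements are exactly why the different rules must be compared empirically.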

Experimental results

In all experiments performed, we have used a total of 17,318 target images from the US Army Comanche data set as our development set. The development set consists of two databases: (1) SIG, which was collected under favorable conditions and has 13,862 target images (10 target types), and (2) ROI, which was collected under less favorable conditions and has 3456 target images (five target types). Specifically, this

Conclusions

In this paper, we have investigated six different fusion techniques and demonstrated that the performance of an ATR system can be improved by using a fusion algorithm. We obtained about 6.5% improvement over the best performing contributing classifier, which is a significant improvement considering the difficult testing data used in the experiments performed in this paper. An interesting observation was that the stacked generalization technique did not perform better than the averaged Bayes

Acknowledgements

This research was in part supported by Army Research Office Grant #DAAD19-001-0533 and PSC-CUNY Grant #65742-00-34.

References (21)

  • D.H. Wolpert, Stacked generalization, Neural Networks (1992)
  • B. Bhanu, Automatic target recognition: state of the art survey, IEEE Trans. Aerospace Electron. Syst. (1986)
  • Y. Le Cun, Generalization and network design strategies
  • V. Mirelli, S.A. Rizvi, Automatic target recognition using a multi-layer convolution neural network, in: Proceedings of...
  • S.A. Rizvi, A. Chen, V. Mirelli, N.M. Nasrabadi, L.-C. Wang, S. Der, M. Hamilton, Mixture-of-expert approach to target...
  • A. Waibel et al., Phoneme recognition using time-delay neural networks, IEEE Trans. Acoust. Speech Signal Process. (1989)
  • Y. Bengio et al., Globally trained handwritten word recognizer using spatial representation, space displacement neural networks and hidden Markov models, Adv. Neural Inf. Process. Syst. (1994)
  • L.A. Chan et al., Multi-stage target recognition using modular vector quantizers and multilayer perceptrons, Proc. Comput. Vis. Pattern Recogn. (1996)
  • S.S. Haykin, Neural Networks: A Comprehensive Foundation (1994)
  • L.-C. Wang, S. Der, N.M. Nasrabadi, S.A. Rizvi, Automatic target recognition using neural networks, in: Proceedings of...
There are more references available in the full text version of this article.
