Elsevier

Information Fusion

Volume 5, Issue 1, March 2004, Pages 65-71

Algorithm for retrieval and verification of personal identity using bimodal biometrics

https://doi.org/10.1016/j.inffus.2003.09.001

Abstract

A scheme for integrating two modalities for person recognition is proposed. The system works in the open-set identification mode, which means that it identifies clients while retaining the possibility of rejecting impostors. One classifier is used to retrieve a ranking of the n most probable labels, which are consecutively verified using the second classifier. The verification procedure takes into consideration the probability that a label is genuine. The system has been implemented to combine face identification and speaker verification. Testing demonstrated that fusion yields up to a 3.9% reduction of FRR at constant FAR, with a 67% increase in system response time.

Introduction

With the increasing global need for security, the demand for robust automatic person recognition systems is evident. Thanks to advances in computer technology, pattern recognition methods of identity authentication based on biometric features, formerly considered too computationally demanding, have become feasible. For applications involving the flow of confidential information, the authentication accuracy of the system is always the primary concern. However, such a system, to be commercially attractive, must also be accepted by users. Although the most accurate biometrics, like iris or retinal scans, satisfy the strictest accuracy requirements, people may not feel comfortable with the feature collection process. Unfortunately, less intrusive and hence more widely accepted biometrics, like voice or face image, are inherently more prone to recognition errors.

Significant improvement of recognition accuracy can be achieved through multimodal biometrics [4], [5], which combines information provided by multiple sources, such as different human characteristics. The recognition is usually performed in two steps: first, unimodal subsystems quantify the similarity of the collected biometrics to the templates stored in the database; second, the scores provided by the subsystems are analyzed statistically according to the employed decision-fusion scheme. Multimodal biometrics does not necessarily need to combine multiple biometrics. Prabhakar and Jain [10] distinguish five scenarios a multimodal system may work in: (i) using physically different biometrics; (ii) multiple sensors to collect the same biometrics; (iii) multiple units of the same biometrics, like both hands or 10 fingerprints; (iv) multiple instances of the same biometrics, like different impressions of the same fingerprint; and finally, (v) combining different methods of feature extraction or matching applied to the same biometric signal. The first two scenarios require multiple sensors, which increases the cost of a system but also ensures that the signals are weakly correlated or uncorrelated.
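
The two-step flow described above can be sketched as score-level fusion. The score ranges, the min-max normalization, and the sum rule below are illustrative assumptions, not the scheme proposed in this paper (which integrates at the rank level):

```python
# Illustrative two-step fusion: unimodal matchers produce raw similarity
# scores, which are normalized to a common range and then combined.
# All numeric ranges here are hypothetical.

def min_max_normalize(score, lo, hi):
    """Map a raw matcher score into [0, 1] given its observed range."""
    return (score - lo) / (hi - lo)

def fuse_sum(face_score, voice_score,
             face_range=(0.0, 100.0), voice_range=(0.0, 10.0)):
    """Sum-rule fusion of two normalized unimodal scores."""
    f = min_max_normalize(face_score, *face_range)
    v = min_max_normalize(voice_score, *voice_range)
    return (f + v) / 2.0

# A fused score above a decision threshold accepts the claimed identity.
accept = fuse_sum(80.0, 7.0) >= 0.5
```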

Several integration schemes have been developed; for an extensive overview of the state of the art, the reader is encouraged to read Verlinde’s PhD thesis [12]. Among others, Kittler et al. [7] have presented an identification scheme in which the Bayesian probability of correct classification is evaluated using multiple snapshots of a single characteristic (face image). Brunelli and Falavigna [1] have proposed a system in which each incorporated module returns a score for each label. After normalization, the scores associated with one label are used to determine the multimodal confidence value. With this approach, a feature vector must be compared with all models stored in the database. The system uses two acoustic and three visual features, which requires 5N comparisons, where N is the number of records in the database. If N is high, the response time may be too long for practical purposes.

As an alternative, Hong and Jain [5] have suggested the retrieval and verification approach and, as an example, developed a system integrating fingerprints and face images. It combines the speed of a unimodal system with the accuracy of a multimodal system. A comparatively fast biometrics (face image) is used to retrieve a list of the n top matches (n<N), which are subsequently verified by a highly accurate biometrics (fingerprint). Therefore only N+n comparisons are required in total. The system has been extended to combine fingerprint identification with face and speaker verification [6].
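
The comparison-count arithmetic behind the two strategies can be made explicit; the database size and list length below are only an illustration of the 5N versus N+n trade-off:

```python
# Comparison counts for the two fusion strategies discussed above
# (illustrative numbers, not measurements from the paper).

def parallel_fusion_comparisons(num_records, num_features):
    """Every feature is matched against every database record."""
    return num_features * num_records

def retrieval_verification_comparisons(num_records, top_n):
    """One full retrieval pass (N) plus verification of the n top matches."""
    return num_records + top_n

# For a database of 1000 records: five features need 5000 comparisons,
# whereas retrieval of a top-10 list followed by verification needs 1010.
cost_parallel = parallel_fusion_comparisons(1000, 5)
cost_two_stage = retrieval_verification_comparisons(1000, 10)
```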

This paper presents a new scheme based on the retrieval and verification approach. In contrast to the original method of Hong and Jain, the number of labels subject to verification is variable and depends on the retrieval accuracy. If the correct label is at the top position in the ranking, only one verification is needed, so as few as N+1 comparisons in total may be enough to recognize a client. The scheme has been tested with face and speaker recognition modules; however, it is designed to integrate any two biometrics.

The next section describes the principles of biometric recognition. Section 3 formulates the tasks of the retrieval and verification steps. Section 4 presents the new verification procedure, together with the derivation of the threshold determination scheme. At the end of this section the algorithm which summarizes the proposed method is developed. Section 5 demonstrates a simple implementation of the system and reports test results. Section 6 contains discussion and concluding remarks.


Biometric recognition

The general concept of person recognition embraces two specific tasks, namely identification and verification. The former aims to establish the client’s identity using only their biometric token; closed-set identification always assigns the query to one of the known labels, whereas open-set identification also allows classification as “unknown”. The verification task is to decide whether the collected biometric token belongs to the person indicated by the claimed identity––in this case the system may either accept or reject the claim.
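
The three tasks can be told apart in a few lines. The scores dictionary and the threshold below are hypothetical stand-ins for a matcher's output, used only to illustrate the definitions:

```python
# Sketch of the recognition tasks, assuming a matcher has already produced
# a similarity score for each enrolled label (hypothetical values).

def closed_set_identify(scores):
    """Always return the best-matching enrolled label."""
    return max(scores, key=scores.get)

def open_set_identify(scores, threshold):
    """Return the best label, or "unknown" if no score is convincing."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"

def verify(scores, claimed_label, threshold):
    """Accept or reject the claimed identity for the presented token."""
    return scores[claimed_label] >= threshold

scores = {"alice": 0.92, "bob": 0.40}
```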

Retrieval and verification

The proposed integration scheme includes two steps: retrieval, which is in fact a closed-set identification, and verification. Consequently, it works in the open-set identification mode. Both classifiers are regarded here as black boxes: we are interested only in their outcomes, while the recognition mechanisms may be hidden. The integration is performed at the rank level [1], which means that the output of the retrieval module is a ranked set of n labels, but the actual confidence values are not taken into account.

Verification procedure

Let us denote the ranking of n labels passed to the verification module as R, and let R(k) be the label at rank k. Since the confidence values determined by the retrieval biometrics decrease as the rank of a label increases, a natural first identity guess is the label R(1). Positive verification of this label by means of the second biometrics reinforces this hypothesis, and it becomes the final decision. If a non-match occurs, verification proceeds with the subsequent labels in the ranking until one is accepted or the ranking is exhausted.
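
A minimal sketch of this ranked verification loop, with `verify` standing in for the second classifier. The paper's actual stopping rule, based on the probability that a label is genuine, is not reproduced here; the sketch simply caps the number of checks:

```python
# Walk the ranking R(1), R(2), ... and return the first label the second
# classifier accepts; reject as an impostor if none is accepted.
# verify is a hypothetical callable standing in for the second biometrics.

def recognize(ranking, verify, max_checks):
    for label in ranking[:max_checks]:
        if verify(label):
            return label      # positive verification: final decision
    return None               # open-set rejection: likely an impostor

ranking = ["carol", "alice", "bob"]            # output of the retrieval step
result = recognize(ranking, verify=lambda lbl: lbl == "alice", max_checks=3)
```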

Implementation

The algorithm has been implemented for face identification and speaker verification. Face recognition is performed using eigenfaces and PCA [11]. Speaker verification is text dependent and is based on vector quantization; the mel-frequency cepstral coefficients (MFCC) are used as feature vectors [2]. To quantify similarity, the Euclidean distance between a feature vector and a template is employed. The applied methods are very popular and have already been incorporated in multimodal biometric systems.
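
The template-matching step common to both modules can be sketched as a nearest-template search under Euclidean distance. The two-dimensional templates below are hypothetical; in the implementation the vectors would be MFCCs or eigenface projections:

```python
import math

# Euclidean distance between a feature vector and each stored template;
# the nearest template determines the best-matching label.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(features, templates):
    """Return the enrolled label whose template is nearest to the features."""
    return min(templates, key=lambda label: euclidean(features, templates[label]))

templates = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}   # toy 2-D templates
```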

Discussion and conclusions

The concept of retrieval and verification assumes that identification can be carried out using a low-accuracy biometrics, but the verification should be conducted by means of as strong a classifier as possible. The ROC curves given in Fig. 2 suggest that the roles of the employed biometrics should be interchanged, because face recognition outperforms speaker recognition. Unfortunately, the scarcity of training face images has hindered the estimation of distribution parameters, and hence the verification task could not be reassigned to the face module.

Acknowledgements

This research was done for the author’s MSc Thesis, supervised by Prof. A. Materka. The author would like to thank Dr. J.L. Wayman for discussion on evaluation of identification systems, and Prof. M. Schuckers for valuable suggestions concerning the client distribution problem.

References (14)

  • S. Prabhakar et al., Decision-level fusion in fingerprint verification, Pattern Recognition (2002)
  • R. Brunelli et al., Person identification using multiple cues, IEEE Transactions on Pattern Analysis and Machine Intelligence (1995)
  • J.R. Deller et al., Discrete-Time Processing of Speech Signals (1987)
  • G.R. Doddington, Speaker recognition – Identifying people by their voices, Proceedings of the IEEE (1985)
  • L. Hong, A. Jain, S. Pankanti, Can multibiometrics improve performance?, in: Proceedings AutoID’99, Summit, NJ, October...
  • L. Hong et al., Multimodal biometrics
  • A. Jain, L. Hong, Y. Kulkarni, A multimodal biometric system using fingerprint, face, and speech, in: Proc. 2nd Int....
There are more references available in the full text version of this article.
