Abstract
This paper presents the person identification system developed at Athens Information Technology. It comprises an audio-only (speech) subsystem, a video-only (face) subsystem, and an audiovisual fusion subsystem. Audio recognition is based on Gaussian Mixture modeling of the principal components of the Mel-Frequency Cepstral Coefficients of speech. Video recognition is based on linear subspace projection methods and temporal fusion using weighted voting on the results. Audiovisual fusion combines the unimodal identities into a multimodal one, using a suitable confidence metric for the results of the unimodal classifiers.
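The two fusion stages named in the abstract can be illustrated with a minimal Python sketch. The abstract specifies neither the voting weights nor the confidence metric, so everything here is an assumption made for illustration: the function names `weighted_vote` and `fuse_modalities` are hypothetical, the confidence is taken to be the winner's share of the total vote weight, and the multimodal decision simply keeps the more confident unimodal result. It should not be read as the authors' actual method.

```python
from collections import defaultdict


def weighted_vote(frame_decisions):
    """Fuse per-frame (identity, weight) decisions into a single identity.

    Returns (winner, confidence). The confidence is the winner's share of
    the total weight; this definition is an assumption, not the paper's.
    """
    totals = defaultdict(float)
    for identity, weight in frame_decisions:
        totals[identity] += weight
    total_weight = sum(totals.values())
    if total_weight <= 0:
        raise ValueError("at least one decision with positive weight is required")
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / total_weight


def fuse_modalities(audio, video):
    """Keep the unimodal (identity, confidence) result with higher confidence.

    Selecting the more confident modality is one simple fusion rule,
    assumed here for illustration only.
    """
    return audio if audio[1] >= video[1] else video


if __name__ == "__main__":
    # Hypothetical per-frame face decisions with made-up weights.
    video_result = weighted_vote(
        [("alice", 0.9), ("bob", 0.4), ("alice", 0.7), ("bob", 0.5)]
    )
    # Hypothetical audio result, already expressed as (identity, confidence).
    audio_result = ("bob", 0.8)
    print(fuse_modalities(audio_result, video_result))
```

A share-of-weight confidence keeps both modalities on a comparable 0 to 1 scale, which is what makes the direct comparison in `fuse_modalities` meaningful in this sketch.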
Keywords
- Principal Component Analysis
- Face Recognition
- Linear Discriminant Analysis
- Gaussian Mixture Model
- Smart Space
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Stergiou, A., Pnevmatikakis, A., Polymenakos, L. (2007). A Decision Fusion System Across Time and Classifiers for Audio-Visual Person Identification. In: Stiefelhagen, R., Garofolo, J. (eds) Multimodal Technologies for Perception of Humans. CLEAR 2006. Lecture Notes in Computer Science, vol 4122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69568-4_19
DOI: https://doi.org/10.1007/978-3-540-69568-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69567-7
Online ISBN: 978-3-540-69568-4
eBook Packages: Computer Science, Computer Science (R0)