Person Identification Based on Multichannel and Multimodality Fusion

  • Conference paper
Multimodal Technologies for Perception of Humans (CLEAR 2006)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4122)

Abstract

Person identity is very useful information for high-level video analysis and retrieval. In some scenarios, the recording is not only multimodal but also multichannel (microphone array, camera array). In this paper, we describe a multimodal person ID system based on multichannel and multimodal fusion. The audio-only system combines seven microphone channels by fusing the decision outputs of the individual single-channel audio-only systems. The audio system is modeled with a Universal Background Model (UBM) and the maximum a posteriori (MAP) adaptation framework, which is very popular in the speaker recognition literature. The visual-only system works directly in the appearance space via the L1 norm and a nearest-neighbor classifier. Linear fusion then combines the two modalities to improve ID performance. The experiments indicate the effectiveness of microphone array fusion and audio/visual fusion.
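The pipeline outlined in the abstract can be illustrated with a small sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function names, the use of averaging for the microphone-channel decision fusion, the min-max score normalization, the fusion weight of 0.5, and the toy scores standing in for UBM-MAP log-likelihoods. It is not the authors' system, only one plausible reading of "L1-norm nearest neighbor" and "linear fusion".

```python
from typing import Dict, List, Sequence

Scores = Dict[str, float]  # identity -> score (higher means more likely)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 (sum of absolute differences) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor_scores(probe: Sequence[float],
                            gallery: Dict[str, List[Sequence[float]]]) -> Scores:
    """Score each identity by the negative L1 distance to its closest sample."""
    return {pid: -min(l1_distance(probe, g) for g in samples)
            for pid, samples in gallery.items()}


def average_channels(channel_scores: List[Scores]) -> Scores:
    """Assumed decision-level fusion: mean per-identity score over channels."""
    n = len(channel_scores)
    return {pid: sum(ch[pid] for ch in channel_scores) / n
            for pid in channel_scores[0]}


def min_max_normalize(scores: Scores) -> Scores:
    """Map scores to [0, 1] so the two modalities are on a comparable scale."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def linear_fusion(audio: Scores, visual: Scores, weight_audio: float = 0.5) -> Scores:
    """Weighted sum of normalized audio and visual scores (weight is assumed)."""
    a, v = min_max_normalize(audio), min_max_normalize(visual)
    return {pid: weight_audio * a[pid] + (1.0 - weight_audio) * v[pid] for pid in a}


def identify(scores: Scores) -> str:
    """Return the identity with the highest score."""
    return max(scores, key=scores.get)


# Toy data: made-up appearance vectors and per-channel audio scores.
gallery = {"alice": [[0.0, 0.0, 1.0]], "bob": [[1.0, 1.0, 0.0]]}
visual_scores = nearest_neighbor_scores([0.1, 0.0, 0.9], gallery)
audio_scores = average_channels([
    {"alice": -41.0, "bob": -43.0},   # hypothetical channel 1 log-likelihoods
    {"alice": -44.0, "bob": -42.5},   # hypothetical channel 2 log-likelihoods
])
fused_scores = linear_fusion(audio_scores, visual_scores)
print(identify(fused_scores))
```

In this toy case the second microphone channel on its own would prefer the wrong identity, while the channel average and the audio/visual combination agree on the same person, which is the kind of benefit the abstract attributes to the two fusion stages.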

Editor information

Rainer Stiefelhagen John Garofolo

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Liu, M., Tang, H., Ning, H., Huang, T. (2007). Person Identification Based on Multichannel and Multimodality Fusion. In: Stiefelhagen, R., Garofolo, J. (eds) Multimodal Technologies for Perception of Humans. CLEAR 2006. Lecture Notes in Computer Science, vol 4122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69568-4_21

  • DOI: https://doi.org/10.1007/978-3-540-69568-4_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69567-7

  • Online ISBN: 978-3-540-69568-4

  • eBook Packages: Computer Science (R0)
