Combining Evidence from Temporal and Spectral Features for Person Recognition Using Humming

  • Conference paper
Perception and Machine Intelligence (PerMIn 2012)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7143)

Abstract

In this paper, a person's hum is used to identify the speaker automatically. Novel temporal features (such as zero-crossing rate and short-time energy) and spectral features (such as spectral centroid and spectral flux) are proposed for the person recognition task. Feature-level fusion of each of these features with the state-of-the-art spectral feature set, viz., Mel Frequency Cepstral Coefficients (MFCC), is found to give better recognition performance than MFCC alone. In addition, the person identification rate is shown to be competitive with the baseline MFCC. Furthermore, a reduction in equal error rate (EER) of 1.46 % is obtained when a feature-level fusion system combines evidence from MFCC, temporal, and the proposed spectral features.
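
The abstract names four frame-level features and a feature-level fusion with MFCC but gives no formulas. The following Python sketch shows one common way to compute such features and concatenate them with an MFCC matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 400-sample frame and 160-sample hop (25 ms and 10 ms at 16 kHz), the Hamming window, and the particular definitions of spectral centroid and spectral flux are all assumptions, and the MFCC matrix is assumed to be computed elsewhere.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D NumPy signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def temporal_spectral_features(x, sr, frame_len=400, hop=160):
    """Return an (n_frames, 4) matrix: ZCR, short-time energy, centroid, flux.

    Frame length, hop, and window are illustrative assumptions.
    """
    frames = frame_signal(x, frame_len, hop)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Short-time energy: mean squared amplitude of each frame.
    ste = np.mean(frames ** 2, axis=1)

    # Magnitude spectrum of each Hamming-windowed frame.
    mag = np.abs(np.fft.rfft(frames * np.hamming(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)

    # Spectral centroid: magnitude-weighted mean frequency (Hz).
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

    # Spectral flux: squared change between successive normalized spectra.
    # The first frame has no predecessor, so its flux is set to 0.
    norm = mag / (mag.sum(axis=1, keepdims=True) + 1e-12)
    flux = np.zeros(len(frames))
    flux[1:] = np.sum(np.diff(norm, axis=0) ** 2, axis=1)

    return np.column_stack([zcr, ste, centroid, flux])

def fuse(mfcc, extra):
    """Feature-level fusion: concatenate per-frame vectors column-wise.

    Both inputs are (n_frames, dim) arrays; the shorter one sets the length.
    """
    n = min(len(mfcc), len(extra))
    return np.hstack([mfcc[:n], extra[:n]])
```

Given a hypothetical 13-dimensional MFCC matrix with the same frame count, `fuse(mfcc, temporal_spectral_features(x, sr))` would yield 17-dimensional fused vectors that a classifier could then consume in place of MFCC alone.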

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Patil, H.A., Madhavi, M.C., Jain, R., Jain, A.K. (2012). Combining Evidence from Temporal and Spectral Features for Person Recognition Using Humming. In: Kundu, M.K., Mitra, S., Mazumdar, D., Pal, S.K. (eds) Perception and Machine Intelligence. PerMIn 2012. Lecture Notes in Computer Science, vol 7143. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27387-2_40

  • DOI: https://doi.org/10.1007/978-3-642-27387-2_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27386-5

  • Online ISBN: 978-3-642-27387-2

  • eBook Packages: Computer Science, Computer Science (R0)
