research-article

Gender classification using pitch and formants

Authors: Pawan Kumar, Nitika Jakhanwal, Anirban Bhowmick, Mahesh Chandra

ICCCS '11: Proceedings of the 2011 International Conference on Communication, Computing & Security

Pages 319 - 324

https://doi.org/10.1145/1947940.1948007

Published: 12 February 2011

Abstract

A gender classification system is proposed based on pitch, formants, and a combination of both. A database of ten Hindi digits was prepared from fifty speakers, each of whom spoke every digit ten times. Gender classification was performed using formants derived from the speech samples, and separately using pitch extracted by three methods: autocorrelation, cepstrum, and the average magnitude difference function (AMDF). Formants in combination with pitch were also used, as was a feature vector consisting of the pitch values from all three pitch determination methods. Experiments covered both open-set and closed-set gender classification. The autocorrelation method performed best in the open-set case, and the hybrid method (autocorrelation + AMDF + cepstrum) performed best in the closed-set case.
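The three pitch determination methods named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 8 kHz sampling rate, the 60 to 400 Hz search range, the synthetic impulse-train test frame, the median fusion of the three estimates, and the 165 Hz decision threshold are all hypothetical choices made for the example. The abstract does not say which classifier was applied to the pitch features.

```python
import numpy as np

def lag_bounds(fs, fmin, fmax):
    """Lag search range in samples for pitches between fmin and fmax (Hz)."""
    return int(fs / fmax), int(fs / fmin)

def pitch_autocorr(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch from the highest autocorrelation peak inside the lag range."""
    x = frame - frame.mean()
    r = np.correlate(x, x, mode="full")[len(x) - 1:]  # r[k] for lags k >= 0
    lo, hi = lag_bounds(fs, fmin, fmax)
    return fs / (lo + np.argmax(r[lo:hi + 1]))

def pitch_amdf(frame, fs, fmin=60.0, fmax=400.0, tol=0.2):
    """Pitch from the first deep valley of the average magnitude difference."""
    lo, hi = lag_bounds(fs, fmin, fmax)
    d = np.array([np.mean(np.abs(frame[k:] - frame[:-k]))
                  for k in range(lo, hi + 1)])
    # Valleys at multiples of the period are about equally deep, so take the
    # first lag close to the global minimum (an assumed guard against octave
    # errors), then walk down to the bottom of that valley.
    i = int(np.argmax(d <= d.min() + tol * (d.max() - d.min())))
    while i + 1 < len(d) and d[i + 1] < d[i]:
        i += 1
    return fs / (lo + i)

def pitch_cepstrum(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch from the largest real-cepstrum peak inside the quefrency range."""
    x = (frame - frame.mean()) * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.rfft(x)) + 1e-12)  # small floor avoids log(0)
    cep = np.fft.irfft(log_mag, n=len(x))
    lo, hi = lag_bounds(fs, fmin, fmax)
    return fs / (lo + np.argmax(cep[lo:hi + 1]))

def pitch_features(frame, fs):
    """Hybrid feature vector: [autocorrelation, AMDF, cepstrum] pitch in Hz."""
    return np.array([pitch_autocorr(frame, fs),
                     pitch_amdf(frame, fs),
                     pitch_cepstrum(frame, fs)])

def classify_gender(features, threshold_hz=165.0):
    """Toy decision rule (an assumption): threshold the median pitch."""
    return "female" if np.median(features) >= threshold_hz else "male"

def synthetic_voiced_frame(f0, fs=8000, n=512, noise=0.01, seed=0):
    """Crude stand-in for voiced speech: an impulse train plus weak noise."""
    rng = np.random.default_rng(seed)
    x = noise * rng.standard_normal(n)
    x[::int(round(fs / f0))] += 1.0
    return x

if __name__ == "__main__":
    for f0 in (125.0, 200.0):
        feats = pitch_features(synthetic_voiced_frame(f0), fs=8000)
        print(f0, feats, classify_gender(feats))
```

On real recordings the frame would come from a voiced segment of an utterance, and the per-frame estimates would normally be aggregated over the utterance before any classifier is trained on them.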

Index Terms

Gender classification using pitch and formants
1. Hardware
  1. Communication hardware, interfaces and storage
    1. Signal processing systems

Information & Contributors

Information

Published In

ICCCS '11: Proceedings of the 2011 International Conference on Communication, Computing & Security

February 2011

656 pages

ISBN:9781450304641

DOI:10.1145/1947940

General Chairs: Sanjay Kumar Jena (NIT Rourkela, India), Rajeev Kumar (IIT Kharagpur, India)

Program Chairs: Ashok Kumar Turuk (NIT Rourkela, India), Manoranjan Dash (NTU, Singapore)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICCCS '11

ICCCS '11: International Conference on Communication, Computing & Security

February 12 - 14, 2011

Rourkela, Odisha, India
