Abstract
Gender classification is one of the most important tasks in speech processing. It is generally performed using pitch as the feature, since the pitch of female speech is typically higher than that of male speech. In some cases, however, a male speaker's pitch is high or a female speaker's pitch is low, and pitch-based classification then fails to give the correct result. To address this drawback, we propose a gender classification method that considers three features and uses fuzzy logic and a neural network to identify the gender to which a given speech signal belongs. A training dataset is generated from these three features and used to train both the fuzzy logic system and the neural network. After training, a speech signal is given as input, the fuzzy logic system and the neural network each produce an output, and the mean of these outputs determines the gender of the speaker. The results show the performance of the proposed method in gender classification.
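The final decision step described in the abstract, taking the mean of the fuzzy and neural network outputs, can be illustrated with a minimal sketch. The abstract does not state how the outputs are encoded or how the mean is mapped to a gender, so the following are assumptions made only for illustration: both outputs are scores in [0, 1], 0 stands for male and 1 for female, and the mean is compared against a threshold of 0.5. The function name `classify_gender` and its parameters are hypothetical.

```python
from statistics import mean


def classify_gender(fuzzy_score: float, nn_score: float, threshold: float = 0.5) -> str:
    """Combine two classifier outputs by averaging, then threshold the mean.

    Assumptions (not taken from the paper): each score lies in [0, 1],
    with 0 meaning male and 1 meaning female, and 0.5 is the cut-off.
    """
    for score in (fuzzy_score, nn_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score {score} is outside the assumed range [0, 1]")

    # Mean of the fuzzy logic output and the neural network output.
    combined = mean([fuzzy_score, nn_score])

    # Map the averaged score to a label using the assumed threshold.
    return "female" if combined >= threshold else "male"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(classify_gender(0.9, 0.7))
    print(classify_gender(0.2, 0.4))
```

Averaging lets one system offset an uncertain output from the other; under these assumptions, a pair of scores such as 0.8 and 0.3 would still be labeled by their mean rather than by either system alone.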
Cite this article
Gomathy, M., Meena, K. & Subramaniam, K.R. Classification of speech signal based on gender: a hybrid approach using neuro-fuzzy systems. Int J Speech Technol 14, 377–391 (2011). https://doi.org/10.1007/s10772-011-9118-0