
Empirical analysis of linguistic and paralinguistic information for automatic dialect classification

Artificial Intelligence Review

Abstract

Current research in automatic speech recognition is primarily concerned with correctly evaluating the linguistic information transmitted in the speech signal and with identifying the variations naturally present in speech. These variations may be due to the speaker’s age, gender, or a speaking style influenced by their dialect. The focus of research in this field is to further strengthen the reliability and accuracy of the techniques developed thus far. This paper concentrates on the analysis and modelling of linguistic and paralinguistic information embedded in the speech signal, in order to discover the similarities and dissimilarities among the acoustic characteristics of different dialects. It investigates the influence of dialectal variation by measuring and analysing acoustic features of vowel sounds: formant frequencies, pitch, pitch slope, duration and intensity. These differences are then exploited for automatic identification of a speaker’s native dialect, given a sample of that speaker’s speech. Support vector machines, along with dialect-specific Gaussian mixture models, were used to classify the dialect of spoken utterances. The system’s performance is compared with human perception of dialects. The study focuses on various dialects of one of the world’s major languages, Hindi.
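As a rough illustration of the idea of dialect-specific likelihood models, the following minimal sketch fits one model per dialect and assigns a sample to the dialect whose model scores it highest. It is not the authors' implementation: it simplifies each Gaussian mixture to a single diagonal Gaussian, omits the support vector machine stage entirely, and uses made-up feature values. The dialect labels, the two-feature layout (mean pitch, pitch slope), and the 10 ms frame step are all assumptions made for the example.

```python
import math
from statistics import fmean, pvariance


def pitch_slope(f0_hz, frame_step_s=0.01):
    """Least-squares slope (Hz per second) of a pitch contour.

    Assumes the contour is sampled at a fixed frame step (hypothetical 10 ms).
    """
    n = len(f0_hz)
    if n < 2:
        raise ValueError("need at least two pitch samples")
    times = [i * frame_step_s for i in range(n)]
    t_mean, f_mean = fmean(times), fmean(f0_hz)
    num = sum((t - t_mean) * (f - f_mean) for t, f in zip(times, f0_hz))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def fit_diag_gaussian(vectors, var_floor=1e-6):
    """Per-dimension mean and variance of one dialect's training vectors."""
    dims = list(zip(*vectors))
    means = [fmean(d) for d in dims]
    variances = [max(pvariance(d), var_floor) for d in dims]
    return means, variances


def log_likelihood(x, model):
    """Log density of x under a diagonal-covariance Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def classify(x, models):
    """Return the dialect label whose model gives x the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(x, models[name]))


# Made-up training vectors: [mean pitch in Hz, pitch slope in Hz/s].
train = {
    "dialect_A": [[120.0, 15.0], [125.0, 18.0], [118.0, 12.0]],
    "dialect_B": [[180.0, -20.0], [175.0, -25.0], [185.0, -18.0]],
}
models = {name: fit_diag_gaussian(vecs) for name, vecs in train.items()}

# A toy rising pitch contour for one vowel, one value per frame.
contour = [119.7, 119.85, 120.0, 120.15, 120.3]
sample = [fmean(contour), pitch_slope(contour)]
predicted = classify(sample, models)
print(predicted)
```

A fuller system along the lines the abstract describes would use mixtures with several components per dialect and a richer feature vector (formants, duration, intensity), which this sketch leaves out for brevity.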



Corresponding author

Correspondence to Shweta Sinha.


Cite this article

Sinha, S., Jain, A. & Agrawal, S.S. Empirical analysis of linguistic and paralinguistic information for automatic dialect classification. Artif Intell Rev 51, 647–672 (2019). https://doi.org/10.1007/s10462-017-9573-3


  • DOI: https://doi.org/10.1007/s10462-017-9573-3
