Abstract
We test a method of clustering dialects of English according to patterns of shared phonological features. Previous linguistic research has generally considered phonological features as independent of each other, but context is important: rather than considering each phonological feature individually, we compare the patterns of shared features, or Mutual Information (MI). The dependence of one phonological feature on the others is quantified and exploited. The results of this method of categorizing 59 dialect varieties by 168 binary internal (pronunciation) features are compared to traditional groupings based on external features (e.g., ethnic, geographic). The MI and size of the groups are calculated for taxonomies at various levels of granularity and these groups are compared to other analyses of geographic and ethnic distribution. Applications that could be improved by using MI methods are suggested.
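As a rough illustration of the quantity the abstract refers to, the sketch below computes the mutual information between two binary features observed across a set of varieties. It is not the authors' implementation: the function name `mutual_information` and the toy feature vectors are hypothetical, and the abstract does not say how the paper estimates MI or builds its taxonomies. The sketch assumes a plain plug-in estimate from observed frequencies.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of mutual information (in bits) between two
    equal-length sequences of binary feature values.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))  # counts of each (x, y) value pair
    px = Counter(x)             # marginal counts for feature x
    py = Counter(y)             # marginal counts for feature y
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


# Toy data (invented): presence (1) or absence (0) of three features
# in four hypothetical varieties.
feature_a = [1, 1, 0, 0]
feature_b = [1, 1, 0, 0]  # always co-occurs with feature_a
feature_c = [1, 0, 1, 0]  # statistically independent of feature_a

print(mutual_information(feature_a, feature_b))
print(mutual_information(feature_a, feature_c))
```

In this toy data, two features that always co-occur share one full bit of information, while two independent features share none. A feature set with high MI between features is one in which knowing one pronunciation feature helps predict the others, which is the kind of dependence the abstract describes.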
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nagy, N., Zhang, X., Nagy, G., Schneider, E.W. (2005). A Quantitative Categorization of Phonemic Dialect Features in Context. In: Dey, A., Kokinov, B., Leake, D., Turner, R. (eds) Modeling and Using Context. CONTEXT 2005. Lecture Notes in Computer Science, vol 3554. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508373_25
DOI: https://doi.org/10.1007/11508373_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26924-3
Online ISBN: 978-3-540-31890-3
eBook Packages: Computer Science, Computer Science (R0)