Published by De Gruyter Mouton, September 6, 2023

Acoustic correlates of Burmese voiced and voiceless sonorants

  • Chiara Repetti-Ludlow
From the journal Phonetica

Abstract

Voiceless sonorant consonants are typologically rare segments, appearing in only a few of the world’s languages, including Burmese. In this study, Burmese sonorants and their adjacent vowels are investigated in an attempt to (1) determine what acoustic correlates distinguish voiced and voiceless sonorants and (2) determine whether there are multiple realizations of voiceless sonorants and, if so, establish what acoustic correlates distinguish them. In order to pursue these questions, a production study was carried out and target words were analyzed, demonstrating that Burmese voiceless sonorants have a spread glottis period resulting in turbulent airflow 78 % of the time. Findings from linear mixed-effects models showed that voiced and voiceless sonorants are significantly different in terms of duration of the sonorant, F0 of the sonorant, and strength of excitation measured over the following vowel. A linear discriminant analysis was able to predict voicing category with 86.7 % accuracy, with the duration of the spread glottis period being the best indicator of voicelessness, followed by the cues that were significant in the linear mixed-effects models. In cases when the spread glottis period is absent from voiceless sonorants, the sonorant only has correlates that are associated with voicelessness (such as F0 and strength of excitation) but not correlates that are associated with the spread glottis gesture (such as duration and harmonics-to-noise ratio). These results have implications both for our understanding of the acoustics of Burmese sonorants and for our understanding of voiceless sonorants more generally.

1 Introduction

Voiceless sonorant consonants are cross-linguistically rare, appearing in only 2 % of the world’s languages, according to PHOIBLE (Moran and McCloy 2011). Burmese has 6 pairs of non-vowel sonorants with voiced and voiceless counterparts, providing an ideal case study for examining what distinguishes the two acoustically. However, no extensive acoustic analysis has yet been carried out on Burmese sonorants, with existing work focusing on aspects such as oral and nasal airflow measures (Bhaskararao and Ladefoged 1991; Chirkova et al. 2019), EGG measures (Chirkova et al. 2019), and durational measures (Bhaskararao and Ladefoged 1991). Descriptions of Burmese sonorants also vary, although there is a general consensus that they consist of two parts: a voiceless portion followed by a voiced portion (Bhaskararao and Ladefoged 1991; Dantsuji 1984). The voiceless portion involves a spread glottis (SG) gesture resulting in turbulent airflow immediately preceding the voiced portion of the sonorant. This is acoustically similar to preaspiration, but unlike preaspiration, the timing of the oral gesture and the SG gesture are concurrent, with some voicing at the end (realized as a typical sonorant consonant) to allow for the differentiation of sonorants (Bhaskararao and Ladefoged 1991). Despite the fact that a SG gesture seems to be key in this contrast, previous work has not focused on voice quality measures. This issue is further complicated by the fact that, in many languages, SG gestures like preaspiration are ‘non-normative,’ meaning they do not surface consistently (Helgason 2002). In languages like Central Standard Swedish (van Dommelen 1998), Itunyoso Trique (DiCanio 2010), and Sienese Italian (Stevens and Reubold 2014), preaspiration is only variably realized as an acoustic correlate. Thus, the task is not only to determine the acoustic correlates of Burmese sonorants, but also to determine whether there are variable realizations of these sonorants and consider what this means for a phonetic characterization. Taking these considerations together, the research questions for the present paper are as follows:

  1. What acoustic correlates distinguish voiced and voiceless sonorants?

  2. Are there multiple realizations of voiceless sonorants and what acoustic correlates distinguish them?

The rest of Section 1 will focus on the Burmese segment inventory, the literature on voiceless sonorants, the acoustic correlates that will be considered, and predictions. Section 2 discusses the methods of this experiment. Section 3 describes the results of the production study. Section 4 presents a discussion of the findings and what they mean for the stated research questions. Finally, Section 5 provides a conclusion and proposes directions for future research.

1.1 Burmese inventory

Burma, also known as Myanmar, is a country in Southeast Asia with a population of approximately 50 million people, of which over 30 million are Burmese speakers (Bradley 1996). Burmese, a Tibeto-Burman language, has 34 consonants, as shown in Figure 1, and two sets of vowels, as shown in Figure 2 (Watkins 2001). The first set of vowels consists of eight monophthongs that are not nasalized and appear in open syllables. The second set of vowels consists of eight monophthongs and diphthongs that are either nasalized in an open syllable (except for [ɛ]) or not nasalized in a closed syllable. Of these vowels, [a] is the most common. Burmese is also a tonal language with four attested tones: high, low, creaky, and killed (Watkins 2000).

Figure 1: Burmese consonants, with voiced-voiceless sonorant pairs highlighted. Figure based on Watkins (2001).

Figure 2: First set of vowels (left) that are non-nasal in open syllables, and second set of vowels (right) that are nasal in open syllables (except for ɛ) or non-nasal in closed syllables. Figure based on Watkins (2001).

1.2 Voiceless sonorants

Burmese has 12 sonorant consonants with contrastive voicing: [m, m̥, n, n̥, ɲ, ɲ̥, ŋ, ŋ̥, l, l̥, w, w̥] (Watkins 2001). The voiceless sonorants have been described as having two parts: a voiceless portion followed by a voiced portion (Bhaskararao and Ladefoged 1991; Dantsuji 1984). Unlike some languages with voiceless sonorants, the phonemic status of Burmese voiceless sonorants is not questioned (Ladefoged and Maddieson 1998; Watkins 2001).[1] Work on Burmese voiceless sonorants so far has focused on the nasal sonorants [m̥, n̥, ɲ̥, ŋ̥] (Bhaskararao and Ladefoged 1991; Dantsuji 1984), or some subset of the nasal sonorants (Chirkova et al. 2019; Ladefoged 1971). Only Maddieson (1984) considers [l̥] in addition to the nasal sonorants, but this analysis focuses on fundamental frequency (F0), ultimately finding that F0 as measured at the post-sonorant vowel onset is lower for voiced sonorants and higher for voiceless sonorants, in line with cross-linguistic findings that voiceless segments tend to be associated with a higher F0 (Kingston 2011). Other studies on Burmese voiceless sonorants consider durational measurements (Bhaskararao and Ladefoged 1991; Dantsuji 1984), oral and nasal airflow measurements (Bhaskararao and Ladefoged 1991; Chirkova et al. 2019), and EGG measurements (Chirkova et al. 2019). While these works have helped to further our understanding of Burmese voiceless sonorants, they focus primarily on the temporal and aerodynamic aspects of voiceless nasals. Therefore, there remains a gap in the literature which can be filled by a more thorough acoustic analysis of Burmese voiceless sonorants.

The analysis of Burmese sonorants can be contextualized by comparing them to accounts of voiced and voiceless sonorants in other languages. Cross-linguistically, voiceless sonorants are found most commonly in Tibeto-Burman languages, such as Burmese, Mizo, and Angami (Bhaskararao and Ladefoged 1991; Chirkova et al. 2019), and Indo-European languages, like Icelandic and Welsh (Bell et al. 2021; Jessen and Pétursson 1998).[2] Previous work has found some consistent acoustic correlates of voiceless sonorants. For example, voiceless sonorants tend to have a longer duration (considering the voiceless and voiced portions together), as in Khonoma Angami, where voiceless nasals have been found to be 7–10 % longer than their voiced counterparts (Blankenship et al. 1993), and in Icelandic, where voiceless sonorants have been found to be an average of 42 % longer (Bombien 2006). Voiceless sonorants also tend to have a higher spectral tilt than their voiced counterparts, as in Icelandic, where voiceless sonorants have a higher H1–H2, signifying a more breathy production (Bombien 2006). They also tend to have more spectral noise, as has been observed both in Iaai (Maddieson and Anderson 1994) and in Icelandic, where voiceless sonorants have more frication (Bombien 2006).[3] Taken together, these studies suggest that there are some common cross-linguistic features of voiceless sonorants. However, voiceless sonorants tend to have very little acoustic information segment-internally, instead being largely identifiable based on adjacent segments. This is similar to how many correlates of contrastive breathy sonorants in Marathi are realized on the following vowel, rather than on the sonorant itself (Berkson 2019), or how stop voicing contrasts in English are correlated with duration and F0 on adjacent vowels (Kingston 2011; Lisker 1972).

In addition to cross-linguistic accounts of voiceless sonorants, historical reconstructions also provide important information for the study of voiceless sonorants. Blevins (2018) has argued that voiceless sonorants have their origins in sonorant-SG clusters, which ultimately phonologize to become a contrastive phoneme. Put another way, most voiceless sonorants, including Burmese voiceless sonorants (Dantsuji 1984), were historically clusters consisting of an /h/ + sonorant or sonorant + /h/, which were eventually reanalyzed as a single phoneme, with all previous environment restrictions lost. Thus, SG gestures seem to be a key element of voiceless sonorants, but existing acoustic accounts do not focus on this aspect.

Taken together, it seems that perhaps Burmese voiceless sonorants are better described as having a SG period. This is compatible with both historical reconstructions and existing descriptions of the voiceless sonorants. It is also in accordance with the cross-linguistic link between preaspiration (which results from a SG gesture) and sonorant devoicing. For example, Hansson (2001) found that Scandinavian languages with preaspiration also invariably have sonorant devoicing, and Stevens and Hajek (2004) replicated this finding with Sienese Italian, which has been found to have preaspiration with voiceless geminate stops in addition to sonorant consonant devoicing before voiceless stops. In both the Scandinavian languages and in Sienese Italian, the voiceless sonorants are realized with some degree of aspiration, which Stevens and Hajek (2004) argue “confirms the phonetic similarity between sonorant devoicing and preaspiration,” and which Hansson (2001) argues could be evidence for preaspiration and sonorant devoicing being a unified and general process. However, at least in the case of Sienese Italian, sonorant devoicing and the corresponding realization of aspiration is not ubiquitous, and indeed only occurs 85 % of the time (Stevens and Hajek 2004). The variable realization of preaspiration has been called ‘non-normative preaspiration’ (Helgason 2002), and it has been noted for a variety of contrasts. For example, it has been observed with Norwegian and Central Standard Swedish stop voicing contrasts (van Dommelen 1998; Helgason 2002), Itunyoso Trique and Sienese Italian stop gemination contrasts (DiCanio 2010; Stevens and Reubold 2014), and Central Standard Swedish fricative voicing contrasts (Gordeeva and Scobbie 2010). It is therefore possible that the [−voice] period described in Burmese could be an instance of a non-normative SG gesture, similar to preaspiration. However, if the SG gesture is non-normative, voiceless sonorants will not consistently be produced with turbulent airflow. The implications of this are discussed further in Section 1.4.

1.3 Acoustic correlates of voicelessness and SG gestures

In order to acoustically characterize Burmese voiced and voiceless sonorants, a variety of different acoustic correlates must be considered. In line with previous work on laryngeal contrasts (Berkson 2019; Garellek 2020) and voiceless sonorants (Blevins 2018; Bombien 2006), duration, H1–H2, strength of excitation, harmonics-to-noise ratio, and F0 are considered with respect to Burmese sonorants. The remainder of this section provides an outline of what each of these measures indicates and how they are expected to pattern for voiced and voiceless sonorants.

1.3.1 Duration (ms)

The connection between voicing and duration is attested cross-linguistically, with voicing-dependent durational differences observed on both the consonant itself and on adjacent vowels (Denes 1955; Lisker 1972). This measurement is especially relevant for the present study, as cross-linguistic work has found that voiceless sonorants (considering the voiceless and voiced portions together) tend to have longer durations than their voiced counterparts (Blevins 2018; Bombien 2006). Furthermore, vowel duration has been found to be a correlate of voicing contrasts in many languages, with vowels preceding voiced phonemes having a longer duration than those preceding voiceless phonemes (Dinnsen and Charles-Luce 1984; Port et al. 1981). Thus, the prediction is that the voiced portion (not including the SG portion) of Burmese voiceless sonorants will have a shorter duration compared to their voiced counterparts, and the adjacent vowels may be impacted too.

1.3.2 H1–H2 (dB)

H1–H2 is a spectral measure that represents the difference in amplitude between the first harmonic and the second harmonic. Corrected H1*–H2* was used for measurements taken over vowels, given that it corrects for effects of formants and bandwidths, while uncorrected H1–H2 was used for measurements taken over the voiced portions of the sonorant consonant, following work by Garellek et al. (2016) and Chirkova et al. (2021). H1–H2 has been used to study non-modal voice quality and phonation types, and specifically contrasts between creaky and breathy voice (Garellek 2019). Breathy phonation has been found to correlate with a higher H1–H2, while lower H1–H2 has been found to correlate with creaky phonation (Berkson 2019; Garellek 2012, 2019, 2020). Given evidence from languages like Icelandic, where voiceless sonorants have a higher H1–H2 (Bombien 2006), signifying a more breathy than creaky production, it might be predicted that Burmese voiceless sonorants will also have a higher H1–H2.
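
For concreteness, the uncorrected measure can be illustrated with a short R sketch. This is an illustration only: the actual measurements were taken with VoiceSauce, and the function below, its name, and the assumption that F0 is already known are not part of the original analysis.

    # Minimal sketch of uncorrected H1-H2: the dB difference between the
    # spectral amplitudes at the first and second harmonics of a voiced frame.
    h1_h2 <- function(x, fs, f0, tol = 10) {
      n <- length(x)
      w <- 0.5 - 0.5 * cos(2 * pi * (0:(n - 1)) / (n - 1))   # Hann window
      amp <- 20 * log10(abs(fft(x * w))[1:(n %/% 2)])        # dB magnitude spectrum
      f <- (0:(n %/% 2 - 1)) * fs / n                        # bin frequencies (Hz)
      peak <- function(target) max(amp[abs(f - target) <= tol])
      peak(f0) - peak(2 * f0)                                # H1 - H2 in dB
    }

    # Toy check: a 200 Hz tone whose second harmonic is 20 dB weaker.
    fs <- 44100; t <- (0:4409) / fs
    x <- sin(2 * pi * 200 * t) + 0.1 * sin(2 * pi * 400 * t)
    h1_h2(x, fs, f0 = 200)   # approximately 20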

1.3.3 SoE

Strength of excitation (SoE) is a measure of glottal excitation, reflecting the amplitude of voicing such that higher SoE correlates with more voicing, regardless of noise (Garellek 2020; Mittal et al. 2014). Put another way, SoE provides information about the amplitude of voicing, while controlling for amplitude of non-speech noise in the signal. Thus, one might expect SoE to be lower for voiceless sonorants compared to voiced sonorants.

1.3.4 HNR (dB)

Harmonics-to-noise ratio (HNR) measures are correlated with increased noise associated with SG gestures and non-modal phonation, such that sounds with more aspiration noise have a lower HNR and modal sounds have a higher HNR (Garellek 2012, 2020). Likewise, breathy and creaky sounds tend to have a lower HNR compared to modal sounds (Garellek et al. 2016). HNR can be measured at different frequency bands, but given that HNR < 3,500 Hz has been found to help differentiate voice quality in previous studies, this is the range adopted for the present study (Garellek 2020). Voiceless sonorants will likely have a lower HNR than their voiced counterparts, given their tendency towards non-modal voicing (Blevins 2018; Bombien 2006; Jessen and Pétursson 1998).

1.3.5 F0 (Hz)

Fundamental frequency, or F0, has been found to correlate with voicing cross-linguistically, with F0 being lower after voiced than voiceless phonemes (Kingston 2011). In line with this fact, work on Burmese has found that F0 is lower for voiced than voiceless sonorants (Maddieson 1984). Furthermore, F0 also seems to interact with aspiration, which results from a SG gesture. In languages like Korean and Madurese, which have three-way stop systems, voiced stops have the lowest F0, followed by voiceless unaspirated stops, followed by voiceless aspirated stops, which have the highest F0 (Bang et al. 2018; Misnadin et al. 2015). However, it is worth noting that the raised F0 observed with voiceless segments is not exclusively due to effects of aspiration. As Ladd and Schmid (2018) explain, even though /p/ in the word “spam” and /b/ in absolute initial position (as in a word like “bam”) are both voiceless unaspirated in some varieties of American English, the /p/ is still followed by a vowel with a higher F0. This suggests that F0 can be impacted both by phonological voicing category and by aspiration, but in either case, it can be predicted that Burmese voiceless sonorants will have a higher F0 than their voiced counterparts.[4] For the present study, F0 is calculated using STRAIGHT (Kawahara et al. 1999), following Garellek (2020).

1.4 Predictions

Given the information laid out so far, predictions can be made for the stated research questions. With respect to the first question (what acoustic correlates distinguish voiced and voiceless sonorants?), acoustic correlates were chosen that are likely to capture the most acoustic information on voiceless sonorants based on previous accounts: duration, H1–H2, SoE, HNR, and F0. The correlates of these measures and the predictions for how they will correspond to voiceless sonorants compared to voiced sonorants are laid out in Table 1. Measurements will be taken both on the sonorant itself, and on the preceding and following vowels. This will allow for a deeper understanding of where correlates of voiceless sonorants can be realized. Given findings from Blevins (2018) and Berkson (2019), it is likely that some correlates of voicing category will be realized on the sonorant and the vowel following the sonorant. It can also be anticipated that a period of voiceless turbulent airflow will generally occur immediately preceding the voiceless sonorant, in line with previous descriptions of Burmese, e.g. Dantsuji (1984), who noted a period of voiceless nasal frication before voiceless nasals, and in line with descriptions of voiceless sonorants in other languages (Bell et al. 2021; Blevins 2018; Bombien 2006). If Burmese voiceless sonorants are indeed better thought of as having a SG gesture, then we might expect both correlates related to the SG gesture (shorter duration, lower HNR) and voicelessness (lower SoE, higher F0) to appear.

Table 1:

Acoustic measures and predictions for how they will pattern for voiceless compared to voiced sonorants.

Acoustic measure Correlate Prediction for voiceless sonorants
Duration Length of segment Lower
H1–H2 Breathiness/Creak Higher
SoE Degree of voicing Lower
HNR Noise Lower
F0 Pitch Higher

To answer the second question (are there multiple realizations of voiceless sonorants and what acoustic correlates distinguish them?), the first task is to establish the possible realizations of voiceless sonorants, particularly with respect to the SG gesture. If the SG gesture is always present, then it can be concluded that the period of frication is normative in Burmese, with no further investigations necessary. If, however, it is non-normative, then a deeper investigation of the cues present for voiceless sonorants with the SG gesture and voiceless sonorants without the SG gesture is called for. To preview the results, data suggests that there are two realizations of voiceless sonorants in Burmese. Voiceless sonorants are realized with a SG gesture 78 % of the time, but 22 % of the time no SG gesture is present.

There are a few possible explanations for what could be happening when the SG gesture is absent from voiceless sonorants. First, it is possible that there is gestural shifting, such that the SG gesture is timed completely concurrently with the sonorant or an adjacent vowel, and thus it cannot be segmented as distinct from the voiced portion of the sonorant or an adjacent vowel. If the SG gesture is shifting, we would expect to see correlates of SG, such as a lower HNR, on one of the adjacent segments. A second option is that the phonologically voiceless sonorants without the SG gesture are mispronunciations, and are articulated exactly the same as voiced sonorants. If this is the case, we would not expect significant differences in any correlates for voiced versus voiceless non-SG sonorants. Third, it is possible that the SG gesture is absent, but other correlates of voicelessness remain. This would mean correlates associated with the SG gesture (such as a lower HNR or a difference in duration) would not be significantly different from voiced sonorants, but those associated with voicelessness (such as higher F0 and lower SoE) would be different.

2 Methods

2.1 Stimuli

The acoustics of Burmese voiced and voiceless sonorants were examined in a production study with speakers of Burmese. Target words were all monosyllabic minimal pairs consisting of every combination of 8 consonants ([w, w̥, l, l̥, ŋ, ŋ̥, m, m̥]), 2 tones (high and low), and one nucleus vowel ([a]).[5] In order to ensure that target words were as phonologically controlled as possible, the tone, consonant onset, and vowel were balanced, but words were not balanced based on syntactic category. Target words can be found in Table 2. Each target word appeared in 9 different carrier phrases, 3 for each phrasal position (phrase-initial, phrase-medial, and phrase-final), but in order to limit differences caused by prosodic effects, and because prosodic position was not the focus of the current study, only target words in phrase-medial position were considered for the present analysis. Phrases were controlled so that each phrase had the same number of syllables, as shown in Table 3, and target sonorants were preceded and followed by [a]. It is worth noting that this resulted in some phrases that were strange or syntactically marked, but participants had no problem reading the phrases fluently due to the transparency of Burmese orthography. Furthermore, all stimuli were checked by a speaker of Burmese and there is no evidence that the content of the phrases impacted the pronunciation of target words or adjacent segments. Altogether, these combinations resulted in 48 target phrases, such as [ŋa la hãʊ̃] ‘my old mule.’ Once all the stimuli were compiled, three lists were created with the stimuli randomized in different orders. Participants were randomly assigned to one of the three lists, for a total of three participants per list.
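
The design can be summarized with a small R sketch. This is illustrative only: the onset labels are ASCII placeholders for the Burmese-orthography stimuli actually shown to participants, and the original lists were not necessarily generated this way.

    # Cross 8 onsets with 2 tones over the nucleus [a] to get the 16 target
    # words, then build three randomized presentation lists.
    onsets <- c("w", "wh", "l", "lh", "ng", "ngh", "m", "mh")  # placeholder labels
    tones <- c("high", "low")
    words <- expand.grid(onset = onsets, tone = tones, vowel = "a",
                         stringsAsFactors = FALSE)             # 16 rows
    set.seed(1)                                                # reproducible orders
    lists <- lapply(1:3, function(i) words[sample(nrow(words)), ])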

Table 2:

Target words differing in terms of consonant, voicing, and tone.

Table 3:

Carrier phrases used to present stimuli.

2.2 Participants

Nine speakers of Burmese between the ages of 19 and 42 were recorded, and all were living in New York. Table 4 shows demographic information about participants, including their age at the time of the 2021 study, their gender, and the number of years they had been living in the United States. All participants reported normal speech and hearing as well as literacy in Burmese. Participants were contacted primarily through personal channels and Facebook groups and were reimbursed $25 for participating in the hour-long study. No participants were excluded, although occasionally participants skipped a phrase, resulting in uneven N values between participants.

Table 4:

Participant demographics.

ID Gender Age Years in U.S.
pt1 Woman 30 6
pt2 Woman 39 20
pt3 Woman 21 4
pt4 Man 30 12
pt5 Man 42 2
pt6 Woman 19 3
pt7 Man 22 4
pt8 Woman 24 6
pt9 Woman 29 1

2.3 Procedure

Phrases were presented to participants in Burmese orthography one at a time in a slideshow. Participants were instructed to read the phrase to themselves first, and then out loud once they had familiarized themselves with the phrase. They were told that they were not reading complete sentences and that they should read the phrases as if they were responding to a friend, in order to encourage a more casual speaking style. Participants were given a break after every 10 phrases. All recordings were taken in a sound-attenuated booth using a Zoom H4n Pro recorder and a head-mounted mic. The recordings were made at a bit depth of 16 bits and a sampling rate of 44,100 Hz.

2.4 Labeling and measurements

After collecting the data, target words were labeled in Praat (Boersma and Weenink 2015). These words were further segmented for the preceding vowel, sonorant (further subdivided into SG portion, where relevant, and voiced portion), and following vowel, as shown in Figure 3. The SG portion (hereafter “SG”) was segmented according to the beginning and end of clear frication in the spectrogram and waveform. The voiced portion (hereafter “sonorant”) was segmented separately in order to get reliable acoustic measures, and segmentation was done on the basis of the beginning and end of the second formant. This is in line with measurements in Bhaskararao and Ladefoged (1991), where the voiceless and voiced portions are measured separately. Finally, vowels were segmented based on the beginning and end of the second formant in order to reliably attain voice quality measures. The categories are referred to hereafter as “preceding vowel,” “SG,” “sonorant,” and “following vowel.”

Figure 3: Segmentation of the phrase [ŋa l̥a wãĩ] ‘my round sword’ with SG.

The averaged acoustic measures for each of these labeled portions were obtained using VoiceSauce (Shue 2010). The measures considered in the analysis were duration, H1–H2, SoE, HNR < 3,500 Hz, and F0, in line with recent work by Berkson (2019) and Garellek (2020). Averaged data was considered rather than time-course data in order to establish a baseline acoustic description of voiceless sonorants and to allow for linear discriminant analysis classification.

Linear mixed-effects regressions were carried out in R using the lme4 package to determine whether voicing category predicts differences in these acoustic correlates (Baayen et al. 2008). Linear mixed-effects models consider how fixed and random effects jointly predict a given variable. This method has been regularly used in the phonetics literature to examine topics such as Marathi breathy sonorants (Berkson 2019) and Spanish vowel nasalization (Bongiovanni 2021). The two tones and four sonorants were collapsed in these models to allow for model convergence and interpretable results. Data was collected from multiple tones and consonant onsets in order to gather a more representative sample of Burmese sonorants without the experiment becoming too onerous for participants.
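
The model structure can be sketched in R as follows. This is a minimal sketch rather than the original analysis script; the data frame son and its column names are assumptions.

    # One model per acoustic correlate and measurement region, e.g. duration
    # measured over the sonorant, with by-participant and by-word intercepts.
    library(lme4)
    m_dur <- lmer(duration ~ voicing + (1 | participant) + (1 | word),
                  data = son)
    summary(m_dur)   # voicing coefficient: estimated voiceless-vs-voiced shift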

A linear discriminant analysis (LDA) was also run in R using the MASS package lda() function to determine the acoustic correlates that were best able to predict the voicing status of the sonorant (Venables and Ripley 2002). LDA is a statistical classification tool that, given a set of features, finds a linear combination of those features that successfully distinguishes a set of predefined variables. This methodology has been increasingly used in the phonetics literature, with researchers using it to study topics such as Yoloxóchitl Mixtec fricatives (DiCanio et al. 2020) and the complex phonation system of !Xóõ (Garellek 2020). This methodology is well-suited to address Burmese sonorants, as the key correlates of voiceless sonorants are not yet clear. It seems likely that Burmese voiced and voiceless sonorants differ in terms of a variety of acoustic correlates, which have different weights in terms of their importance in classifying the sound.
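
A sketch of this classification step in R is given below; it is illustrative, assuming a data frame acoustics with one row per token, a voicing label, and the 16 averaged correlates as its only other columns. The 80/20 split mirrors the procedure reported in Section 3.2.

    # Train an LDA on 80 % of tokens and test on the held-out 20 %.
    library(MASS)
    set.seed(1)
    train <- sample(nrow(acoustics), size = round(0.8 * nrow(acoustics)))
    fit <- lda(voicing ~ ., data = acoustics[train, ])
    pred <- predict(fit, acoustics[-train, ])$class
    mean(pred == acoustics$voicing[-train])   # classification accuracy
    fit$scaling                               # LD1 coefficients (cf. Table 8)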

3 Results

3.1 Acoustics of Burmese sonorants

Spectrograms of Burmese sonorants, shown in Figure 4, reveal one typical realization of voiced sonorants and two common realizations of voiceless sonorants. An inspection of the top left spectrogram reveals that Burmese voiced sonorants have formant patterns typical of voiced sonorants cross-linguistically. Voiceless sonorants, on the other hand, are generally produced with a two-part structure: a SG gesture followed by the voiced portion of the sonorant, as shown in the top right spectrogram. This results in a realization like [hla]. However, in 22 % of voiceless tokens, there was no evidence for a SG gesture, as shown in the bottom spectrogram. The proportion of voiceless sonorants with a SG gesture to voiceless sonorants without a SG gesture is largely consistent between participants.

Figure 4: /la/ ‘mule’ (top left), /l̥a/ ‘sword’ (top right) with SG, and /l̥a/ ‘sword’ (bottom) without a SG gesture. All are spoken by the same participant.

The overall averages of each acoustic correlate for voiced and voiceless sonorants and the overall averages of each acoustic correlate for voiceless SG and voiceless non-SG sonorants are reported in Table 5 and Table 6, respectively. There is a high degree of consistency within and across speakers for how voiced sonorants are realized, and by-participant details are reported in the Appendix (Table 12). The average duration of the SG period (when present) was 100.74 ms.

Table 5:

Average value of each correlate for voiced and voiceless sonorants.

Acoustic correlate Preceding vowel Sonorant Following vowel
Voiced Voiceless Voiced Voiceless Voiced Voiceless
(N = 210) (N = 209) (N = 215) (N = 212) (N = 213) (N = 210)
Duration (ms) 232 237 121 75.5 253 246
H1–H2 (dB) 9.37 8.71 8.71 9.37 3.82 4.45
SoE 0.0074 0.0070 0.0056 0.0052 0.0060 0.0050
HNR (dB) 40.1 40.1 48.5 46.1 43.5 43.6
F0 (Hz) 205 205 171 192 164 172
Table 6:

Average value of each correlate for voiceless SG and voiceless non-SG sonorants.

Acoustic correlate Preceding vowel Sonorant Following vowel
SG Non-SG SG Non-SG SG Non-SG
(N = 162) (N = 47) (N = 164) (N = 48) (N = 162) (N = 48)
Duration (ms) 239 230 61.3 124 248 239
H1–H2 (dB) 5.57 4.74 9.57 8.70 4.62 3.87
SoE 0.0067 0.0079 0.0054 0.0047 0.0050 0.0052
HNR (dB) 39.3 42.5 45.1 49.6 43.4 44.1
F0 (Hz) 205 205 195 183 173 168

3.2 Distinguishing voiced and voiceless sonorants

Linear mixed-effects models were run with the goal of determining whether voicing is predictive of differences in the acoustic correlates. Models were run on data from the preceding vowel, the sonorant, and the following vowel. These models took the acoustic correlate (duration, H1–H2, SoE, HNR, and F0) as the dependent variable, voicing (voiced or voiceless) as a fixed effect, and participant and word as random intercepts. The baseline for all the models was a voiced sonorant. The output of the models is summarized in Table 7. The coefficients (β and z) reflect how much change is indicated when the sonorant is voiceless compared to voiced. The results suggest that both the sonorant and following vowel have acoustic correlates that distinguish voiced and voiceless sonorants. Measures for different voiceless sonorant realizations can be seen in Table 6.

Table 7:

Results of the linear mixed-effects models for each acoustic correlate using data from the preceding vowel, sonorant, and following vowel.

Acoustic correlate Preceding vowel Sonorant Following vowel
β z p value β z p value β z p value
Duration 4.99 1.11 0.29 −45.00 −6.82 <0.01 −6.37 −0.60 0.56
H1–H2 −0.21 −0.49 0.63 0.50 0.32 0.75 0.48 1.19 0.23
SoE −0.0004 −1.40 0.18 −0.0004 −0.81 0.43 −0.0009 −2.53 0.02
HNR 0.14 0.30 0.77 −2.31 −2.10 0.06 0.32 0.22 0.83
F0 0.46 0.16 0.87 21.18 6.30 <0.01 8.59 1.41 0.18
  1. Bolded cells indicate statistical significance at the p < 0.05 level.

First, voiced sonorants (121 ms) are significantly longer in duration compared to the voiced portion of voiceless sonorants (75.5 ms). Without the SG portion, the voiceless sonorants are on average 45.5 ms shorter than the voiced sonorants. However, if the average 100.74 ms SG gesture were included, the voiceless sonorants would be on average 55.24 ms longer than the voiced sonorants. In fact, when duration includes both the SG portion and the voiced sonorant portion, there is a significant difference between voiced and voiceless sonorants such that voiceless sonorants are longer (β = 40.21, z = 5.96, p < 0.01). This is in line with previous work on Burmese.

F0 as measured over the sonorant also shows significant differences, such that the voiced portion of voiceless sonorants (192 Hz) has a significantly higher F0 than phonemically voiced sonorants (171 Hz). This is consistent with previous literature that has found a cross-linguistic correlation between voiceless phonemes and raised F0 (Kingston 2011).

Finally, SoE as measured over the vowel following the sonorant was found to differ significantly depending on whether the sonorant was voiced or voiceless. Vowels following voiced sonorants (SoE = 0.006) had significantly higher SoE than vowels following voiceless sonorants (SoE = 0.005), which suggests that voiced sonorants are associated with a greater degree of voicing. Interestingly, this difference is only significant when measured on the following vowel, not on the sonorant. This could be due to the sonorant being a poor carrier of this acoustic correlate, as has been discussed in previous literature on voiceless sonorants (Blevins 2018).

A linear discriminant analysis (LDA) was then carried out with the goal of determining how accurately a model can distinguish between the two phonological categories and what the most important cues were for distinguishing them. The LDA was given information about average duration, H1–H2, SoE, HNR, and F0 for the preceding vowel, sonorant, and following vowel of each token. It was also provided with information about the duration of the SG gesture, which was 0 if there was no SG gesture. This resulted in the LDA having information about whether a sonorant was voiced or voiceless (based on the production as intended), and a total of 16 acoustic correlates. After the model was trained on 80 % of the data, it was tested on 20 % of the data to determine how accurately it was able to distinguish between voicing categories. It had a success rate of 86.7 % in accurately categorizing a given token as either voiced or voiceless, correctly categorizing voiced tokens 99.4 % of the time, and correctly categorizing voiceless tokens 73.6 % of the time. Within the voiceless sonorants, voiceless sonorants with a SG gesture were correctly categorized 93.7 % of the time, while those without a SG gesture were correctly categorized only 5.4 % of the time. All coefficients of the linear discriminant are reported in Table 8, with negative values reflecting a correlate that is more predictive of voiced sonorants, and positive values reflecting a correlate that is more predictive of voiceless sonorants. Taken together, these results suggest that the correlates examined in the present study are adequate for accurately and consistently distinguishing voiced and voiceless sonorants. Of these correlates, the duration of the SG period is the most important. However, other correlates may also play a role; most notably, F0 of the sonorant, the duration of the sonorant, and SoE as measured on the following vowel, which were the same cues that were found to statistically differentiate voiced and voiceless sonorants in the linear mixed-effects models. It is also worth noting that acoustic correlates measured on the preceding vowel were consistently found to be among the least useful to the model, suggesting that the preceding vowel does not hold much acoustic information for sonorant voicing.

Table 8:

Coefficients of the linear discriminant for each acoustic correlate in descending order of importance.

Acoustic correlate LD1 coefficient
Duration of SG gesture 1.05
F0 of sonorant 0.38
Duration of sonorant −0.21
SoE of following vowel −0.18
Duration of following vowel −0.15
HNR of sonorant −0.14
HNR of following vowel 0.08
F0 of following vowel 0.08
HNR of preceding vowel 0.08
F0 of preceding vowel −0.08
SoE of sonorant 0.07
H1–H2 of sonorant −0.02
H1*–H2* of following vowel 0.02
SoE of preceding vowel 0.02
H1*–H2* of preceding vowel 0.02
Duration of preceding vowel 0.01

3.3 Variable realizations of voiceless sonorants

Having understood the acoustic correlates that distinguish Burmese voiced and voiceless sonorants, the next step is to carry out a closer analysis of voiceless sonorants. As demonstrated in Figure 4, voiceless sonorants do not surface in only one way: they surface with a SG gesture 78 % of the time but without one 22 % of the time, with remarkable between-speaker consistency and no significant difference based on gender (a SG gesture was present 75 % of the time for women and 82 % of the time for men). As established in Section 1.4, there are a few possibilities for the realization of voiceless non-SG segments: (1) the SG gesture is mistimed, and shifts to an adjacent segment, (2) there is a mispronunciation in which a voiced sonorant is produced, or (3) the SG gesture is absent but other laryngeal correlates of voicelessness remain. The results of linear mixed-effects regressions are reported below with the goal of providing clarity on this issue.

Linear mixed-effects models were run for each acoustic correlate with a three-way divide for voicing (voiced, voiceless SG, and voiceless non-SG) as a fixed effect. Once again, these models used data from the sonorant and adjacent vowels, and took the acoustic correlate (duration, H1–H2, SoE, HNR, and F0) as the dependent variable, and participant and word as random intercepts. A post-hoc test considering the differences between the three voicing categories was carried out with the emmeans package in R using the Holm correction for multiple comparisons. The post-hoc results are reported in the Appendix (Table 13) and support the significant results found in the linear mixed-effects regressions. Average values of acoustic correlates for voiceless SG and voiceless non-SG sonorants are presented in Table 6.
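
In R, this amounts to something like the following sketch (illustrative, not the original script; voicing3 is an assumed column coding the three-way distinction in the data frame son):

    # Three-level voicing factor; re-leveling changes the model baseline
    # (as in Tables 9-11), while emmeans yields all pairwise contrasts
    # with the Holm correction.
    library(lme4)
    library(emmeans)
    son$voicing3 <- relevel(factor(son$voicing3), ref = "voiced")
    m_f0 <- lmer(F0 ~ voicing3 + (1 | participant) + (1 | word), data = son)
    emmeans(m_f0, pairwise ~ voicing3, adjust = "holm")$contrasts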

The output for voiceless SG sonorants is summarized in Table 9, where voiced sonorants were the baseline. Duration, HNR, and F0 as measured over the sonorant are significantly different between the two categories, as is SoE as measured over the vowel following the sonorant.

Table 9:

Results of the linear mixed-effects models for voiceless SG sonorants with voiced sonorants as the baseline.

Acoustic correlate Preceding vowel Sonorant Following vowel
β z p value β z p value β z p value
Duration 7.26 1.59 0.13 −58.21 −10.24 <0.01 −5.87 −0.55 0.59
H1–H2 −0.13 −0.29 0.77 0.58 0.37 0.71 0.61 1.42 0.16
SoE −0.0005 −1.70 0.11 −0.0002 −0.40 0.69 −0.0008 −2.33 0.03
HNR −0.13 −0.27 0.79 −2.81 −2.62 0.02 0.64 0.45 0.66
F0 0.63 0.22 0.83 24.12 7.68 <0.01 10.55 1.74 0.10
  1. Bolded cells indicate statistical significance at the p < 0.05 level.

The output for voiceless non-SG sonorants is summarized in Table 10, where voiced sonorants were the baseline. The sonorant had significant differences in F0, and the following vowel had significant differences in SoE. This suggests that while the correlates associated with SG gestures are no longer significant, correlates of voicing, such as F0 and SoE, remain relevant.

Table 10:

Results of the linear mixed-effects models for voiceless non-SG sonorants with voiced sonorants as the baseline.

Acoustic correlate Preceding vowel Sonorant Following vowel
β z p value β z p value β z p value
Duration −2.71 −0.30 0.69 0.53 0.07 0.94 −8.10 −0.55 0.59
H1–H2 −0.46 −0.69 0.50 0.24 0.15 0.88 0.03 0.04 0.97
SoE 0.00001 0.03 0.98 −0.0012 −1.84 0.08 −0.0011 −2.54 0.02
HNR 1.02 1.63 0.11 −0.62 −0.52 0.61 −0.80 −0.51 0.62
F0 −0.11 −0.03 0.97 11.08 2.82 <0.01 1.84 0.29 0.78
  1. Bolded cells indicate statistical significance at the p < 0.05 level.

Finally, the output comparing voiceless SG sonorants and voiceless non-SG sonorants is summarized in Table 11. Factors were re-leveled, such that voiceless SG sonorants were the baseline, allowing for a direct comparison between the two. The models show that the sonorant itself has significant differences in duration, HNR, and F0 between the two categories, and F0 as measured over the following vowel is also significantly different.

Table 11:

Results of the linear mixed-effects models for voiceless non-SG sonorants with voiceless SG sonorants as the baseline.

Acoustic correlate Preceding vowel Sonorant Following vowel
β z p value β z p value β z p value
Duration −9.98 −1.46 0.14 58.74 9.45 <0.01 −2.23 −0.28 0.78
H1–H2 −0.33 −0.48 0.63 −0.34 −0.61 0.54 −0.58 −0.84 0.40
SoE 0.0005 1.14 0.26 −0.001 −1.93 0.05 −0.0003 −0.80 0.42
HNR 1.14 2.00 0.05 2.19 3.16 <0.01 −1.44 −1.80 0.07
F0 −0.74 −0.30 0.77 −13.04 −4.11 <0.01 −8.71 −3.47 <0.01
  1. Bolded cells indicate statistical significance at the p < 0.05 level.

Duration of the sonorant is significantly different between the voiceless SG sonorants and the two other voicing categories, such that the voiced portion of voiceless SG sonorants (61.3 ms) is significantly shorter than voiceless non-SG sonorants (124 ms) and voiced sonorants (121 ms), as shown in Figure 5. This is likely due to the fact that the SG period results in a shorter voiced portion. However, even when duration includes the voiceless SG portion and the voiced portion, there is still a significant difference between voiceless SG sonorants and the other two categories, such that voiceless SG sonorants are longer than both voiced (β = −51.22, z = −8.15, p < 0.01) and voiceless non-SG (β = −49.07, z = −6.91, p < 0.01) sonorants.

Figure 5: Duration measured over the sonorant. In this figure and the following figures, lines in the box plot represent the median, dots represent the mean, and stars represent a significant difference.

HNR measured over the sonorant also differs significantly, such that the voiced portion of voiceless SG sonorants (45.1 dB) has a significantly lower HNR than either voiced sonorants (48.5 dB) or voiceless non-SG sonorants (49.6 dB), as can be seen in Figure 6. This suggests that voiceless SG sonorants are less modally voiced than either of the other voicing categories. Interestingly, this is the only result that is significant for the voiced versus voiceless SG comparison but not for the general voiced versus voiceless comparison. The lack of a significant difference in the general comparison is likely due to the fact that voiceless non-SG sonorants lack the SG gesture, and therefore the lower HNR associated with noise; because a portion of the voiceless sonorants have no SG gesture, the overall voiced versus voiceless HNR comparison is diluted.

Figure 6: HNR measured over the sonorant.

F0 measured over the sonorant was significantly different for all three voicing categories, with voiceless SG sonorants (195 Hz) having the highest F0, followed by voiceless non-SG sonorants (183 Hz), and finally voiced sonorants (171 Hz), which had the lowest F0. This is reflected in Figure 7. Both voiceless sonorant variants have a higher F0 than the voiced sonorant, in line with the cross-linguistic pattern of voicelessness being associated with a higher F0. The difference observed between the voiceless sonorants, on the other hand, closely mirrors patterns observed in other languages, where F0 is higher with SG segments than modally voiced segments (Ladd and Schmid 2018). This result helps confirm the idea that F0 is correlated with both voicing and SG, and that these two processes can be separate.

Figure 7: F0 measured over the sonorant.

Turning to the acoustic correlates on the following vowel that are significantly different between voicing categories, SoE is significantly higher for voiced sonorants (SoE = 0.0060) than either voiceless SG sonorants (SoE = 0.0050) or voiceless non-SG sonorants (SoE = 0.0052), as shown in Figure 8. Put another way, the vowel has significantly higher amplitude voicing following a phonemically voiced sonorant as compared to a phonemically voiceless sonorant. There are no significant differences in SoE between the voiceless SG and voiceless non-SG sonorants for the following vowel.

Figure 8: SoE measured over the following vowel.

Finally, there is a significant difference in F0 measured over the following vowel between voiceless SG sonorants (173 Hz) and voiceless non-SG sonorants (168 Hz), though neither differs significantly from voiced sonorants (164 Hz), as shown in Figure 9. This finding reflects the same pattern observed for F0 as measured over the sonorant for voiceless SG and voiceless non-SG sonorants, and suggests that, in line with previous work, segments with a SG gesture tend to have a higher F0 than non-SG segments, even if there is no corresponding difference in phonological voicing category (Ladd and Schmid 2018).

Figure 9: F0 measured over the following vowel.

4 Discussion

Altogether, these results paint a more comprehensive acoustic picture of Burmese voiced and voiceless sonorants. The LDA indicates that the two are differentiated primarily by a SG gesture that occurs 78 % of the time. This lends support to the claim made by Bhaskararao and Ladefoged (1991) that in languages like Burmese and Mizo the [−voice] portion of the voiceless sonorant is critical for distinguishing it from its voiced counterpart. However, even though the LDA suggests that this is the primary correlate of voicelessness, correlates present on the sonorant and following vowel are also significantly different for the two categories, likely providing more acoustic information for phoneme identification, in line with predictions made in Section 1.4. Compared to voiced sonorants, duration is shorter for the voiced portion of voiceless sonorants, F0 is higher for voiceless sonorants, and SoE as measured over the following vowel is lower for voiceless sonorants. While H1–H2 and HNR were also expected to show significant differences between the two, no significant differences were found.

The present results also suggest that sonorants do not carry all of the information about voicelessness segment-internally. SoE, a primary correlate of voicing, is never significantly different between voiced and voiceless sonorants when considering data from the sonorant itself. However, if data from the following vowel is used, the difference is significant. This suggests that SoE is still a useful correlate of voicing, but since it may not be easily produced or perceived on the sonorant, it is only significant on the following vowel. Other significant correlates, however, are found on the sonorant itself. Duration and F0 as measured on the sonorant are useful in distinguishing voiced and voiceless sonorants, suggesting that voiceless sonorants may be recognizable from both segment-internal correlates as well as correlates on adjacent segments.

Turning to the two possible realizations of the voiceless sonorants, the absence of the SG gesture does not seem to reflect different gestural timings or a mispronunciation. Instead, it seems that the difference between voiceless SG sonorants and voiceless non-SG sonorants is a difference of articulation, in that the SG gesture is simply not present. In voiceless non-SG sonorants, there is no evidence of a SG gesture in the spectrogram, nor is there evidence of a SG gesture (as measured by HNR) on the sonorant or adjacent vowels. However, the voiceless non-SG sonorants are not produced exactly like voiced sonorants either. In line with the third explanation laid out in Section 1.4, despite the fact that correlates of SG are gone, correlates of voicelessness (including a significantly higher F0 on the sonorant and a significantly lower SoE on the following vowel) remain. This result has implications for our understanding of voiceless sonorants, as the turbulent airflow that has been described as occurring with voiceless sonorants cross-linguistically seems to be an optional correlate rather than an intrinsically linked feature that arises due to articulatory constraints. Furthermore, it seems that non-normative SG gestures occur as a correlate of contrastive sonorant voicing, similar to what has been described for contrastive stop voicing, contrastive fricative voicing, and contrastive stop gemination (DiCanio 2010; Gordeeva and Scobbie 2010; Helgason 2002; Stevens 2011).

Finally, these findings bear on a longstanding issue in the perception literature: how multiple acoustic correlates can give rise to a single coherent speech percept. Traditionally, three explanations have been put forward: (1) acoustic properties covary reliably, (2) acoustic properties result from a single gesture, or (3) acoustic properties produce similar auditory effects (Kingston et al. 2008). The present results support the idea that the historical connection between SG gestures and voiceless sonorants could arise because of option 1 or option 3. Option 2 does not provide a satisfying explanation, as the perception of voicelessness arises due both to the SG gesture and to other correlates of voicelessness, but the SG gesture can be absent. This suggests that it is not a single gesture that produces both the SG gesture and other voiceless correlates. Put another way, these two elements of voiceless sonorants are not the result of a single gesture; they can be produced separately. A perception experiment could help substantiate this analysis of voiceless sonorants and contribute to the growing body of literature on differences in acoustic cue weights in production and perception (Schertz and Clare 2020).

We can now return to the research questions laid out at the beginning of the paper:

  1. What acoustic correlates distinguish voiced and voiceless sonorants?

    1. A SG gesture is likely the primary acoustic correlate that distinguishes voiced and voiceless sonorants.

    2. Duration and F0 as measured over the sonorant, and SoE as measured over the following vowel are also significantly different for voiced and voiceless sonorants.

  2. Are there multiple realizations of voiceless sonorants and what correlates distinguish them?

    1. 78 % of the time, voiceless sonorants are realized with a SG period, but 22 % of the time, there is no SG period.

    2. In cases where voiceless sonorants do not have a SG period, this does not reflect a gestural shift or a completely voiced pronunciation; instead, the SG gesture is absent altogether, resulting in an articulation in which the correlates associated with SG (duration, HNR) are gone, but the correlates associated with voicelessness (F0, SoE) remain.

5 Conclusions

This paper provides an acoustic analysis of Burmese voiced and voiceless sonorants, demonstrating that the two differ primarily in terms of a SG gesture present with most of the voiceless sonorants. However, the current findings show that this is not the only correlate of voicelessness. Duration and F0 as measured over the sonorant and SoE as measured over the following vowel significantly differ depending on whether the sonorant is voiced or voiceless. This also suggests that voiceless sonorants carry relevant acoustic information both segment-internally and on adjacent segments. Furthermore, Burmese voiceless sonorants are not always realized in the same way: about 22 % of the time they are realized without a SG gesture. A closer analysis of the voiceless non-SG sonorants reveals that the acoustic correlates associated with SG gestures (duration, HNR) are absent on these segments, but those associated with voicelessness (F0, SoE) are still present. This suggests that the absence of a SG gesture is neither a production error, in which a voiced sonorant is produced, nor a mistiming of the SG gesture, in which the SG portion is present but shifted onto one of the adjacent segments. Instead, it seems that the SG gesture is absent altogether but that other correlates of voicelessness remain. These findings have implications for our understanding of voiceless sonorants and for the description of Burmese voiceless sonorants, which are perhaps better described as having non-normative SG gestures.


Corresponding author: Chiara Repetti-Ludlow, NYU Department of Linguistics, 10 Washington Place, New York, NY 10003, USA, E-mail:

Funding source: New York University Linguistics Department Funds

Award Identifier / Grant number: NA

Acknowledgements

I would like to thank Lisa Davidson, Gillian Gallagher, Laurel MacKenzie, Thinzar Kyaw, Sithu Maung, Juliet Stanton, Kelly Berkson, Marc Garellek, and an anonymous reviewer for helpful feedback on this project. I am also extremely grateful to the individuals who participated in this study.

  1. Research funding: NYU’s Linguistics Department provided a grant of $250 from department funds to carry out this research.

  2. Author contributions: The author has accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Competing interests: The author has no conflicts of interest to declare.

  4. Informed consent: Informed consent was obtained from all individuals who participated in this study in accordance with the NYU IRB (IRB 2016-713).

  5. Ethical approval: Research was carried out under FY-IRB 2016-713.

Appendix
Table 12:

Average value of each correlate (measured over the preceding vowel, sonorant, and following vowel) for each participant. Shaded cells represent a speaker who is a woman while unshaded cells represent a speaker who is a man.

Preceding vowel Duration (ms) H1–H2 (dB) SoE HNR (dB) F0 (Hz)
Voiced Vless Voiced Vless Voiced Vless Voiced Vless Voiced Vless
Pt 1 246 247 4.85 5.67 0.0086 0.0078 42.9 43.1 171 178
Pt 2 297 319 11.8 8.73 0.0069 0.0064 42.7 42.5 243 234
Pt 3 239 237 3.78 2.19 0.0093 0.0080 50.9 51.6 223 216
Pt 4 254 265 5.16 5.09 0.0043 0.0044 36.3 36.4 187 192
Pt 5 197 207 3.53 4.16 0.0035 0.0036 29.6 29.9 188 184
Pt 6 192 188 6.00 6.57 0.0086 0.0094 42.0 42.4 240 249
Pt 7 190 188 7.77 7.73 0.0047 0.0044 36.5 36.9 144 147
Pt 8 252 272 10.6 12.4 0.0125 0.0113 41.6 42.0 230 231
Pt 9 210 210 8.97 9.34 0.0078 0.0078 36.4 34.8 227 228
Sonorant Duration (ms) H1–H2 (dB) SoE HNR (dB) F0 (Hz)
Voiced Vless Voiced Vless Voiced Vless Voiced Vless Voiced Vless
Pt 1 139 86.5 9.01 7.40 0.0053 0.0054 53.2 51.3 164 178
Pt 2 147 76.8 10.8 10.7 0.0063 0.0088 52.4 47.2 176 212
Pt 3 175 77.8 11.8 11.0 0.0075 0.0038 55.5 47.7 195 201
Pt 4 79.2 58.5 7.07 7.53 0.0034 0.0024 43.7 40.5 156 184
Pt 5 126 52.5 3.00 6.88 0.0035 0.0019 39.3 40.7 149 188
Pt 6 126 123 12.2 12.4 0.0120 0.0124 53.3 54.2 202 214
Pt 7 107 107 6.39 5.35 0.0038 0.0030 43.6 44.6 111 123
Pt 8 105 83.3 10.5 11.1 0.0042 0.0036 50.6 46.5 191 211
Pt 9 105 46.9 9.21 11.2 0.0060 0.0063 47.1 41.5 182 207
Following vowel Duration (ms) H1–H2 (dB) SoE HNR (dB) F0 (Hz)
Voiced Vless Voiced Vless Voiced Vless Voiced Vless Voiced Vless
Pt 1 229 254 4.53 6.07 0.0075 0.0061 50.1 50.3 150 165
Pt 2 304 272 9.64 8.22 0.0042 0.0038 48.7 44.4 184 180
Pt 3 238 237 8.74 8.93 0.0043 0.0027 47.9 44.6 170 171
Pt 4 263 274 4.75 5.03 0.0036 0.0031 38.4 41.4 146 159
Pt 5 260 239 3.29 3.25 0.0041 0.0022 36.0 35.8 145 159
Pt 6 198 232 7.34 5.93 0.0084 0.0106 45.6 48.3 167 228
Pt 7 220 224 6.08 7.14 0.0040 0.0031 40.0 42.9 103 114
Pt 8 246 179 7.97 7.90 0.0047 0.0036 37.5 41.5 184 189
Pt 9 239 229 6.56 6.84 0.0078 0.0063 37.9 39.7 175 184
Table 13:

Post-hoc tests comparing voicing types for each acoustic correlate.

Preceding vowel       Voiced versus SG          Voiced versus non-SG        SG versus non-SG
                      β        z      p value   β         z      p value    β        z      p value
Duration (ms)         −7.26    −1.59  0.39      2.71      0.40   0.69       9.98     1.44   0.39
H1*–H2* (dB)          0.13     0.29   1.00      0.46      0.68   1.00       0.33     0.48   1.00
SoE                   0.0004   1.69   0.33      −0.00001  −0.03  0.98       −0.0005  −1.12  0.53
HNR (dB)              0.13     0.27   0.79      −1.02     −1.62  0.23       −1.15    −1.98  0.23
F0 (Hz)               −0.63    −0.22  1.00      0.12      0.03   1.00       0.74     0.30   1.00

Sonorant              Voiced versus SG          Voiced versus non-SG        SG versus non-SG
                      β        z      p value   β         z      p value    β        z      p value
Duration (ms)         58.21    10.24  <0.01     −0.53     −0.07  0.94       −58.74   −9.38  <0.01
H1–H2 (dB)            −0.58    −0.37  1.00      −0.24     −0.15  1.00       0.34     0.61   1.00
SoE                   0.0002   0.40   0.69      0.0012    1.84   0.17       0.0010   1.92   0.17
HNR (dB)              2.81     2.62   0.04      0.62      0.52   0.61       −2.19    −3.15  <0.01
F0 (Hz)               −24.10   −7.67  <0.01     −11.10    −2.82  <0.01      13.0     4.09   <0.01

Following vowel       Voiced versus SG          Voiced versus non-SG        SG versus non-SG
                      β        z      p value   β         z      p value    β        z      p value
Duration (ms)         5.87     0.55   1.00      8.10      0.66   1.00       2.23     0.27   1.00
H1*–H2* (dB)          −0.61    −1.42  0.52      −0.03     −0.04  1.00       0.58     0.83   0.82
SoE                   0.0008   2.31   0.07a     0.0011    2.54   0.04       0.0003   0.80   0.43
HNR (dB)              −0.64    −0.45  1.00      0.80      0.51   1.00       1.44     1.80   0.22
F0 (Hz)               −10.55   −1.74  0.21      −1.84     −0.29  0.78       8.71     3.47   <0.01
  a. While this value is not significant at the level of the post-hoc tests, it is significant in the linear mixed-effects models. To determine whether SoE measured over the following vowel differs significantly between voiced and voiceless SG sonorants, mixed-effects logistic regressions were run using the lme4 package in R, with voicing (either voiced or voiceless SG) as the response variable and the acoustic correlates as fixed effects. One model was run with SoE included as a fixed effect and another without it; an ANOVA comparing the two found that the model with SoE was significantly more informative, so the result is reported as significant in the paper (a sketch of this comparison is given below). Bolded cells indicate statistical significance at the p < 0.05 level.
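
For concreteness, the following is a minimal R sketch of the model comparison described in note (a), using the lme4 package named there. The data frame burmese and the column names (voicing, soe_v2, dur, f0, hnr, h1h2, speaker, word), as well as the random-effects structure, are hypothetical placeholders rather than the author's actual script.

    # Minimal sketch, assuming a data frame `burmese` with one row per token.
    # Column names and random effects are illustrative, not the author's own.
    library(lme4)

    # Full model: voicing (voiced vs. voiceless SG) predicted by the acoustic
    # correlates, including SoE measured over the following vowel (soe_v2).
    m_full <- glmer(voicing ~ soe_v2 + dur + f0 + hnr + h1h2 +
                      (1 | speaker) + (1 | word),
                    data = burmese, family = binomial)

    # Reduced model: identical, but with the SoE term removed.
    m_reduced <- update(m_full, . ~ . - soe_v2)

    # Comparing the two nested fits: a significant result indicates that the
    # model including SoE is the more informative of the two.
    anova(m_reduced, m_full)

Here anova() applied to two nested glmer() fits performs a likelihood-ratio (chi-squared) test, which is the ANOVA comparison referred to in the note.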

References

Baayen, R. Harald, Douglas J. Davidson & Douglas M. Bates. 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59(4). 390–412. https://doi.org/10.1016/j.jml.2007.12.005.

Bang, Hye-Young, Morgan Sonderegger, Yoonjung Kang, Meghan Clayards & Tae-Jin Yoon. 2018. The emergence, progress, and impact of sound change in progress in Seoul Korean: Implications for mechanisms of tonogenesis. Journal of Phonetics 66. 120–144. https://doi.org/10.1016/j.wocn.2017.09.005.

Bell, Elise, Diana B. Archangeli, Skye J. Anderson, Michael Hammond, Peredur Webb-Davies & Heddwen Brooks. 2021. Northern Welsh. Journal of the International Phonetic Association 53(2). 1–24. https://doi.org/10.1017/S0025100321000165.

Berkson, Kelly Harper. 2019. Acoustic correlates of breathy sonorants in Marathi. Journal of Phonetics 73. 70–90. https://doi.org/10.1016/j.wocn.2018.12.006.

Bhaskararao, Peri & Peter Ladefoged. 1991. Two types of voiceless nasals. Journal of the International Phonetic Association 21(2). 80–88. https://doi.org/10.1017/S0025100300004424.

Blankenship, Barbara, Peter Ladefoged, Peri Bhaskararao & Nichumeno Chase. 1993. Phonetic structures of Khonoma Angami. Linguistics of the Tibeto-Burman Area 16(2). 69–88.

Blevins, Juliette. 2018. Evolutionary phonology and the life cycle of voiceless sonorants. In Typological hierarchies in synchrony and diachrony, 29–60. Amsterdam: John Benjamins. https://doi.org/10.1075/tsl.121.01ble.

Boersma, Paul & David Weenink. 2015. Praat: Doing phonetics by computer [Computer program].

Bombien, Lasse. 2006. Voicing alternations in Icelandic sonorants: A photoglottographic and acoustic analysis. Arbeitsberichte des Instituts für Phonetik und digitale Sprachverarbeitung der Universität Kiel (AIPUK) 37. 63–82.

Bongiovanni, Silvina. 2021. Acoustic investigation of anticipatory vowel nasalization in a Caribbean and a non-Caribbean dialect of Spanish. Linguistics Vanguard 7(1). 20200008. https://doi.org/10.1515/lingvan-2020-0008.

Bradley, David. 1996. Burmese as a lingua franca. In Stephen A. Wurm, Peter Mühlhäusler & Darrell T. Tryon (eds.), Trends in linguistics. Documentation [TiLDOC], 13, 745–748. Berlin: De Gruyter Mouton.

Chirkova, Katia, Patricia Basset & Angélique Amelot. 2019. Voiceless nasal sounds in three Tibeto-Burman languages. Journal of the International Phonetic Association 49(1). 1–32. https://doi.org/10.1017/S0025100317000615.

Chirkova, Katia, Tanja Kocjančič Antolík & Angélique Amelot. 2021. Baima. Journal of the International Phonetic Association 49(1). 1–30. https://doi.org/10.1017/S0025100321000219.

Dantsuji, Masatake. 1984. A study on voiceless nasals in Burmese. Studia Phonologica 18. 1–14.

Denes, Peter. 1955. Effect of duration on the perception of voicing. The Journal of the Acoustical Society of America 27(4). 761–764. https://doi.org/10.1121/1.1908020.

DiCanio, Christian T. 2010. Itunyoso Trique. Journal of the International Phonetic Association 40(2). 227–238. https://doi.org/10.1017/S0025100310000034.

DiCanio, Christian T., Caicai Zhang, Douglas H. Whalen & Rey Castillo García. 2020. Phonetic structure in Yoloxóchitl Mixtec consonants. Journal of the International Phonetic Association 50(3). 333–365. https://doi.org/10.1017/S0025100318000294.

Dinnsen, Daniel A. & Jan Charles-Luce. 1984. Phonological neutralization, phonetic implementation and individual differences. Journal of Phonetics 12(1). 49–60. https://doi.org/10.1016/s0095-4470(19)30850-2.

van Dommelen, Wim A. 1998. Production and perception of preaspiration in Norwegian. In Proceedings of FONETIK 98, 20–23.

Dutta, Indranil. 2007. Four-way stop contrasts in Hindi: An acoustic study of voicing, fundamental frequency and spectral tilt. Urbana, IL: University of Illinois at Urbana-Champaign dissertation.

Garellek, Marc. 2012. The timing and sequencing of coarticulated non-modal phonation in English and White Hmong. Journal of Phonetics 40(1). 152–161. https://doi.org/10.1016/j.wocn.2011.10.003.

Garellek, Marc. 2019. The phonetics of voice. In The Routledge handbook of phonetics, 75–106. London & New York: Routledge. https://doi.org/10.4324/9780429056253-5.

Garellek, Marc. 2020. Acoustic discriminability of the complex phonation system in !Xóõ. Phonetica 77(2). 131–160. https://doi.org/10.1159/000494301.

Garellek, Marc. 2022. Theoretical achievements of phonetics in the 21st century: Phonetics of voice quality. Journal of Phonetics 94. 1–22. https://doi.org/10.1016/j.wocn.2022.101155.

Garellek, Marc, Amanda Ritchart & Jianjing Kuang. 2016. Breathy voice during nasality: A cross-linguistic study. Journal of Phonetics 59. 110–121. https://doi.org/10.1016/j.wocn.2016.09.001.

Gordeeva, Olga & James Scobbie. 2010. Preaspiration as a correlate of word-final voice in Scottish English fricatives. In Turbulent sounds: An interdisciplinary guide, 167–207. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110226584.167.

Hansson, Gunnar Ólafur. 2001. Remains of a submerged continent: Preaspiration in the languages of Northwest Europe. In Historical linguistics 1999, 157–174. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/cilt.215.12han.

Helgason, Pétur. 2002. Preaspiration in the Nordic languages: Synchronic and diachronic aspects. Stockholm: Stockholm University, Department of Linguistics dissertation.

Jessen, Michael & Magnús Pétursson. 1998. Voiceless nasal phonemes in Icelandic. Journal of the International Phonetic Association 28(1–2). 43–53. https://doi.org/10.1017/s002510030000623x.

Kawahara, Hideki, Ikuyo Masuda-Katsuse & Alain de Cheveigné. 1999. Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Communication 27(3–4). 187–207. https://doi.org/10.1016/S0167-6393(98)00085-5.

Kingston, John. 2011. Tonogenesis. In Marc van Oostendorp, Colin J. Ewen, Elizabeth Hume & Keren Rice (eds.), The Blackwell companion to phonology, 1–30. Oxford: John Wiley & Sons. https://doi.org/10.1002/9781444335262.wbctp0097.

Kingston, John, Randy L. Diehl, Cecilia J. Kirk & Wendy A. Castleman. 2008. On the internal perceptual structure of distinctive features: The [voice] contrast. Journal of Phonetics 36(1). 28–54. https://doi.org/10.1016/j.wocn.2007.02.001.

Ladd, D. Robert & Stephan Schmid. 2018. Obstruent voicing effects on F0, but without voicing: Phonetic correlates of Swiss German lenis, fortis, and aspirated stops. Journal of Phonetics 71. 229–248. https://doi.org/10.1016/j.wocn.2018.09.003.

Ladefoged, Peter. 1971. Preliminaries to linguistic phonetics. Chicago: University of Chicago Press.

Ladefoged, Peter & Ian Maddieson. 1998. The sounds of the world's languages. Language 74(2). 374–376. https://doi.org/10.1353/lan.1998.0168.

Lisker, Leigh. 1972. Stop duration and voicing in English. In Albert Valdman (ed.), Papers in linguistics and phonetics to the memory of Pierre Delattre, vol. 54, 339–343. The Hague: Mouton. https://doi.org/10.1515/9783110803877-028.

Lynch, John. 1978. A grammar of Lenakel. Canberra: The Australian National University.

Maddieson, Ian. 1984. The effects on F0 of a voicing distinction in sonorants and their implications for a theory of tonogenesis. Journal of Phonetics 12(1). 9–15. https://doi.org/10.1016/S0095-4470(19)30845-9.

Maddieson, Ian. 1984. Patterns of sounds. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511753459.

Maddieson, Ian & Victoria B. Anderson. 1994. Phonetic structures of Iaai. UCLA Working Papers in Phonetics 87. 163–182.

Misnadin, Misnadin, James P. Kirby & Bert Remijsen. 2015. Temporal and spectral properties of Madurese stops. In Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS).

Mittal, Vinay Kumar, B. Yegnanarayana & Peri Bhaskararao. 2014. Study of the effects of vocal tract constriction on glottal vibration. The Journal of the Acoustical Society of America 136(4). 1932–1941. https://doi.org/10.1121/1.4894789.

Moran, Steven & Daniel McCloy. 2011. PHOIBLE Online.

Peña, Jailyn M. 2022. Stød timing and domain in Danish. Languages 7(1). 50. https://doi.org/10.3390/languages7010050.

Port, Robert, Fares Mitleb & Michael O'Dell. 1981. Neutralization of obstruent voicing in German is incomplete. The Journal of the Acoustical Society of America 70(S1). S13. https://doi.org/10.1121/1.2018716.

Schertz, Jessamyn & Emily J. Clare. 2020. Phonetic cue weighting in perception and production. WIREs Cognitive Science 11(2). 1–24. https://doi.org/10.1002/wcs.1521.

Shue, Yen-Liang. 2010. The voice source in speech production: Data, analysis and models. Los Angeles: University of California, Los Angeles dissertation.

Stevens, Mary. 2011. Consonant length in Italian: Gemination, degemination and preaspiration. In Scott M. Alvord (ed.), Selected proceedings of the 5th conference on laboratory approaches to Romance phonology, 21–32.

Stevens, Mary & John Hajek. 2004. How pervasive is preaspiration? Investigating sonorant devoicing in Sienese Italian. In Proceedings of the 10th Australian International Conference on Speech Science & Technology, 334–339. Canberra: S.

Stevens, Mary & Ulrich Reubold. 2014. Pre-aspiration, quantity, and sound change. Laboratory Phonology 5(4). https://doi.org/10.1515/lp-2014-0015.

Venables, William N. & Brian D. Ripley. 2002. Random and mixed effects. In Modern applied statistics with S, 271–300. New York: Springer. https://doi.org/10.1007/978-0-387-21706-2_10.

Watkins, Justin. 2000. Notes on creaky and killed tone in Burmese. SOAS Working Papers in Linguistics and Phonetics 10. 139–149.

Watkins, Justin W. 2001. Burmese. Journal of the International Phonetic Association 31(2). 291–295. https://doi.org/10.1017/S0025100301002122.

Received: 2022-08-03
Accepted: 2023-08-22
Published Online: 2023-09-06
Published in Print: 2023-12-15

© 2023 Walter de Gruyter GmbH, Berlin/Boston
