Elsevier

Speech Communication

Volume 132, September 2021, Pages 10-20

Read speech voice quality and disfluency in individuals with recent suicidal ideation or suicide attempt

https://doi.org/10.1016/j.specom.2021.05.004

Highlights

  • An investigation of suicidal ideation and suicide attempt using speech-based parameters.

  • Reveals statistically significant differences between healthy controls and inpatients exhibiting suicidal behavior with regard to voice quality and speech disfluency attributes.

  • Demonstrates that voice quality and disfluency information can be applied as a compact feature set for machine learning techniques, which can classify suicidal behavior with relatively high accuracy (i.e., up to 80% classification accuracy).

Abstract

Individuals who have incurred trauma due to a suicide attempt often acquire residual health complications, such as cognitive, mood, and speech-language disorders. Due to limited access to suicidal speech audio corpora, behavioral differences in patients with a history of suicidal ideation and/or behavior have not been thoroughly examined using subjective voice quality and manual disfluency measures. In this study, we examine the Butler-Brown Read Speech (BBRS) database, which includes 20 healthy controls with no history of suicidal ideation or behavior (HC group) and 226 psychiatric inpatients with recent suicidal ideation (SI group) or a recent suicide attempt (SA group). During read-aloud sentence tasks, the SI and SA groups reveal poorer average subjective voice quality composite ratings when compared with individuals in the HC group. In particular, the SI and SA groups exhibit average ‘grade’ and ‘roughness’ voice quality scores four to six times higher than those of the HC group. We demonstrate that manually annotated voice quality measures, converted into a low-dimensional feature vector, help to identify individuals with recent suicidal ideation and behavior from a healthy population, generating an automatic classification accuracy of up to 73%. Furthermore, our novel investigation of manual speech disfluencies (e.g., manually detected hesitations, word/phrase repeats, malapropisms, speech errors, non-self-correction) shows that inpatients in the SI and SA groups produce on average approximately twice as many hesitations and four times as many speech errors when compared with individuals in the HC group. We demonstrate automatic classification of inpatients with a suicide history from individuals with no suicide history with up to 80% accuracy using manually annotated speech disfluency features. Knowledge regarding voice quality and speech disfluency behaviors in individuals with a suicide history presented herein will lead to a better understanding of this complex phenomenon and thus contribute to the future development of new automatic speech-based suicide-risk identification systems.

Introduction

The majority of people who make suicide attempts do not die by suicide; in 2017 there were approximately 47,000 suicide deaths and an estimated 1,400,000 suicide attempts in the United States of America (CDC, 2017). Due to the potentially harmful nature of suicide methods, according to Costache et al. (2004), Wazeer et al. (2015), and Zabel et al. (2005), survivors of attempted suicide often exhibit a debilitating range of irreversible health concerns, such as neuropsychological, neuropsychiatric, physiological, cognitive, and speech-language disorders. Brodnitz et al. (1971) was one of the first studies to report psychological problems found in patients with voice disorders. Their study of over 2000 patients, involving all forms of voice disorders, found that 80% of all voice disorder cases were attributed to vocal abuse and/or psychogenic factors (e.g., anxiety, depression). Further, Marmor et al. (2016) noted that depressive symptoms in patients were accompanied by a nearly two-fold increase in reported voice problems in the past year when compared to a healthy population.

Mood disturbances directly impact the speech system, triggering quantifiable divergences from normal healthy speech physiological mechanisms (e.g., respiratory, muscle tension, motor coordination). For individuals with clinical depression and/or those exhibiting suicidal behavior, recorded changes in their voice characteristics are frequently attributed to psychogenic emotional and stress symptoms (Cummins et al., 2015). Examples of psychogenic symptoms include psychomotor retardation and agitation. Disturbances caused by psychomotor retardation include poorer cognitive processing and muscular incoordination, which adversely impact gross/fine motor movement and speech production (Flint et al., 1993; Hoffman et al., 1985; Silverman et al., 1992). Moreover, psychomotor agitation results in abnormally accelerated motor activity and excessive gross/fine motor movements (Day, 1999). In investigations by France et al. (2000), Ellgring and Scherer (1996), and Yingthawornsuk et al. (2006), careful evaluation of abnormal acoustic vocal manifestations has helped to motivate new ways to automatically identify mood disorders in patients.

Speech-based studies of suicide, such as Cummins et al. (2015), Ozdas et al. (2004), Scherer et al. (2013), and Yingthawornsuk et al. (2006), have indicated that patients with a history of suicidal ideation exhibit lower acoustic energy, unusual glottal control, and breathy voice quality when compared with healthy populations. However, in these studies, only a relatively small number of clinically validated patients with suicidal ideation and/or attempts were analyzed (i.e., fewer than two dozen per study). Furthermore, in many of these studies, patients’ voice quality attributes were derived from spectral acoustic features rather than grounded in a standard set of clinically descriptive pathological qualities.

Scherer et al. (2013) examined ‘breathiness’ and ‘tenseness’ in a narrow demographic of adolescents with and without suicidal ideation and/or behavior. Their study found that the speech of adolescents with suicidal ideation and/or behavior exhibited significantly more breathy qualities than that of adolescents without suicidal behavior, based on peak slope and normalized amplitude quotient acoustic feature values. However, Scherer et al. (2013) did not explore other common pathological voice quality attributes (e.g., hoarseness, roughness, instability), nor did they verify that these particular acoustic features captured only ‘breathiness’ quality information (i.e., the features could also be capturing information from other voice quality attributes). Studies by Brodnitz et al. (1971), Marmor et al. (2016), and Scherer et al. (2013) hint that abnormal voice quality is associated with suicidal behavior. However, automatic speech-based studies have yet to investigate the several distinct pathological voice qualities associated with suicide in a more sizable suicidal dataset.

Studies by Esposito et al. (2016), Oxman et al. (1988), Rosenberg et al. (1991), and Rubino et al. (2011) have shown that clinical depression can be identified through patients’ spontaneous speech disfluencies. Further, Stasak et al. (2019) found that when compared with healthy controls, patients with clinical depression exhibited significantly greater numbers of speech disfluencies during specific emotionally charged read aloud sentence tasks. For patients with suicidal behavior, it is anticipated that during simple read sentence tasks this population will show an increase in speech disfluencies due to associated depression and cognitive dysfunction (Levens and Gotlib, 2015; Marzuk et al., 2005; Mitterschiffthaler et al., 2008; Roy-Byrne et al., 1986; Rubino et al., 2011; Weingartner et al., 1981).

The research presented herein is one of the largest-scale studies to date of inpatients with suicidal ideation and behavior; it investigates voice quality and speech disfluency behaviors in such samples using text-dependent read-aloud elicitation with a range of mood content. As an elicitation protocol, read speech has many advantages over spontaneous speech because it: (1) constrains the phonetic variability; (2) controls the syntactic order of affective word content; (3) isolates a patient's cognitive-processing demands; (4) offers objective clinical repeatability; and (5) reduces potential patient-observer bias caused by interviewer-adaptation, which influences the speaking style of a participant (Bouhuys and Van Den Hoofdakker, 1991).

In this study, we investigate the subjective GRBASI voice pathology quality attributes (i.e., ‘grade’, ‘roughness’, ‘breathiness’, ‘asthenia’, ‘strain’, ‘instability’) to help establish which of these are most associated with the speech of psychiatric inpatients hospitalized for suicidal ideation or suicide attempt. In addition, the speech of healthy controls is compared with that of the psychiatric inpatients. We hypothesize that voice quality is an important indicator for individuals who are at higher risk for suicide, regardless of whether they exhibit depression. Based on the previous literature (Costache et al., 2004; Wazeer et al., 2015; Zabel et al., 2005), we hypothesize that inpatients with a history of recent suicide attempts will exhibit abnormal voice qualities along with language processing and production difficulties. It is theorized that descriptive voice quality and speech disfluency measures can be applied as discriminative low-dimensional features to help automatically distinguish individuals with no suicide history from psychiatric inpatients with a recent history of suicidal thoughts or behaviors.
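
To make the idea of a low-dimensional voice quality feature concrete, the six GRBASI attributes can be held in a small record and flattened into a six-element vector. The following is a minimal sketch, not the authors' implementation: it assumes each attribute is scored on the four-point GRBASI scale (0-normal, 1-mild, 2-moderate, 3-severe), and the names `GrbasiRating`, `to_vector`, and `mean_vector` are hypothetical, as are the example scores.

```python
from dataclasses import dataclass, astuple

# Assumption: four-point scale, 0-normal, 1-mild, 2-moderate, 3-severe.
GRBASI_MAX = 3


@dataclass(frozen=True)
class GrbasiRating:
    """Hypothetical record of one listener's GRBASI scores for one recording."""
    grade: int
    roughness: int
    breathiness: int
    asthenia: int
    strain: int
    instability: int

    def __post_init__(self):
        # Reject scores that fall outside the four-point scale.
        for name, value in self.__dict__.items():
            if not isinstance(value, int) or not 0 <= value <= GRBASI_MAX:
                raise ValueError(
                    f"{name} must be an integer in 0..{GRBASI_MAX}, got {value!r}"
                )

    def to_vector(self):
        """Return the six scores as a list, in G-R-B-A-S-I order."""
        return list(astuple(self))


def mean_vector(ratings):
    """Average several ratings attribute by attribute (e.g., across sentences)."""
    if not ratings:
        raise ValueError("need at least one rating")
    vectors = [r.to_vector() for r in ratings]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Illustrative scores only; these are not values from the BBRS database.
example = [GrbasiRating(2, 1, 0, 0, 1, 0), GrbasiRating(1, 1, 0, 0, 0, 0)]
features = mean_vector(example)
```

A vector like `features` could then be passed to any standard classifier; which classifier and which averaging scheme the study used is not specified in this section, so both are left open here.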

Section snippets

Database

The Butler-Brown Read Speech (BBRS) database is a privately collected speech corpus consisting of recordings of participants reading a set of sentences into a microphone. All participants were recorded at a psychiatric hospital in the northeastern United States of America. The BBRS database was developed to investigate verbal behaviors of inpatients hospitalized for recent suicidal ideation (SI group) or suicide attempts (SA group), along with a group of healthy controls recruited from the

Voice quality assessment measures

Our subjective voice quality scale was based on the GRBASI voice quality evaluation (Yamauchi et al., 2010). The GRBASI perceptual evaluation scale is one of the most common assessments for pathological voice quality. The GRBASI is based on a four-point scale (i.e., 0-normal, 1-mild, 2-moderate, 3-severe), and requires a human listener to subjectively score six voice quality attributes: ‘grade’ (hoarseness), ‘roughness’ (vibration irregularity), ‘breathiness’ (air escaping), ‘asthenia’

Voice quality

An individual voice quality attribute analysis per group is shown below in Table 3. While some of the automatic speech-based literature (Cummins et al., 2015; Scherer et al., 2013) has reported increased ‘breathiness’ in individuals exhibiting depression and/or suicidal behavior, our manual evaluation of the BBRS database found that inpatients with suicide attempt history had considerably higher ‘grade’ and ‘roughness’ attributes. For instance, both the SI and SA inpatient groups

Conclusion

Our study examined voice quality and speech disfluency behaviors found in psychiatric inpatients with a suicide history and healthy controls with no suicide history in the BBRS database. When compared with healthy controls, an analysis of voice qualities in inpatients exhibiting suicidal ideation or behavior yielded valuable new insights, revealing that they have increased ‘grade’ and ‘roughness’ voice qualities (i.e., not only ‘breathiness’ as indicated in previous literature). Further, we

Declaration of Competing Interest

None.

Acknowledgements

The authors would like to thank Butler Hospital and the Warren Alpert Medical School of Brown University for providing the BBRS dataset. This research was partly made possible by funding from the National Institutes of Health (Grants: R01MH108610; R01MH095786; R01MH097741). We also thank Lifeline Harbour to Hawkesbury in Sydney, Australia for their safety-wellness ‘Accidental Counselor’ training certification. Additionally, we thank Nancy Briggs with the UNSW Mark Wainwright Analytic Centre.

References (52)

  • B. Stasak et al. Automatic depression classification based on affective read sentences: opportunities for text-dependent analysis. Speech Comm. (2019)
  • A.T. Beck et al. Beck Depression Inventory-II (1996)
  • D.W. Brook et al. Drug use and the risk of major depressive disorder, alcohol dependence, and substance use disorders. Arch. Gen. Psychiatry (2002)
  • Centers for Disease Control (CDC), 2017. Centers for disease control and prevention data & statistics fatal injury...
  • C. Cortes et al. Support-vector networks. Mach. Learn. (1995)
  • S. Couch et al. Vocal effectiveness of speech-language pathology students: before and after voice use during service delivery. S. Afr. J. Comm. Disord. (2015)
  • L. de Araújo Pernambuco et al. Prevalence of voice disorders in the elderly: a systematic review of population-based studies. Eur. Arch. Oto-Rhino-Laryngol. (2014)
  • P.H. Dejonckere et al. Differential perceptual evaluation of pathological voice quality: reliability and correlations with acoustic measurements. Rev. Laryngol. Otol. Rhinol. (Bord.) (1996)
  • H. Ellgring et al. Vocal indicators of mood change in depression. J. Nonverbal. Behav. (1996)
  • A. Esposito et al. On the significance of speech pauses in depressive disorders: results on read and spontaneous narratives (2016)
  • D. Fay et al. Malapropisms and the structure of the mental lexicon. Linguist. Inq. (1977)
  • P.A. Frewen et al. Visual-verbal self/other-referential processing task: direct vs. indirect assessment, valence, and experimental correlates. Pers. Individ. Dif. (2011)
  • N.M. Henriksson et al. Am. J. Psychiatry (1993)
  • G.M.A. Hoffman et al. Speech pause time as a method for the evaluation of psychomotor retardation in depressive illness. British J. Psych. (1985)
  • M.-R. Islam et al. Detecting depression using k-nearest neighbors (KNN) classification technique
  • A. Kataria et al. A review of data classification using k-nearest neighbor algorithm. Intern. J. Emerg. Technol. Adv. Eng. (2013)

Cited by (10)

    • Linguistic features of suicidal thoughts and behaviors: A systematic review

      2022, Clinical Psychology Review
      Citation Excerpt :

      In the special case of developing ML algorithms to distinguish posts with non-suicidal depressive or anxiety-related content from posts with STB content, a more contextual approach might be required considering the severity of STBs to prevent classifier failure (Aladağ et al., 2018). Besides social media data, health records including therapy notes (Bantilan et al., 2020; Cohen et al., 2020; Fernandes et al., 2018; Levis et al., 2021; Xu et al., 2021), text message data (Cook et al., 2016; Glenn et al., 2020; Nobles et al., 2018), suicide notes Pestian et al. (2008), and voice samples (Belouali et al., 2021; Bryan et al., 2018; Cohn et al., 2009; Figueroa Saavedra et al., 2020; France et al., 2000; Gideon et al., 2019; Hashim et al., 2012, 2017; Keskinpala et al., 2007; Landau et al., 2007; Ozdas et al., 2001; Ozdas et al., 2004; Pestian et al., 2016; Pestian et al., 2017; Scherer et al., 2013; Stasak et al., 2021; Yingthawornsuk et al., 2006, 2007) have proven useful in predicting STBs. With advances in modern (mobile) technology, ecological momentary assessments can be exploited to collect real-time, real-world data on individuals at risk, which is useful considering the fluctuating nature of suicidal thoughts (Kleiman et al., 2019).

    • Investigating Generalizability of Speech-based Suicidal Ideation Detection Using Mobile Phones

      2024, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies