Collective spectral pattern complexity analysis of voicing in normal males and larynx cancer patients following radiotherapy
Introduction
United Kingdom cancer statistics for 2001 show that the larynx is the site for nearly a third of all 7820 new head and neck cancers and that well over 4 times as many men as women suffered from the disease [1]. It is therefore as prevalent as cervical cancer in women, though it attracts far less public attention. The 5-year survival of larynx cancer patients following treatment is good, at approximately two-thirds. Hence, subsequent quality of life, particularly swallowing and voice preservation, is important for a large number of individuals seeking to resume normal life. Radiotherapy arguably has fewer side effects than surgery, which is self-evidently more invasive. Whereas surgery involves the excision or ablation of tissues harboring diseased cells, radiotherapy delivers a tumoricidal dose of ionizing radiation using penetrating X-rays, delivered in a manner that leaves healthy tissues intact and able to recover. Writing in the New England Journal of Medicine in 2003, Forastiere et al. noted that by using chemotherapy and radiotherapy appropriately ‘in most patients with laryngeal cancer, the disease can be managed without a primary surgical approach’ [2]. This reflects a shift in clinical practice from surgery to radiotherapy that began in the 1990s, with larynx preservation a key factor.
Larynx cancer irradiation may leave the targeted tissues intact but it does perturb vocal fold functionality for months after treatment [3], which in turn directly influences voice quality [4], [5], [6], [7]. An essential part of verbal communication is vowel generation, which is underpinned by fold vibration. The ability to produce vowel sounds and voicing in general is traditionally informed by professional opinion, specifically by the speech and language therapist (SALT). However, SALT terminology and the proliferation of assessment parameters have been a cause for concern in head and neck radiotherapy [8]. With so many features selectively assessed, it has not been possible to define either absolutely or concisely what constitutes a ‘normal’ voice. Similarly, it has not been possible to explain how cancer patients subjected to intense vocal fold irradiation during radiotherapy can recover voicing to a level that could be considered ‘normal’. Hence, tracking recovery is problematic and is made worse by inter-observer errors. Given the nature of human perception of acoustic phenomena, it is not surprising that SALT references to the importance of spectral features usually found in signal processing are commonplace [9]. Variability in the frequency and in the amplitude of pressure and vocal fold contact is labelled ‘jitter’ and ‘shimmer’, respectively. Similarly, ‘breathiness’ and ‘whisper’, arising from arguably undesirable air flows during phonation, are factors that would be recognized by physical scientists as reducing the signal-to-noise ratio in voicing. The spectral envelope [9] is used to understand the decay of harmonics in patients. This solid scientific basis, supported by the availability of physical measurements, offers a route to rationalizing the objective description of normal and aberrant voicing.
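As an illustration of how ‘jitter’ and ‘shimmer’ reduce to simple physical measurements, the following minimal Python sketch computes one common form of each: the mean absolute difference between consecutive cycles, relative to the mean. The function names and the cycle-by-cycle values are hypothetical and are not taken from the study; other definitions of both quantities are in use.

```python
import numpy as np

def local_jitter(periods):
    """Mean absolute difference between consecutive cycle periods,
    divided by the mean period (one common definition; an assumption here)."""
    periods = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive cycle peak amplitudes,
    divided by the mean amplitude (one common definition; an assumption here)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

# Hypothetical cycle-by-cycle measurements, for illustration only
periods_ms = [8.0, 8.1, 7.9, 8.0, 8.2]      # duration of each vibratory cycle
amplitudes = [1.00, 0.98, 1.02, 1.00, 0.97]  # peak amplitude of each cycle

jitter = local_jitter(periods_ms)
shimmer = local_shimmer(amplitudes)
print(f"jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```

A perfectly regular voice would give zero for both measures; larger values indicate greater cycle-to-cycle variability.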
The electro-glottogram (EGG) is one instrument available to the SALT. An example is shown in Fig. 1(a). It measures the impedance changes across the larynx during voicing, producing a time series with a characteristic waveform that reflects the vibratory action of the vocal folds, as seen in Fig. 1(b) [10]. However, the important and variable detail present in an otherwise elegantly simple EGG waveform is hard to interpret, even for the most experienced of SALTs [7], [11], [12]. The detail in the corresponding EGG spectral pattern is also difficult to interpret, even though the frequency domain has instantly recognizable, gross features in the form of the fundamental and trailing harmonic peaks (Fig. 1(c)). In 2004 Moore et al. suggested the first machine-computed, single-parameter reference standards for normal voicing based on EGG measurements and compared these to a small sample of patients known to have significantly perturbed voices [13]. The key steps were to apply complexity analysis in the spectral domain, rather than the time domain, and to avoid the selection of specific peaks or features by analysing the spectral pattern collectively.
This paper shows that the EGG can be deployed to differentiate the healthy normal male population, quantify pathological voicing in pre-treatment male larynx cancer patients and track the pattern of patient recovery following radiotherapy. The approach reported is based on the regularity statistic ‘approximate entropy’ (ApEn) [14] applied across a suitably normalised frequency spectrum for vowel phonation, which has been decoupled from the effects of drift in phonation fundamental frequency (Fig. 1(c)).
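For readers unfamiliar with ApEn, the following minimal Python sketch implements the standard approximate entropy statistic for a generic one-dimensional sequence. The embedding dimension m = 2 and the tolerance of 0.2 standard deviations are conventional choices assumed here; this snippet does not state the values used in the study, and the spectral normalisation step is not reproduced. In the approach reported, the input sequence would be the normalised frequency spectrum rather than the synthetic series used below.

```python
import numpy as np

def approximate_entropy(series, m=2, r_factor=0.2):
    """Approximate entropy of a 1-D sequence.

    m and r_factor are conventional defaults (assumptions), not values
    confirmed by the source text.
    """
    u = np.asarray(series, dtype=float)
    n = u.size
    r = r_factor * np.std(u)  # tolerance scaled to the spread of the data

    def phi(dim):
        count = n - dim + 1
        # All overlapping templates of length `dim`
        templates = np.array([u[i:i + dim] for i in range(count)])
        # Chebyshev (maximum coordinate) distance between every template pair
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Fraction of templates within tolerance of each template (self-match included)
        c = np.sum(dist <= r, axis=1) / count
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

# Synthetic sequences for illustration only
rng = np.random.default_rng(0)
regular = np.sin(2 * np.pi * np.arange(300) / 25)  # highly regular pattern
noisy = rng.standard_normal(300)                   # irregular pattern

apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
print(f"regular: {apen_regular:.3f}, noisy: {apen_noisy:.3f}")
```

Lower values indicate a more regular, predictable pattern; higher values indicate greater irregularity.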
Section snippets
Theory
Vowel phonation is predominantly driven by vocal fold vibration, which is accompanied by impedance variations across the thyroid area. Therefore, changes to fold vibration caused by physical damage arising from malignant disease and associated therapeutic irradiation are potentially reflected in changes to the impedance signature. Trans-larynx impedance changes can be detected during phonation via the EGG [10], which correlates well with actual vocal fold vibration and, when expertly measured,
Methodology
Eighty-nine male volunteers provided the reference standard for this study. Each subject was connected to a Laryngograph machine and asked to phonate the sustained vowel /i/ for up to 4 seconds. The Laryngograph outputs EGG impedance waveforms, which were digitised at a sampling rate of 20 kHz to produce 16-bit floating point values. The resultant data files, excluding four compromised files, were subjected to ApEn complexity analysis using software written in IDL from Research Systems
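The step from a digitised waveform to a spectrum referenced to the fundamental can be sketched as follows. This is a minimal Python illustration, not the IDL software used in the study: the waveform is a synthetic stand-in, the 125 Hz fundamental and the 50 to 400 Hz search band are hypothetical, and dividing the frequency axis by the estimated fundamental is only one assumed way of decoupling the spectrum from the phonation frequency. Only the 20 kHz sampling rate and the 4 second duration come from the text.

```python
import numpy as np

FS = 20_000      # sampling rate in Hz, as stated in the text
DURATION = 4.0   # seconds, the upper limit stated in the text
F0 = 125.0       # hypothetical fundamental frequency in Hz

t = np.arange(int(FS * DURATION)) / FS
# Hypothetical stand-in for an EGG waveform: fundamental plus decaying harmonics
signal = sum(np.sin(2 * np.pi * k * F0 * t) / k**2 for k in range(1, 6))

# Windowed power spectrum of the whole recording
window = np.hanning(signal.size)
spectrum = np.abs(np.fft.rfft(signal * window)) ** 2
freqs = np.fft.rfftfreq(signal.size, d=1 / FS)

# Estimate the fundamental as the strongest peak in an assumed search band
band = (freqs >= 50) & (freqs <= 400)
f0_est = freqs[band][np.argmax(spectrum[band])]

# Express frequency in multiples of the fundamental and scale power to a peak of 1
harmonic_axis = freqs / f0_est
psd_norm = spectrum / spectrum.max()

print(f"estimated fundamental: {f0_est:.2f} Hz")
```

On the rescaled axis the harmonic peaks fall near integer values whatever the speaker's pitch, which is the property a pitch-independent comparison between subjects would need.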
Results
Fig. 3 shows the ApEn complexity distribution for the healthy male normals reported by Moore et al. [13]. The bimodal nature of these data was tested by Gaussian mixture model fitting using maximum likelihood [18]. They concluded (p < 0.001) that two normal groups G1 and G2 existed, characterised by complexity values 0.340 (±0.035) and 0.183 (±0.057) with relative weights 62 and 38%, respectively. Members of G1 exhibited strong FHN–fPSD features whilst those in G2 were weak, especially in the
Discussion
Of the 48 cancer cases considered, ApEn analysis indicated that two-thirds would develop improved vocal fold functionality 1 year after radiotherapy and could be objectively classified as normal (i.e. well within the G1 and G2 reference standards). Only one-quarter of cases would be below normal voicing bounds and distinctly pathological.
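The G1 and G2 reference standards can be written as a two-component Gaussian mixture using the values reported in the Results. The minimal Python sketch below evaluates the mixture density and the probability that a given ApEn value belongs to G1. Treating the ± figures as standard deviations is an assumption, the function names are hypothetical, and the criterion the authors used for ‘well within’ the reference standards is not given in this snippet, so none is implemented.

```python
import math

# Means, spreads and weights as reported in the Results;
# reading the ± values as standard deviations is an assumption.
G1 = {"mean": 0.340, "sd": 0.035, "weight": 0.62}
G2 = {"mean": 0.183, "sd": 0.057, "weight": 0.38}

def gaussian_pdf(x, mean, sd):
    """Probability density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def mixture_pdf(x):
    """Density of the two-group reference distribution at ApEn value x."""
    return sum(g["weight"] * gaussian_pdf(x, g["mean"], g["sd"]) for g in (G1, G2))

def posterior_g1(x):
    """Probability that ApEn value x belongs to G1 rather than G2."""
    return G1["weight"] * gaussian_pdf(x, G1["mean"], G1["sd"]) / mixture_pdf(x)

for apen in (0.183, 0.260, 0.340):
    print(f"ApEn = {apen:.3f}: P(G1) = {posterior_g1(apen):.3f}")
```

Values near 0.340 are assigned almost entirely to G1 and values near 0.183 almost entirely to G2, with intermediate values shared between the two groups.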
Fig. 4 demonstrates that patients assigned a less aberrant pre-treatment category by SALT perceptual analysis have improved ApEn post-treatment. This takes the
Conclusion
ApEn complexity analysis of the collective spectral pattern derived from trans-larynx impedance time series has allowed the recovery pattern of vocal fold functionality and voicing in male radiotherapy cancer cases to be examined. Using a single objective parameter to quantify the collective spectral pattern of vowel phonation, the majority of radiotherapy patients are seen to recover to levels of normal ApEn seen in the general, healthy male population. Many patients recover to at least the
References (19)
- Electrolaryngographic assessment of vocal fold function. J. Phon. (1986)
- Cancer-Stats Incidence, CRUK, March...
- Concurrent chemotherapy and radiotherapy for organ preservation in advanced laryngeal cancer. N. Engl. J. Med. (2003)
- et al. Factors associated with recurrence & voice quality following radiation therapy for T1 & T2 glottic carcinomas. Laryngoscope (1994)
- et al. The effect of head & neck radiation therapy on voice quality. Laryngoscope (1992)
- et al. Computerised quantification & 3D-visualisation of voice quality changes following radiotherapy for carcinoma of the larynx
- et al. Stage I (T1, M0, N0) squamous cell carcinoma of the laryngeal glottis: therapeutic results & voice preservation. Head Neck (1999)
- et al. The effect of radiotherapy on various acoustical, clinical and perceptual pitch measures
- et al. Voice measurement: is more better? Logoped. Phoniatr. Vocol. (1997)