Abstract
Obstructive sleep apnea (OSA) is a highly prevalent disease in which the upper airways collapse during sleep, leading to serious consequences. The gold standard of diagnosis, polysomnography (PSG), requires a full-night hospital stay with the patient connected to over 15 measurement channels that require physical contact with sensors. PSG is expensive and unsuited for community screening. Snoring is the earliest symptom of OSA, but its potential in OSA diagnosis is not yet fully recognized. In this paper, we propose a novel model for snore-related sounds (SRS) as the response of a mixed-phase system (total airways response, TAR) to a source excitation at the input. The TAR/source model is similar to the vocal tract/source model in speech synthesis, and is capable of capturing the acoustical changes brought about by the collapsing upper airways in OSA. We propose an algorithm based on higher-order spectra (HOS) to jointly estimate the source and the TAR, preserving the true phase characteristics of the latter. Working on a clinical database of signals, we show that the TAR is indeed a mixed-phase signal and that second-order statistics cannot fully characterize it. Night-time speech sounds can corrupt snore recordings and pose a challenge to snore-based OSA diagnosis. We show that the TAR can be used to detect speech segments embedded in snores, and to derive features for diagnosing OSA via non-contact, low-cost instrumentation that holds potential as a community screening device.
References
Abeyratne UR (1999) Blind reconstruction of non-minimum phase systems from 1-D oblique slices of the bispectrum. IEE Proc Vis Image Signal Process 146(5):253–26
Abeyratne UR, Patabandi CKK, Puvanendran K (2001) Pitch–jitter analysis of snoring signals in the diagnosis of obstructive sleep apnea. In: 23rd annual international conference of the IEEE Engineering in Medicine and Biology Society (IEEE EMBC2001), Istanbul, Turkey
Abeyratne UR, Wakwella AS, Hukins C (2005) Pitch jump probability measure for the analysis of snoring sound in apnea. Physiol Meas 26(5):779–798
Bassiri AG, Guilleminault C (2000) Clinical features and evaluations of obstructive sleep apnea. In: Kryger T, Roth WD (eds) The principles and practice of sleep medicine. W.B. Saunders Co., Philadelphia
Diagnostic classification steering committee (1990) International classification of sleep disorders: diagnostic and coding manual, American Sleep Disorder Association, Rochester, Minnesota
Fiz JA, Abad J, Jane R, Riera M, Mananas MA, Caminal P, Rodenstein D, Morera J (1996) Acoustic analysis of snoring in patients with simple snoring and obstructive sleep apnea. Eur Resp J 9(11):2365–2370
Flemons WW, Rowley JA, Anderson WM, McEvoy RD (2003) Home diagnosis of sleep apnea: a systematic review of literature. Chest 124(4):1543–1579
Hoffstein V (2000) Snoring. In: Kryger T, Roth WD (eds) The principles and practice of sleep medicine, 3rd edn. W.B. Saunders Co., Philadelphia
Issa FG, Morrison D, Hadjuk E, Iyer A, Feroah T, Remmers JE (1993) Digital monitoring of obstructive sleep-apnea using snoring sound and arterial oxygen-saturation. Sleep 16(8):S132–S132
Kim J, In K et al (2004) Prevalence of sleep disordered breathing in middle aged Korean men and women. Am J Respir Crit Care Med 170(10):1108–1113
Lim PVH, Curry AR (1999) A new method for evaluating and reporting the severity of snoring. J Laryngol Otol 113(4):336–340
Lee TH, Abeyratne UR (1999) Analysis of snoring sounds for the detection of obstructive sleep apnea. Med Biol Eng Comput 37(suppl 2):538–539
Lee TH, Abeyratne UR, Puvanendran K, Goh KL (2000) Formant-structure and phase-coupling analysis of human snoring sounds for detection of obstructive sleep apnea. In: Middleton J, Jones ML, Pande GN (eds) Computer methods in biomechanics and biomedical engineering-3. Gordon & Breach Science Publishers, Amsterdam
Martin JM, Gascon JM, Carrizo S, Gispert J (1997) Prevalence of sleep apnea syndrome in the Spanish adult population. Int J Epidemiol 26(2)
National Commission on Sleep Disorders Research (1993) Wake up America: a national sleep alert. U.S. Government Printing Office, Washington, D.C
Nikias CL, Petropulu AP (1993) Higher-order spectra analysis: a nonlinear signal processing framework. Prentice Hall, Englewood Cliffs
Oppenheim AV, Schafer RW (1989) Discrete-time signal processing. Prentice Hall, Englewood Cliffs
Puvanendran K, Goh KL (1999) From snoring to sleep apnea in a Singapore population. Sleep Res Online 2(1):11–14
Ronald J, Delaive K, Roos L, Manfreda J, Kryger MH (1998) Obstructive sleep apnea patients use more health care resources ten years prior to diagnosis. Sleep Res Online 1(1):71–74
Sondhi MM (1968) New methods of pitch extraction. IEEE Trans Audio Electroacoust 16:262–266
Sola-Soler J et al (2002) Pitch analysis in snoring signals from simple snorers and patients with obstructive sleep apnea. EMBS/BMES Conference. IEEE, Houston
The National Sleep Disorders Research Plan (2003) National Institute of Health, USA
The Boston consulting group (2003) Proposal for a National Sleep Health Agenda, Australia
Udwadia ZF, Doshi AV, Lonkar SG, Singh CI (2004) Prevalence of sleep disordered breathing and sleep apnea in middle aged urban Indian men. Am J Respir Crit Care Med 169(2):168–173
Van Brunt DL, Lichstein KL, Noe SL, Aguillard RN, Lester KW (1997) Intensity pattern of sleep sounds as a predictor for obstructive sleep apnea. Sleep 20(12):1151–1156
Wakwella AS, Abeyratne UR, Kinouchi Y (2004) Automatic segmentation and pitch/jitter tracking of sleep disturbed breathing sounds. In: 8th international conference on control, automation, robotics and vision, Kunming, China
Wakwella AS, Abeyratne UR, Hukins C (2004) Snore based systems for the diagnosis of apnea: a novel feature and its receiver operating characteristics for a full-night clinical database. IEEE International Workshop on BioMedical Circuits and Systems, Singapore
Wilson K et al (1999) The snoring spectrum—acoustic assessment of snoring sound intensity in 1,139 individuals undergoing Polysomnography. Chest 115(3):762–770
Young T, Palta M, Dempsey J, Skatrud J, Weber S, Badr S (1993) The occurrence of sleep-disordered breathing among middle-aged adults. New Engl J Med 328(17):1230–1235
Young T, Evans L, Finn L, Palta M (1997) Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle aged men and women. Sleep 20:705–706
Young T, Peppard PE, Gottlieb DJ (2002) Epidemiology of obstructive sleep apnea; a population health perspective. Am J Respir Crit Care Med 165:1217–1239
Appendices
Appendix A
Appendix A describes a novel method to jointly estimate the pitch period and the TAR (or VTR) using an exhaustive search process.
The approach is to assume a set of test values q for the pitch of \(x_{v}(n)\) and then determine the best value \(q_{\text{opt}}\) of q based on a performance measure (see Eq. (14)). The quantity \(q_{\text{opt}}\) is then taken as the pitch of \(x_{v}(n)\). Steps (S1)–(S8) below describe the procedure (see Fig. 15 for a block diagram).
(S1) Initialization: Set q = 1; set the window parameter l = 1.
(S2) Estimate \({\tilde{h}}_{v} (n)\) via (5)–(8) and (9)–(12) for the assumed values of q and l; generate a test data block \({\tilde{s}}_{v} (n)\) via \({\tilde{s}}_{v} (n) = {\tilde{x}}_{v} (n)* {\tilde{h}}_{v} (n),\) where \({\tilde{x}}_{v} (n)\) is the unit impulse train with the inter-impulse spacing set to the assumed pitch q.
(S3) Normalize the original data block \(s_{v}(n)\) and the test block \({\tilde{s}}_{v} (n)\) so that each has an rms value of 1; let the fast Fourier transforms (FFTs) of the two blocks be \(S_{v}(j)\) and \({\tilde{S}}_{v} (j), \quad j=0,1,2,\ldots, N-1,\) respectively, where N is the length of the FFT.
(S4) Calculate the distance measure e(l,q), defined as:
$$ e(l,q) = \sqrt{\frac{\sum_{j = 0}^{N - 1} {\left({\vert S_{v} (j) \vert - \vert {\tilde{S}}_{v} (j) \vert} \right)}^{2}}{N}} $$(14)
(S5) Repeat steps (S2)–(S4), increasing q to q + 1 at each repetition, until q reaches an arbitrarily large number \(M_{q}\). Write each e(l,q), \(q = 1, 2, \ldots, M_{q}\), into row l, column q of the \(M_{l} \times M_{q}\) error matrix E.
(S6) Reset q = 1 and repeat steps (S2)–(S5), increasing l to l + 1 at each repetition, until l reaches an arbitrarily large number \(M_{l}\), so that row l of E is filled for each \(l = 1, 2, \ldots, M_{l}\).
(S7) Find the global minimum entry \(e_{\min}(l,q)\) of the \(M_{l} \times M_{q}\) error matrix E.
(S8) Define a threshold value \(\varepsilon\) for \(e_{\min}(l,q)\). If \(e_{\min}(l,q) < \varepsilon\), the data block is periodic and thus contains a voiced segment. Take the corresponding value of q as the optimum pitch value \(q_{\text{opt}}\). Use \(q_{\text{opt}}\) as the pitch of the source excitation \(x_{v}(n)\), and consider the corresponding \({\tilde{h}}_{v} (n)\) to be the TAR. If \(e_{\min}(l,q) > \varepsilon\), the data block is treated as coming from an unvoiced segment, and no pitch value is recorded.
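The search in steps (S1)–(S8) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the NumPy-based layout, and in particular the callback `estimate_tar(s, q, l)` are hypothetical. The callback stands in for the HOS-based TAR estimator of (5)–(12), which is not reproduced here.

```python
import numpy as np


def unit_rms(x):
    """Scale a block so that its rms value is 1 (step S3)."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    return x / rms if rms > 0 else x


def spectral_distance(s, s_test):
    """Eq. (14): rms difference between the FFT magnitudes of two blocks."""
    n_fft = max(len(s), len(s_test))
    mag = np.abs(np.fft.fft(unit_rms(s), n_fft))
    mag_test = np.abs(np.fft.fft(unit_rms(s_test), n_fft))
    return np.sqrt(np.mean((mag - mag_test) ** 2))


def impulse_train(length, q):
    """Unit impulse train with one impulse every q samples."""
    x = np.zeros(length)
    x[::q] = 1.0
    return x


def pitch_search(s, estimate_tar, max_q, max_l, eps):
    """Exhaustive search over (l, q); returns (q_opt, tar, E).

    estimate_tar(s, q, l) is a HYPOTHETICAL callback standing in for the
    TAR estimator of (5)-(12); it must return an impulse response array.
    """
    s = np.asarray(s, dtype=float)
    n = len(s)
    E = np.empty((max_l, max_q))              # error matrix, rows l, columns q
    for l in range(1, max_l + 1):             # step S6
        for q in range(1, max_q + 1):         # step S5
            h = estimate_tar(s, q, l)         # step S2
            s_test = np.convolve(impulse_train(n, q), h)[:n]
            E[l - 1, q - 1] = spectral_distance(s, s_test)  # steps S3, S4
    row, col = np.unravel_index(np.argmin(E), E.shape)      # step S7
    if E[row, col] < eps:                     # step S8: voiced block
        q_opt, l_opt = int(col) + 1, int(row) + 1
        return q_opt, estimate_tar(s, q_opt, l_opt), E
    return None, None, E                      # treated as unvoiced
```

For a quick check, one can synthesize a block as an impulse train of known period convolved with a known response and pass a toy callback that simply returns that response; the search should then recover the known period. The cost grows as \(M_{l} \times M_{q}\) TAR estimations per block, which is the price of the exhaustive search.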
Appendix B
Minimum-phase/all-pass decomposition of the TAR
Any mixed-phase signal can be represented as the convolution of a unique minimum-phase equivalent and an all-pass component [17]. Because \({\tilde{h}}(n)\) is modeled as a mixed-phase sequence and estimated with its true phase characteristics preserved, it is amenable to the min-phase/all-pass decomposition given by:
$$ {\tilde{h}}(n) = {\tilde{h}}_{m}(n) * {\tilde{h}}_{a}(n) $$(15)
In (15), \({\tilde{h}}_{m} (n)\) is the minimum-phase equivalent of \({\tilde{h}}(n)\), such that their Fourier transform magnitudes satisfy \(\vert H(e^{j\omega}) \vert = \vert H_{m}(e^{j\omega}) \vert\), and \({\tilde{h}}_{a} (n)\) is the all-pass sequence of \({\tilde{h}}(n).\)
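Then,
$$ \vert H(e^{j\omega}) \vert = \vert H_{m}(e^{j\omega}) \vert \, \vert H_{a}(e^{j\omega}) \vert $$(16)
and
$$ \angle H(e^{j\omega}) = \angle H_{m}(e^{j\omega}) + \angle H_{a}(e^{j\omega}) $$(17)
where \(H(e^{j\omega})\), \(H_{m}(e^{j\omega})\) and \(H_{a}(e^{j\omega})\) are the Fourier transforms of \({\tilde{h}}(n),\, {\tilde{h}}_{m} (n)\) and \({\tilde{h}}_{a}(n),\) respectively.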
Note that in the special case when \({\tilde{h}}(n)\) is a pure minimum-phase signal, \(\vert H_{a}(e^{j\omega}) \vert = 1\) and \(\angle H_{a} (e^{{j\omega}}) = 0.\)
Figure 16 shows the process of decomposition of \({\tilde{h}}(n)\) into the minimum-phase and all pass components.
The real cepstrum \({\hat{c}}_{h} (n)\) of \({\tilde{h}}(n),\) a nonlinear operator used in speech analysis, is frequency-invariant linear filtered [17] to obtain the complex cepstrum \({\hat{h}}_{m} (n)\) of the minimum-phase sequence \({\tilde{h}}_{m} (n),\) via:
$$ {\hat{h}}_{m}(n) = {\hat{c}}_{h}(n)\, w_{m}(n) $$(18)
where \(w_{m}(n) = 2u(n) - \delta(n)\), with \(u(n)\) the unit step and \(\delta(n)\) the unit impulse.
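A brief sketch of why this window recovers the minimum-phase cepstrum, following the standard argument in [17] (the layout below is for illustration only): since \({\tilde{h}}_{m} (n)\) has the same Fourier transform magnitude as \({\tilde{h}}(n)\), the two share the real cepstrum \({\hat{c}}_{h} (n)\), which is the even part of \({\hat{h}}_{m} (n)\). Because the complex cepstrum of a minimum-phase sequence is causal, the odd part is fixed by the even part:

$$ {\hat{c}}_{h}(n) = \tfrac{1}{2}\left[ {\hat{h}}_{m}(n) + {\hat{h}}_{m}(-n) \right], \qquad {\hat{h}}_{m}(n) = 0 \ \text{for} \ n < 0 $$

$$ \Rightarrow \quad {\hat{h}}_{m}(n) = \begin{cases} 0, & n < 0 \\ {\hat{c}}_{h}(0), & n = 0 \\ 2\,{\hat{c}}_{h}(n), & n > 0 \end{cases} \;=\; {\hat{c}}_{h}(n)\left[ 2u(n) - \delta(n) \right] $$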
Now we obtain the minimum phase equivalent signal \({\tilde{h}}_{m} (n)\) of \({\tilde{h}}(n)\) from the inverse complex operation in (11).
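Transforming (15) into the complex cepstrum domain, where convolution becomes addition, and applying a simple algebraic operation, the complex cepstrum \({\hat{h}}_{a} (n)\) of the all-pass sequence \({\tilde{h}}_{a} (n)\) is obtained as:
$$ {\hat{h}}_{a}(n) = {\hat{h}}(n) - {\hat{h}}_{m}(n) $$(19)
where \({\hat{h}}(n)\) denotes the complex cepstrum of \({\tilde{h}}(n).\)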
The all-pass sequence \({\tilde{h}}_{a} (n)\) can now be obtained from the inverse complex operation defined in (11).
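The decomposition described in this appendix can be illustrated with a short NumPy sketch. It is a hedged example rather than the procedure used in this work: the minimum-phase part is obtained by windowing the real cepstrum with \(2u(n) - \delta(n)\) as described above, but the all-pass part is computed by spectral division, \(H_{a} = H/H_{m}\), instead of by subtraction of complex cepstra. The division is an equivalent shortcut in the frequency domain that avoids phase unwrapping. The function name `minphase_allpass` and the FFT length `n_fft` are assumptions made for illustration.

```python
import numpy as np


def minphase_allpass(h, n_fft=4096):
    """Split h into minimum-phase and all-pass parts (length n_fft each).

    n_fft should be much longer than h so that cepstral aliasing is small.
    """
    H = np.fft.fft(np.asarray(h, dtype=float), n_fft)
    # Real cepstrum: inverse FFT of the log magnitude (clamped to avoid log(0)).
    log_mag = np.log(np.maximum(np.abs(H), 1e-12))
    c = np.fft.ifft(log_mag).real

    # Window 2u(n) - delta(n) in circular (DFT) indexing: keep n = 0,
    # double the positive quefrencies, zero the negative ones.
    w = np.zeros(n_fft)
    w[0] = 1.0
    w[1:(n_fft + 1) // 2] = 2.0
    if n_fft % 2 == 0:
        w[n_fft // 2] = 1.0           # Nyquist bin is shared by both halves

    # Complex cepstrum of the minimum-phase part, mapped back to a spectrum.
    H_m = np.exp(np.fft.fft(c * w))
    # All-pass part by spectral division (assumed shortcut, see lead-in).
    H_a = H / H_m

    h_m = np.fft.ifft(H_m).real
    h_a = np.fft.ifft(H_a).real
    return h_m, h_a


# Example: zeros at z = 0.5 (inside) and z = 2 (outside the unit circle).
h_m, h_a = minphase_allpass([1.0, -2.5, 1.0])
```

In this example the zero at z = 2 is reflected to z = 0.5, so the leading samples of `h_m` should be close to the sequence 2, -2, 0.5, while `h_a` carries the remaining phase with unit magnitude at every frequency. For a sequence that is already minimum phase, `h_a` reduces to a unit impulse.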
Abeyratne, U.R., Karunajeewa, A.S. & Hukins, C. Mixed-phase modeling in snore sound analysis. Med Bio Eng Comput 45, 791–806 (2007). https://doi.org/10.1007/s11517-007-0186-x