Mixed-phase modeling in snore sound analysis

Original Article

Medical & Biological Engineering & Computing

Abstract

Obstructive sleep apnea (OSA) is a highly prevalent disease in which the upper airways collapse during sleep, leading to serious consequences. The gold standard of diagnosis, polysomnography (PSG), requires a full-night hospital stay with the patient connected to over 15 measurement channels through contact sensors. PSG is expensive and unsuited for community screening. Snoring is the earliest symptom of OSA, but its potential in OSA diagnosis is not yet fully recognized. In this paper, we propose a novel model for snore-related sounds (SRS) as the response of a mixed-phase system (total airways response, TAR) to a source excitation at the input. The TAR/source model is similar to the vocal tract/source model in speech synthesis, and is capable of capturing acoustical changes brought about by the collapsing upper airways in OSA. We propose an algorithm based on higher-order spectra (HOS) to jointly estimate the source and the TAR, preserving the true phase characteristics of the latter. Working on a clinical database of signals, we show that the TAR is indeed a mixed-phase signal and that second-order statistics cannot fully characterize it. Night-time speech sounds can corrupt snore recordings and pose a challenge to snore-based OSA diagnosis. We show that the TAR can be used to detect speech segments embedded in snores and to derive features for diagnosing OSA via non-contact, low-cost instrumentation, which holds potential as a community screening device.


References

  1. Abeyratne UR (1999) Blind reconstruction of non-minimum phase systems from 1-D oblique slices of the bispectrum. IEE Proc Vis Image Signal Process 146(5):253–26

  2. Abeyratne UR, Patabandi CKK, Puvanendran K (2001) Pitch–jitter analysis of snoring signals in the diagnosis of obstructive sleep apnea. In: 23rd annual international conference of the IEEE Engineering in Medicine and Biology Society (IEEE EMBC2001), Istanbul, Turkey

  3. Abeyratne UR, Wakwella AS, Hukins C (2005) Pitch jump probability measure for the analysis of snoring sound in apnea. Physiol Meas 26(5):779–798

  4. Bassiri AG, Guilleminault C (2000) Clinical features and evaluations of obstructive sleep apnea. In: Kryger T, Roth WD (eds) The principles and practice of sleep medicine. W.B. Saunders Co., Philadelphia

  5. Diagnostic Classification Steering Committee (1990) International classification of sleep disorders: diagnostic and coding manual. American Sleep Disorder Association, Rochester, Minnesota

  6. Fiz JA, Abad J, Jane R, Riera M, Mananas MA, Caminal P, Rodenstein D, Morera J (1996) Acoustic analysis of snoring in patients with simple snoring and obstructive sleep apnea. Eur Resp J 9(11):2365–2370

  7. Flemons WW, Rowley JA, Anderson WM, McEvoy RD (2003) Home diagnosis of sleep apnea: a systematic review of literature. Chest 124(4):1543–1579

  8. Hoffstein V (2000) Snoring. In: Kryger T, Roth WD (eds) The principles and practice of sleep medicine, 3rd edn. W.B. Saunders Co., Philadelphia

  9. Issa FG, Morrison D, Hadjuk E, Iyer A, Feroah T, Remmers JE (1993) Digital monitoring of obstructive sleep-apnea using snoring sound and arterial oxygen-saturation. Sleep 16(8):S132–S132

  10. Kim J, In K et al (2004) Prevalence of sleep disordered breathing in middle aged Korean men and women. Am J Respir Crit Care Med 170(10):1108–1113

  11. Lim PVH, Curry AR (1999) A new method for evaluating and reporting the severity of snoring. J Laryngol Otol 113(4):336–340

  12. Lee TH, Abeyratne UR (1999) Analysis of snoring sounds for the detection of obstructive sleep apnea. Med Biol Eng Comput 37(suppl 2):538–539

  13. Lee TH, Abeyratne UR, Puvanendran K, Goh KL (2000) Formant-structure and phase-coupling analysis of human snoring sounds for detection of obstructive sleep apnea. In: Middleton J, Jones ML, Pande GN (eds) Computer methods in biomechanics and biomedical engineering-3. Gordon & Breach Science Publishers, Amsterdam

  14. Martin JM, Gascon JM, Carrizo S, Gispert J (1997) Prevalence of sleep apnea syndrome in the Spanish adult population. Int J Epidemiol 26(2)

  15. National Commission on Sleep Disorders Research (1993) Wake up America: a national sleep alert. U.S. Government Printing Office, Washington, D.C.

  16. Nikias CL, Petropulu AP (1993) Higher-order spectra analysis: a nonlinear signal processing framework. Prentice Hall, Englewood Cliffs

  17. Oppenheim AV, Schafer RW (1989) Discrete-time signal processing. Prentice Hall, Englewood Cliffs

  18. Puvanendran K, Goh KL (1999) From snoring to sleep apnea in a Singapore population. Sleep Res Online 2(1):11–14

  19. Ronald J, Delaive K, Roos L, Manfreda J, Kryger MH (1998) Obstructive sleep apnea patients use more health care resources ten years prior to diagnosis. Sleep Res Online 1(1):71–74

  20. Sondhi MM (1968) New methods of pitch extraction. IEEE Trans Audio Electroacoust 16:262–266

  21. Sola-Soler J et al (2002) Pitch analysis in snoring signals from simple snorers and patients with obstructive sleep apnea. EMBS/BMES Conference. IEEE, Houston

  22. The National Sleep Disorders Research Plan (2003) National Institute of Health, USA

  23. The Boston Consulting Group (2003) Proposal for a National Sleep Health Agenda, Australia

  24. Udwadia ZF, Doshi AV, Lonkar SG, Singh CI (2004) Prevalence of sleep disordered breathing and sleep apnea in middle aged urban Indian men. Am J Respir Crit Care Med 169(2):168–173

  25. Van Brunt DL, Lichstein KL, Noe SL, Aguillard RN, Lester KW (1997) Intensity pattern of sleep sounds as a predictor for obstructive sleep apnea. Sleep 20(12):1151–1156

  26. Wakwella AS, Abeyratne UR, Kinouchi Y (2004) Automatic segmentation and pitch/jitter tracking of sleep disturbed breathing sounds. In: 8th international conference on control, automation, robotics and vision, Kunming, China

  27. Wakwella AS, Abeyratne UR, Hukins C (2004) Snore based systems for the diagnosis of apnea: a novel feature and its receiver operating characteristics for a full-night clinical database. IEEE International Workshop on BioMedical Circuits and Systems, Singapore

  28. Wilson K et al (1999) The snoring spectrum—acoustic assessment of snoring sound intensity in 1,139 individuals undergoing polysomnography. Chest 115(3):762–770

  29. Young T, Palta M, Dempsey J, Skatrud J, Weber S, Badr S (1993) The occurrence of sleep-disordered breathing among middle-aged adults. New Engl J Med 328(17):1230–1235

  30. Young T, Evans L, Finn L, Palta M (1997) Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle aged men and women. Sleep 20:705–706

  31. Young T, Peppard PE, Gottlieb DJ (2002) Epidemiology of obstructive sleep apnea: a population health perspective. Am J Respir Crit Care Med 165:1217–1239

Author information

Corresponding author

Correspondence to Udantha R. Abeyratne.

Appendices

Appendix A

Appendix A describes a novel method to jointly estimate the pitch period and the TAR (or VTR) using an exhaustive search process.

The approach is to assume a set of test values \(q\) for the pitch of \(x_{v}(n)\) and then determine the best value \(q_{\text{opt}}\) of \(q\) based on a performance measure (see Eq. (14)). The quantity \(q_{\text{opt}}\) is then taken as the pitch of \(x_{v}(n)\). Steps (S1)–(S8) below describe the procedure (see Fig. 15 for a block diagram).

  • (S1) Initialization: set \(q = 1\) and set the window parameter \(l = 1\).

  • (S2) Estimate \({\tilde{h}}_{v} (n)\) via (5)–(8) and (9)–(12) for the assumed values of \(q\) and \(l\); generate a test data block \({\tilde{s}}_{v} (n)\) via \({\tilde{s}}_{v} (n) = {\tilde{x}}_{v} (n)* {\tilde{h}}_{v} (n),\) where \({\tilde{x}}_{v} (n)\) is the unit impulse train with the inter-impulse spacing (pitch period) set to the assumed \(q\).

  • (S3) Normalize the original data block \(s_{v}(n)\) and the test block \({\tilde{s}}_{v} (n)\) so that each has an rms value of 1; let the Fast Fourier Transforms (FFTs) of the two blocks be \(S_{v}(j)\) and \({\tilde{S}}_{v} (j), \quad j=0,1,2,\ldots, N-1,\) respectively, where \(N\) is the length of the FFT.

  • (S4) Calculate the distance measure \(e(l,q)\) defined as:

    $$ e(l,q) = \sqrt{\frac{\sum_{j = 0}^{N - 1} {\left({\vert S_{v} (j) \vert - \vert {\tilde{S}}_{v} (j) \vert} \right)}^{2}}{N}} $$
    (14)

  • (S5) Repeat steps (S2)–(S4), increasing \(q\) to \(q + 1\) at each repetition, until \(q\) reaches an arbitrarily large number \(M_{q}\). Write each \(e(l,q)\), \(q = 1, 2, \ldots, M_{q}\), into row \(l\), column \(q\) of the \(M_{l} \times M_{q}\) error matrix \(E\).

  • (S6) Repeat steps (S2)–(S5), resetting \(q\) to 1 and increasing \(l\) to \(l + 1\) at each repetition, until \(l\) reaches an arbitrarily large number \(M_{l}\), so that every row \(l = 1, 2, \ldots, M_{l}\) of \(E\) is filled.

  • (S7) Find the global minimum entry \(e_{\min}(l,q)\) of the \(M_{l} \times M_{q}\) matrix \(E\).

  • (S8) Define a threshold value \(\varepsilon\) for \(e_{\min}(l,q)\). If \(e_{\min}(l,q) < \varepsilon\), the data block is periodic and thus contains a voiced segment. Take the corresponding value of \(q\) as the optimum pitch value \(q_{\text{opt}}\), use \(q_{\text{opt}}\) as the pitch of the source excitation \(x_{v}(n)\), and consider the corresponding \({\tilde{h}}_{v} (n)\) to be the TAR. Otherwise, the data block is treated as coming from an unvoiced segment, and no pitch value is recorded.

Fig. 15 Block diagram of the exhaustive search process in Appendix A
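Steps (S1)–(S8) can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' implementation. The TAR estimator of Eqs. (5)–(8) and (9)–(12) is not reproduced in this appendix, so it is passed in as a hypothetical callable `estimate_tar(s, q, l)`. The function names, the choice of FFT length equal to the block length, and the truncation of the convolution to the block length are all assumptions made for the example.

```python
import numpy as np

def rms_normalize(x):
    """Scale a block to unit rms (step S3). Assumes the block is not all zeros."""
    return x / np.sqrt(np.mean(x ** 2))

def spectral_distance(s, s_test):
    """Eq. (14): rms difference between the FFT magnitudes of two blocks."""
    S = np.abs(np.fft.fft(rms_normalize(s)))
    S_test = np.abs(np.fft.fft(rms_normalize(s_test)))
    return np.sqrt(np.mean((S - S_test) ** 2))

def exhaustive_pitch_search(s, estimate_tar, M_q, M_l, eps):
    """Fill the M_l x M_q error matrix E and pick its global minimum.

    estimate_tar(s, q, l) is a hypothetical stand-in for the estimator of
    Eqs. (5)-(8) and (9)-(12); it must return an impulse response h.
    Returns (q_opt, h_opt, E); q_opt and h_opt are None for unvoiced blocks.
    """
    N = len(s)
    E = np.full((M_l, M_q), np.inf)
    for l in range(1, M_l + 1):              # S6: loop over window parameter
        for q in range(1, M_q + 1):          # S5: loop over candidate pitch
            h = estimate_tar(s, q, l)        # S2: TAR for this (q, l)
            x = np.zeros(N)
            x[::q] = 1.0                     # unit impulse train, spacing q
            s_test = np.convolve(x, h)[:N]   # S2: synthetic test block
            E[l - 1, q - 1] = spectral_distance(s, s_test)  # S3-S4
    row, col = np.unravel_index(np.argmin(E), E.shape)      # S7
    if E[row, col] < eps:                    # S8: voiced block
        q_opt, l_opt = col + 1, row + 1
        return q_opt, estimate_tar(s, q_opt, l_opt), E
    return None, None, E                     # S8: unvoiced block
```

With an oracle estimator that always returns the true impulse response, a block synthesized with a pitch period of 20 samples yields a zero error at `q = 20`, which is a convenient sanity check before plugging in a real estimator. The cost grows as `M_l * M_q` estimator calls, which is the price of the exhaustive search.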

Appendix B

1.1 Minimum-phase/all-pass decomposition of the TAR

Any mixed-phase signal can be represented as the convolution of a unique minimum-phase equivalent and an all-pass component [17]. Because \({\tilde{h}}(n)\) is modeled as a mixed-phase sequence and estimated with its true phase characteristics preserved, it is amenable to the minimum-phase/all-pass decomposition given by:

$$ {\tilde{h}}(n) = {\tilde{h}}_{m} (n)* {\tilde{h}}_{a} (n). $$
(15)

In (15), \({\tilde{h}}_{m} (n)\) is the minimum-phase equivalent of \({\tilde{h}}(n)\), such that their Fourier transform magnitudes are equal, \(\vert H(e^{j\omega}) \vert = \vert H_{m}(e^{j\omega}) \vert\), and \({\tilde{h}}_{a} (n)\) is the all-pass sequence of \({\tilde{h}}(n).\)

Then,

$$ {\left\vert {H_{a} (e^{{j\omega}})} \right\vert} = \frac{{{\left\vert {H(e^{{j\omega}})} \right\vert}}}{{{\left\vert {H_{m} (e^{{j\omega}})} \right\vert}}} = 1, $$
(16)

and

$$ \angle H_{a} (e^{{j\omega}}) = \angle H(e^{{j\omega}}) - \angle H_{m} (e^{{j\omega}}), $$
(17)

where \(H(e^{j\omega})\), \(H_{m}(e^{j\omega})\) and \(H_{a}(e^{j\omega})\) are the Fourier transforms of \({\tilde{h}}(n),\, {\tilde{h}}_{m} (n)\) and \({\tilde{h}}_{a}(n),\) respectively.

Note that in the special case when \({\tilde{h}}(n)\) is a pure minimum-phase signal, \(\vert H_{a}(e^{j\omega}) \vert = 1\) and \(\angle H_{a} (e^{{j\omega}}) = 0.\)
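For a finite-length sequence, whether it is minimum-, maximum- or mixed-phase can be checked from the zeros of its z-transform: all zeros inside the unit circle means minimum phase, all outside means maximum phase, and anything else is mixed. The sketch below is only an illustration of that standard criterion. The function name `classify_phase` and the tolerance are assumptions, and the sketch labels sequences with zeros on the unit circle as mixed for simplicity.

```python
import numpy as np

def classify_phase(h, tol=1e-9):
    """Classify a finite-length sequence h by the location of its zeros.

    np.roots treats h as polynomial coefficients in descending powers,
    so its roots are the zeros of H(z) = sum_n h[n] z^{-n}.
    Zeros on the unit circle fall through to "mixed" in this sketch.
    """
    magnitudes = np.abs(np.roots(h))
    if np.all(magnitudes < 1.0 - tol):
        return "minimum"
    if np.all(magnitudes > 1.0 + tol):
        return "maximum"
    return "mixed"

print(classify_phase([2.0, -2.0, 0.5]))   # both zeros at z = 0.5
print(classify_phase([1.0, -2.5, 1.0]))   # zeros at z = 0.5 and z = 2
```

The two example sequences have the same Fourier transform magnitude, so second-order statistics cannot tell them apart. Only their phase differs.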

Figure 16 shows the decomposition of \({\tilde{h}}(n)\) into its minimum-phase and all-pass components.

Fig. 16 Minimum-phase/all-pass decomposition of the TAR

The real cepstrum \({\hat{c}}_{h} (n)\) of \({\tilde{h}}(n),\) a nonlinear operator used in speech analysis, is passed through a frequency-invariant linear filter [17] to obtain the complex cepstrum \({\hat{h}}_{m} (n)\) of the minimum-phase sequence \({\tilde{h}}_{m} (n)\) via:

$$ {\hat{h}}_{m} (n) = {\hat{c}}_{h} (n)\,w_{m} (n), $$
(18)

where \(w_{m}(n) = 2u(n) - \delta(n)\), with \(u(n)\) the unit step and \(\delta(n)\) the unit impulse.

We then obtain the minimum-phase equivalent signal \({\tilde{h}}_{m} (n)\) of \({\tilde{h}}(n)\) from the inverse complex cepstrum operation in (11).

Transforming (15) into the complex cepstrum domain, where convolution becomes addition, gives \({\hat{h}}(n) = {\hat{h}}_{m} (n) + {\hat{h}}_{a} (n)\). Rearranging, the complex cepstrum \({\hat{h}}_{a} (n)\) of the all-pass sequence \({\tilde{h}}_{a} (n)\) is obtained as:

$$ {\hat{h}}_{a} (n) = {\hat{h}}(n) - {\hat{h}}_{m} (n). $$
(19)

The all-pass sequence \({\tilde{h}}_{a} (n)\) can now be obtained from the inverse complex cepstrum operation defined in (11).
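The decomposition in (15)–(19) can be sketched with FFTs as below. This is a hedged illustration under stated assumptions, not the authors' code. The cepstra are approximated on an `n_fft`-point DFT grid, the window \(w_m(n)\) is folded onto that grid (with weight 1 at the Nyquist index), and \(H(e^{j\omega})\) is assumed to have no zeros on the unit circle. Instead of subtracting complex cepstra as in (19), the sketch divides in the frequency domain, \(H_a = H / H_m\), which is equivalent to (16)–(17) and avoids phase unwrapping. The function name `minphase_allpass` and the default `n_fft` are choices made for the example.

```python
import numpy as np

def minphase_allpass(h, n_fft=1024):
    """Split h into minimum-phase and all-pass parts (Eqs. 15-19, DFT sketch).

    Assumes n_fft is even, much longer than h, and |H| > 0 on the DFT grid.
    Returns (h_m, h_a, H_a): the two sequences and the all-pass spectrum.
    """
    H = np.fft.fft(h, n_fft)
    # Real cepstrum: inverse DFT of the log magnitude spectrum.
    c = np.fft.ifft(np.log(np.abs(H))).real
    # w_m(n) = 2u(n) - delta(n), folded onto the circular DFT index:
    # keep n = 0, double the positive-time half, zero the negative-time half.
    w = np.zeros(n_fft)
    w[0] = 1.0
    w[1:n_fft // 2] = 2.0
    w[n_fft // 2] = 1.0
    h_m_hat = c * w                        # Eq. (18): cepstrum of h_m
    H_m = np.exp(np.fft.fft(h_m_hat))      # inverse cepstrum, frequency side
    H_a = H / H_m                          # Eqs. (16)-(17), replaces Eq. (19)
    h_m = np.fft.ifft(H_m).real
    h_a = np.fft.ifft(H_a).real
    return h_m, h_a, H_a

h = np.array([1.0, -2.5, 1.0])             # zeros at z = 0.5 and z = 2
h_m, h_a, H_a = minphase_allpass(h)
print(np.round(h_m[:3], 6))                # minimum-phase equivalent of h
print(np.allclose(np.abs(H_a), 1.0))       # all-pass magnitude check
```

For this example the zero at \(z = 2\) is reflected to \(z = 0.5\), so the minimum-phase equivalent is \(2(1 - 0.5z^{-1})^{2}\), that is, the sequence 2, −2, 0.5. Feeding a sequence that is already minimum phase returns an all-pass part close to a unit impulse, matching the special case noted after (17).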

About this article

Cite this article

Abeyratne, U.R., Karunajeewa, A.S. & Hukins, C. Mixed-phase modeling in snore sound analysis. Med Bio Eng Comput 45, 791–806 (2007). https://doi.org/10.1007/s11517-007-0186-x
