ISCA Archive Interspeech 2018

Breathy to Tense Voice Discrimination using Zero-Time Windowing Cepstral Coefficients (ZTWCCs)

Sudarsana Reddy Kadiri, Bayya Yegnanarayana

In this paper, we consider breathy and tense voices, which are often regarded as opposite ends of a voice quality continuum. Beyond the linguistic message, such aspects of a speaker's voice convey information to the listener about mood, attitude, and emotional state. The glottal pulse characteristics of different phonation types vary with the tension of the laryngeal muscles together with the respiratory effort. In the present study, we derive features that capture the effects of excitation on the vocal tract system through a signal processing method called the zero-time windowing (ZTW) method. The ZTW method gives an instantaneous spectrum with high spectral resolution that captures changes in the speech production mechanism. The cepstral coefficients derived from the ZTW method are used for the classification of phonation types. Along with these zero-time windowing cepstral coefficients (ZTWCCs), we use excitation source features derived from the zero frequency filtering (ZFF) method: the strength of excitation, the energy of excitation, a loudness measure, and the ZFF signal energy. Classification experiments using ZTWCC and excitation features show a significant improvement in the detection of phonation type over existing voice quality features and MFCCs.
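The core idea of ZTW can be sketched as follows: a heavily decaying window, applied at the analysis instant, emphasises the samples closest to that instant, and cepstral coefficients are then taken from the log spectrum of the windowed segment. This is only an illustrative sketch, not the paper's full pipeline (which additionally involves steps such as successive differencing and the Hilbert envelope of the numerator group delay spectrum); the function name, window form, and parameter choices below are our own illustrative assumptions.

```python
import numpy as np

def ztw_cepstra(frame, n_fft=1024, n_ceps=13):
    """Illustrative zero-time-windowing cepstra (sketch, not the paper's exact method).

    A sharply decaying window of the form 1 / (4 sin^2(pi n / 2N)), applied
    twice, emphasises samples near the analysis instant, yielding an
    'instantaneous' spectrum from which cepstral coefficients are taken.
    """
    N = len(frame)
    n = np.arange(1, N + 1)                      # start at 1 to avoid sin(0)
    w = 1.0 / (4.0 * np.sin(np.pi * n / (2.0 * N)) ** 2)
    windowed = frame * w * w                     # window applied twice for sharper decay
    spec = np.abs(np.fft.rfft(windowed, n_fft)) + 1e-12
    log_spec = np.log(spec)
    # DCT-II of the log spectrum -> cepstral coefficients
    k = np.arange(len(log_spec))
    ceps = np.array([
        np.sum(log_spec * np.cos(np.pi * q * (2 * k + 1) / (2 * len(log_spec))))
        for q in range(n_ceps)
    ])
    return ceps
```

In practice such coefficients would be computed at every sample (or every few samples) of a voiced region and fed, together with the ZFF-derived excitation features, to a classifier.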


doi: 10.21437/Interspeech.2018-2498

Cite as: Kadiri, S.R., Yegnanarayana, B. (2018) Breathy to Tense Voice Discrimination using Zero-Time Windowing Cepstral Coefficients (ZTWCCs). Proc. Interspeech 2018, 232-236, doi: 10.21437/Interspeech.2018-2498

@inproceedings{kadiri18b_interspeech,
  author={Sudarsana Reddy Kadiri and Bayya Yegnanarayana},
  title={{Breathy to Tense Voice Discrimination using Zero-Time Windowing Cepstral Coefficients (ZTWCCs)}},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={232--236},
  doi={10.21437/Interspeech.2018-2498}
}