Abstract:
Voice activity detection and phonetically based segmentation are used to classify input speech into four modes: onset, silence, unvoiced and voiced. Each phonetic segment is coded at a bitrate suited to its mode, using a G.726 ADPCM encoder and preserving distinct encoder state information for each mode. The proposed speech coder achieves a PESQ-MOS equivalent to that of G.726 ADPCM at 24 kbps, but at an average rate below 16 kbps, when encoding a typical telephone conversation. A moderate encoder delay of 40 ms is incurred.
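To illustrate how mode-dependent rates can yield an average below 16 kbps, here is a minimal sketch of the bookkeeping. The abstract does not state which rate is assigned to which mode, so the `MODE_RATE_KBPS` mapping, the treatment of silence as untransmitted (0 kbps), and the example frame sequence are all hypothetical; only the four mode names and the standard G.726 rates (16, 24, 32 and 40 kbps) are taken as given.

```python
# Standard G.726 ADPCM rates in kbps (2, 3, 4 or 5 bits per sample at 8 kHz).
G726_RATES_KBPS = (16, 24, 32, 40)

# HYPOTHETICAL mode-to-rate mapping, for illustration only.
# The abstract does not give the actual assignment; silence is assumed
# here to cost 0 kbps (i.e. not transmitted), which is also an assumption.
MODE_RATE_KBPS = {
    "silence": 0,
    "unvoiced": 16,
    "voiced": 24,
    "onset": 32,
}


def average_rate_kbps(frame_modes, mode_rate=MODE_RATE_KBPS):
    """Return the mean bitrate over equal-length frames labelled by mode."""
    if not frame_modes:
        raise ValueError("frame_modes must not be empty")
    return sum(mode_rate[mode] for mode in frame_modes) / len(frame_modes)


if __name__ == "__main__":
    # Made-up sequence of ten equal-length frames, not measured data.
    frames = ["silence"] * 4 + ["onset"] + ["voiced"] * 3 + ["unvoiced"] * 2
    print(f"average rate: {average_rate_kbps(frames):.1f} kbps")
```

With this invented frame mix the average comes out at 13.6 kbps; the real figure depends on the coder's actual mode-to-rate assignment and on how much of the conversation falls into each mode.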
Date of Conference: 26-29 October 2008
Date Added to IEEE Xplore: 12 June 2009