Dynamic Bayesian Network Inversion for Robust Speech Recognition

Lei XIE
Hongwu YANG

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E90-D, No.7, pp.1117-1120
Publication Date: 2007/07/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e90-d.7.1117
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Speech and Hearing
Keyword: speech recognition, hidden Markov model, dynamic Bayesian network




Summary: 
This paper presents DBNI, an inversion algorithm for dynamic Bayesian networks (DBNs) aimed at robust speech recognition. DBNI generalizes hidden Markov model inversion (HMMI). As the dual of expectation-maximization (EM)-based model re-estimation, DBNI recovers the 'uncontaminated' speech by moving the noisy input speech toward the Gaussian means in the maximum likelihood (ML) sense, given DBN models trained on clean speech. The algorithm thus combines the expressive power of DBNs with the noise-removal capability of model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.
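
The idea of moving a noisy input toward the Gaussian means in the ML sense can be illustrated with a much simplified sketch. The code below is not the DBNI algorithm from the paper: it assumes a single feature frame and a diagonal-covariance Gaussian mixture standing in for the clean-speech model, and it ignores the temporal and hidden-variable structure of a DBN. All names (`inversion_step`, `invert`, `log_likelihood`) are hypothetical. Each step holds the model fixed, computes component posteriors for the current input, and then replaces the input with the precision-weighted average of the means, which is the EM-style update applied to the input rather than to the model parameters.

```python
import numpy as np


def log_gauss_diag(x, means, variances):
    """Log density of x under each diagonal Gaussian; returns shape (K,)."""
    d = x.shape[0]
    diff = x - means  # broadcast to shape (K, D)
    return -0.5 * (
        d * np.log(2.0 * np.pi)
        + np.sum(np.log(variances), axis=1)
        + np.sum(diff ** 2 / variances, axis=1)
    )


def log_likelihood(x, weights, means, variances):
    """Log p(x) under the mixture, computed with a stable log-sum-exp."""
    log_joint = np.log(weights) + log_gauss_diag(x, means, variances)
    m = log_joint.max()
    return m + np.log(np.sum(np.exp(log_joint - m)))


def inversion_step(x, weights, means, variances):
    """One EM-style update of the input x with the model held fixed."""
    # E-step: posterior of each component given the current input.
    log_joint = np.log(weights) + log_gauss_diag(x, means, variances)
    gamma = np.exp(log_joint - log_joint.max())
    gamma /= gamma.sum()
    # M-step on the input: precision-weighted average of the means.
    precision = gamma[:, None] / variances  # shape (K, D)
    return np.sum(precision * means, axis=0) / np.sum(precision, axis=0)


def invert(x, weights, means, variances, n_iter=10):
    """Repeat the update; the likelihood of x never decreases."""
    for _ in range(n_iter):
        x = inversion_step(x, weights, means, variances)
    return x
```

In this toy setting, a frame lying between two mixture components is pulled toward the component that explains it best, and with a single Gaussian one step lands exactly on the mean. In a full DBN the posteriors would instead come from inference over the network's hidden variables, which this sketch does not attempt.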

