Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence audio-only in-car voice recognition systems perform poorly. Because the visual modality is immune to acoustic noise, using visual lip information from the driver in audio-visual automatic speech recognition (AVASR) is seen as a viable strategy for circumventing this problem. However, implementing AVASR requires a system that can accurately locate and track the driver's face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver's lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.
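As background on why the Viola-Jones algorithm can run in real time: it evaluates simple rectangular (Haar-like) features on an integral image, so each rectangle sum costs four table lookups regardless of its size. The sketch below is a generic illustration of that idea only, not the authors' implementation; the function names and the toy 4x4 image are assumptions made for the example.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_vertical(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: top half minus bottom half."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


# Toy image (hypothetical): bright upper band over a dark lower band.
toy = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
]
toy_ii = integral_image(toy)
feature_value = two_rect_vertical(toy_ii, 0, 0, 4, 4)
```

A full detector combines many such features in a boosted cascade of classifiers and scans it over the image at multiple scales; that part is omitted here.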
Published in: 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010)
Date of Conference: 10-13 May 2010
Date Added to IEEE Xplore: 18 October 2010