Abstract:
This paper presents a method for generating indexable and browsable keyword metadata from ASR transcripts by leveraging the Web. Search engine queries are built from an ASR transcript and used to retrieve similar text from the Web. The keyword meta information embedded in those pages for search engines is then ranked using a mutual information criterion to derive a keyword set. The proposed method is training-free, allows phrase keyword generation, and can generate words that were not spoken in the ASR transcript, alleviating the impact of ASR out-of-vocabulary (OOV) words. Subjective evaluations on technical presentations demonstrate a clear preference for this approach. Additionally, an objective measure of keyword generation performance is proposed and shown to be a useful guide for tuning and a less onerous alternative to subjective evaluations.
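The ranking step can be illustrated with a minimal sketch. The abstract does not give the exact mutual information criterion, so the scoring below is an assumption: each candidate keyword is scored by its summed, probability-weighted pointwise mutual information with the transcript terms, estimated from document frequencies over the retrieved pages. The function name `rank_keywords`, the page representation, and the sample data are all hypothetical, and the Web retrieval step is replaced by a hand-written list of pages.

```python
import math
from collections import Counter


def rank_keywords(pages, transcript_terms, top_n=5):
    """Rank meta keywords by co-occurrence with transcript terms.

    pages: list of (set of page text terms, list of meta keywords).
    transcript_terms: set of terms taken from the ASR transcript.
    Scoring is an assumed stand-in for the paper's criterion:
    sum over terms t of p(k, t) * log(p(k, t) / (p(k) * p(t))).
    """
    n = len(pages)
    kw_df = Counter()    # pages whose meta keywords include k
    term_df = Counter()  # pages whose text includes transcript term t
    joint = Counter()    # pages containing both k and t

    for text_terms, meta_keywords in pages:
        kws = {k.lower() for k in meta_keywords}
        terms = set(text_terms) & transcript_terms
        kw_df.update(kws)
        term_df.update(terms)
        for k in kws:
            for t in terms:
                joint[(k, t)] += 1

    scores = {}
    for (k, t), count in joint.items():
        p_kt = count / n
        pmi = math.log(p_kt / ((kw_df[k] / n) * (term_df[t] / n)))
        scores[k] = scores.get(k, 0.0) + p_kt * pmi

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]


# Hypothetical retrieved pages: (page text terms, embedded meta keywords).
pages = [
    ({"speech", "recognition", "model"}, ["speech recognition", "acoustics"]),
    ({"speech", "decoder"}, ["speech recognition"]),
    ({"cooking", "recipe"}, ["food"]),
]
transcript_terms = {"speech", "recognition", "decoder"}
ranked = rank_keywords(pages, transcript_terms)
```

In this toy input, the ranked list can contain a phrase keyword ("speech recognition") and a word that never appears among the transcript terms ("acoustics"), which mirrors the two properties the abstract claims. A keyword from a page that shares no terms with the transcript ("food") receives no score and is dropped.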
Published in: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 22-27 May 2011
Date Added to IEEE Xplore: 11 July 2011