Abstract
In this study, the BASRAH system was used to calculate confidence measures (CMs) and to use them to label each word output by an automatic speech recognition (ASR) system as either accepted or rejected. This word-level label can be used to reduce the impact of ASR transcription errors on retrieval performance. The system can also process multilingual broadcasts, which is more challenging than handling a single language. The BASRAH system provides CMs for ASR output on large data sets, based on a word acoustic score. In a case study, we successfully used the BASRAH system first to calculate CMs to clean up transcriptions of spoken multilingual (English and Malay) broadcast news and then to identify the boundaries of the broadcast news stories.
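The accept/reject step described above can be illustrated with a minimal sketch. The abstract does not give the CM formula, so this example assumes a simple one: the word's acoustic log-likelihood divided by its duration in frames, compared against a fixed threshold. The names `Word`, `confidence`, `tag_words`, and `accepted_text`, the field layout, and the threshold value are all hypothetical and are not taken from the BASRAH system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    acoustic_score: float  # assumed: total acoustic log-likelihood of the word
    n_frames: int          # assumed: number of acoustic frames the word spans


def confidence(word: Word) -> float:
    """Assumed CM: per-frame acoustic log-likelihood (not the paper's formula)."""
    return word.acoustic_score / max(word.n_frames, 1)


def tag_words(words: List[Word], threshold: float) -> List[Tuple[str, str]]:
    """Label each recognized word 'accept' or 'reject' by thresholding its CM."""
    return [
        (w.text, "accept" if confidence(w) >= threshold else "reject")
        for w in words
    ]


def accepted_text(words: List[Word], threshold: float) -> str:
    """Keep only accepted words, e.g. before indexing or story segmentation."""
    return " ".join(t for t, label in tag_words(words, threshold) if label == "accept")


if __name__ == "__main__":
    # Toy mixed Malay/English hypothesis with made-up scores.
    hypothesis = [
        Word("berita", -120.0, 30),
        Word("uh", -300.0, 20),
        Word("news", -90.0, 25),
    ]
    print(tag_words(hypothesis, threshold=-5.0))
    print(accepted_text(hypothesis, threshold=-5.0))
```

In a sketch like this, the threshold would normally be tuned on held-out data; filtering rejected words before retrieval or boundary detection is one way the labels could reduce the effect of transcription errors.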
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khalaf Aleqili, Z.A. (2012). The BASRAH System: A Method for Spoken Broadcast News Story Clustering. In: Benlamri, R. (eds) Networked Digital Technologies. NDT 2012. Communications in Computer and Information Science, vol 293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30507-8_12
DOI: https://doi.org/10.1007/978-3-642-30507-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30506-1
Online ISBN: 978-3-642-30507-8
eBook Packages: Computer Science, Computer Science (R0)