Highly Scalable Speech Processing on Data Stream Management System

Nishii, Shunsuke; Suzumura, Toyotaro

doi:10.1007/978-3-642-29035-0_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7239)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Today's applications require speech processing technologies that can handle massive volumes of speech data concurrently. In this paper, we describe the implementation and evaluation of a parallel, scalable speech recognition system that uses Julius as its back end and runs on System S, the data stream management system developed by IBM Research. Experimental results on a parallel and distributed environment with 4 nodes and 16 cores show that throughput increases by a factor of 13.8 compared with a single core. We also demonstrate that the beam management module in our system maintains throughput and recognition accuracy under a varying input data rate.
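To put the reported speedup in context, the short sketch below derives two standard scalability figures from the numbers in the abstract (a speedup of 13.8 on 16 cores): the parallel efficiency and the Karp-Flatt experimentally determined serial fraction. These metrics are not stated in the abstract; they are computed here only as an illustration, and the function names are hypothetical.

```python
# Illustrative only: scalability metrics derived from the abstract's figures.
# The metrics themselves are standard; the paper does not report them.

def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


def karp_flatt_serial_fraction(speedup: float, cores: int) -> float:
    """Karp-Flatt metric: the serial fraction implied by a measured speedup."""
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    speedup, cores = 13.8, 16  # values taken from the abstract
    print(f"efficiency      = {parallel_efficiency(speedup, cores):.4f}")
    print(f"serial fraction = {karp_flatt_serial_fraction(speedup, cores):.4f}")
```

Under these figures the system reaches roughly 86% of ideal linear scaling, which corresponds to an implied serial fraction of about 1%.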

Author information

Authors and Affiliations

Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, Japan
Shunsuke Nishii & Toyotaro Suzumura
IBM Research - Tokyo, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa, Japan
Toyotaro Suzumura

Editor information

Editors and Affiliations

School of Computer Science and Engineering, Seoul National University, Gwanak-ro, Gwanak-gu, 151747, Seoul, South Korea
Sang-goo Lee
Computer School, Wuhan University, Luo-jia-shan, Wuchang, 430081, Wuhan, Hubei Province, China
Zhiyong Peng
School of Information Technology and Electrical Engineering, University of Queensland, 4072, Brisbane, QLD, Australia
Xiaofang Zhou
Department of Computer Science, Kangwon National University, 192-1, Hyoja2-Dong, 200701, Chuncheon, Kangwon, South Korea
Yang-Sae Moon
Institute for Computer Science and Business Information, University of Duisburg-Essen, Schützenbahn 70, 45117, Essen, Germany
Rainer Unland
School of Information and Communication Engineering, Chungbuk National University, 52 Naesudong-ro, Heungdeok-gu, 4072, Cheongju, Chungbuk, South Korea
Jaesoo Yoo

About this paper

Cite this paper

Nishii, S., Suzumura, T. (2012). Highly Scalable Speech Processing on Data Stream Management System. In: Lee, Sg., Peng, Z., Zhou, X., Moon, YS., Unland, R., Yoo, J. (eds) Database Systems for Advanced Applications. DASFAA 2012. Lecture Notes in Computer Science, vol 7239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29035-0_14

DOI: https://doi.org/10.1007/978-3-642-29035-0_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29034-3
Online ISBN: 978-3-642-29035-0
eBook Packages: Computer Science, Computer Science (R0)
