IXHEALTH: A Multilingual Platform for Advanced Speech Recognition in Healthcare

Vivancos-Vicente, Pedro José; Castejón-Garrido, Juan Salvador; Paredes-Valverde, Mario Andrés; Salas-Zárate, María del Pilar; Valencia-García, Rafael

doi:10.1007/978-3-319-48024-4_3


Part of the book series: Communications in Computer and Information Science (CCIS, volume 658)

Included in the following conference series:

International Conference on Technologies and Innovation

Abstract

Many healthcare systems aim to optimize and improve processes such as the generation of medical records and medical test reports. Interaction with such systems mainly takes place through user interfaces that require a keyboard or a mouse, which reduces the productivity of healthcare professionals. For example, pathological anatomy professionals use both sight and hands to analyse a sample under a microscope, so operating an information system through a traditional interface (keyboard and mouse) costs considerable time and effort. This work presents IXHEALTH, a multilingual platform for advanced speech recognition that allows healthcare professionals to perform transcription and dictation activities, and to define and manage voice commands for interacting with healthcare information systems. IXHEALTH was evaluated in terms of its ability to let users perform dictation activities and interact with healthcare information systems by means of speech recognition and natural language technologies. The evaluation results are promising and indicate that the IXHEALTH platform is highly useful to healthcare professionals.

Acknowledgments

This work has been supported by the Murcian Government (Instituto de Fomento de la Región de Murcia) and the European Commission (FEDER/ERDF) through project IXHEALTH (2015.08.ID+I.0011).

Author information

Authors and Affiliations

VOCALI Sistemas Inteligentes S.L., Parque Científico de Murcia, Ctra. de Madrid km. 388, Complejo de Espinardo, 30100, Murcia, Spain
Pedro José Vivancos-Vicente & Juan Salvador Castejón-Garrido
Department of Informatics and Systems, Universidad de Murcia, Murcia, Spain
Mario Andrés Paredes-Valverde, María del Pilar Salas-Zárate & Rafael Valencia-García

Corresponding author

Correspondence to Rafael Valencia-García.

Editor information

Editors and Affiliations

Universidad de Murcia, Murcia, Spain
Rafael Valencia-García
Universidad Agraria del Ecuador, Guayaquil, Ecuador
Katty Lagos-Ortiz
Universidad de Castilla-La Mancha, Toledo, Spain
Gema Alcaraz-Mármol
Universidad Agraria del Ecuador, Guayaquil, Ecuador
Javier del Cioppo
Universidad Agraria del Ecuador, Guayaquil, Ecuador
Nestor Vera-Lucio

About this paper

Cite this paper

Vivancos-Vicente, P.J., Castejón-Garrido, J.S., Paredes-Valverde, M.A., Salas-Zárate, M.d.P., Valencia-García, R. (2016). IXHEALTH: A Multilingual Platform for Advanced Speech Recognition in Healthcare. In: Valencia-García, R., Lagos-Ortiz, K., Alcaraz-Mármol, G., del Cioppo, J., Vera-Lucio, N. (eds) Technologies and Innovation. CITI 2016. Communications in Computer and Information Science, vol 658. Springer, Cham. https://doi.org/10.1007/978-3-319-48024-4_3

DOI: https://doi.org/10.1007/978-3-319-48024-4_3
Published: 20 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48023-7
Online ISBN: 978-3-319-48024-4
eBook Packages: Computer Science, Computer Science (R0)
