Abstract
Many healthcare systems today aim to optimize and improve processes such as the generation of medical records and medical test reports. Interaction with these systems is mainly done through user interfaces that require a keyboard or mouse, which reduces the productivity of healthcare professionals. For example, pathological anatomy professionals use both sight and hands to analyse a sample under a microscope; operating an information system through traditional interfaces (keyboard and mouse) therefore costs them considerable time and effort. This work presents IXHEALTH, a multilingual platform for advanced speech recognition that allows healthcare professionals to perform transcription and dictation activities and to define and manage voice commands for interacting with healthcare information systems. IXHEALTH was evaluated in terms of its ability to let users perform dictation activities and interact with healthcare information systems by means of speech recognition and natural language technologies. The evaluation results are promising and indicate that the IXHEALTH platform is highly useful to healthcare professionals.
Acknowledgments
This work has been supported by the Murcian Government (Instituto de Fomento de la Región de Murcia) and the European Commission (FEDER/ERDF) through project IXHEALTH (2015.08.ID + I.0011).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Vivancos-Vicente, P.J., Castejón-Garrido, J.S., Paredes-Valverde, M.A., Salas-Zárate, M.d.P., Valencia-García, R. (2016). IXHEALTH: A Multilingual Platform for Advanced Speech Recognition in Healthcare. In: Valencia-García, R., Lagos-Ortiz, K., Alcaraz-Mármol, G., del Cioppo, J., Vera-Lucio, N. (eds) Technologies and Innovation. CITI 2016. Communications in Computer and Information Science, vol 658. Springer, Cham. https://doi.org/10.1007/978-3-319-48024-4_3
DOI: https://doi.org/10.1007/978-3-319-48024-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48023-7
Online ISBN: 978-3-319-48024-4
eBook Packages: Computer Science, Computer Science (R0)