IXHEALTH: A Multilingual Platform for Advanced Speech Recognition in Healthcare

  • Conference paper
  • In: Technologies and Innovation (CITI 2016)

Abstract

Many healthcare systems focus on optimizing and improving processes such as the generation of medical records and medical test reports. Interaction with this kind of system is mainly done through user interfaces that require a keyboard or a mouse, which reduces the productivity of healthcare professionals. For example, pathological anatomy professionals use both sight and hands to analyse a sample under a microscope; operating information systems through traditional interfaces (keyboard and mouse) therefore costs them considerable time and effort. This work presents IXHEALTH, a multilingual platform for advanced speech recognition that allows healthcare professionals to perform transcription and dictation activities, and to define and manage voice commands for interacting with healthcare information systems. IXHEALTH was evaluated in terms of its ability to let users perform dictation activities and interact with healthcare information systems by means of speech recognition and natural language technologies. The evaluation results are promising and indicate that the IXHEALTH platform is highly useful to healthcare professionals.

Acknowledgments

This work has been supported by the Murcian Government (Instituto de Fomento de la Región de Murcia) and the European Commission (FEDER/ERDF) through project IXHEALTH (2015.08.ID + I.0011).

Author information

Corresponding author

Correspondence to Rafael Valencia-García.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Vivancos-Vicente, P.J., Castejón-Garrido, J.S., Paredes-Valverde, M.A., Salas-Zárate, M.d.P., Valencia-García, R. (2016). IXHEALTH: A Multilingual Platform for Advanced Speech Recognition in Healthcare. In: Valencia-García, R., Lagos-Ortiz, K., Alcaraz-Mármol, G., del Cioppo, J., Vera-Lucio, N. (eds) Technologies and Innovation. CITI 2016. Communications in Computer and Information Science, vol 658. Springer, Cham. https://doi.org/10.1007/978-3-319-48024-4_3

  • DOI: https://doi.org/10.1007/978-3-319-48024-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48023-7

  • Online ISBN: 978-3-319-48024-4

  • eBook Packages: Computer Science, Computer Science (R0)
