Automatic Classification of Valve Diseases Through Natural Language Processing in Spanish and Active Learning

Pérez-Sánchez, Pablo; Vicente-Palacios, Víctor; Barreiro-Pérez, Manuel; Díaz-Peláez, Elena; Sánchez-Puente, Antonio; Sampedro-Gómez, Jesús; García-Galindo, Alberto; Dorado-Díaz, P. Ignacio; Sánchez, Pedro L.

doi:10.1007/978-3-030-88163-4_4

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12940)

Included in the following conference series:

International Conference on Bioengineering and Biomedical Signal and Image Processing

Abstract

Correctly labeled data improve healthcare processes and research. However, labeling is difficult and expensive, which limits both its use and its quality. We propose a proof of concept based on natural language processing (NLP) and active learning to automatically structure information from Spanish-language text in the field of echocardiography.

We analyzed echocardiographic reports from the cardiology department of a National Health System hospital. The reports were divided into a training corpus (26,699 reports) and a validation corpus (2,881 reports). The models were designed to automatically label aortic and mitral valve disease (stenosis/insufficiency) and the nature of each valve (native/prosthetic). Model building followed three steps: data preparation, vectorization, and model fitting and validation. Results were compared against the ground truth, which was labeled manually by the physicians who reported the echocardiographic studies.
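The data preparation and vectorization steps can be illustrated with a minimal bag-of-words sketch using only the Python standard library. This is not the authors' pipeline: the abstract does not specify the preprocessing or the vectorizer, so the accent stripping, the tokenization rule, the function names, and the two sample report snippets are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import Counter


def normalize(text):
    """Lowercase the text and strip diacritics ('aórtica' -> 'aortica').

    Assumption: accent-insensitive matching is acceptable. Note that this
    also maps 'ñ' to 'n'.
    """
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def tokenize(text):
    """Split normalized text into alphabetic tokens, dropping punctuation."""
    return re.findall(r"[a-z]+", normalize(text))


def build_vocabulary(corpus):
    """Map each distinct token in the corpus to a column index (sorted order)."""
    tokens = sorted({tok for doc in corpus for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Turn one report into a vector of token counts over the vocabulary."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # tokens unseen at training time are ignored
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical report snippets, invented for this example.
reports = [
    "Válvula aórtica nativa con estenosis severa.",
    "Prótesis mitral normofuncionante, sin insuficiencia.",
]
vocabulary = build_vocabulary(reports)
vectors = [vectorize(report, vocabulary) for report in reports]
```

Each resulting vector could then be passed to any of the classifiers compared in the study. In practice a library vectorizer with TF-IDF weighting or word embeddings would likely replace this hand-rolled version.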

Four machine learning algorithms were compared: logistic regression, naïve Bayes, random forest, and support vector machine. The support vector machine gave the best results, with areas under the ROC curve of 0.92 and 0.93 for aortic and mitral stenosis, 0.87 and 0.89 for aortic and mitral insufficiency, and 0.97 and 0.96 for native aortic and mitral valve, respectively. Natural language processing tools are useful for automatically structuring and labeling echocardiographic information in Spanish free text. Combined with active learning, the developed models can correctly label new reports prospectively.
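For readers unfamiliar with the two ideas used here, the sketch below shows how an area under the ROC curve can be computed from labels and scores, and how an active learning loop might choose which unlabeled reports to send to a physician. The abstract does not state which query strategy the authors used; uncertainty sampling is assumed here as one common choice, and the function names and all numbers are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def most_uncertain(probabilities, k):
    """Uncertainty sampling: indices of the k predictions closest to 0.5.

    These are the samples the current model is least sure about, so a
    manual label for them is assumed to be the most informative.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical validation labels (1 = disease present) and model scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)

# Hypothetical predicted probabilities for a pool of unlabeled reports.
pool_probabilities = [0.95, 0.52, 0.10, 0.45, 0.70]
to_label = most_uncertain(pool_probabilities, k=2)
```

In a full loop, the reports selected in `to_label` would be labeled by a physician, added to the training corpus, and the model refitted before the next query round. The pairwise AUC shown is quadratic in the number of samples, so a rank-based or library implementation is preferable for corpora of the size reported above.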

Author information

Authors and Affiliations

Servicio de Cardiología, Hospital Universitario de Salamanca-IBSAL, Salamanca, Spain
Pablo Pérez-Sánchez, Víctor Vicente-Palacios, Manuel Barreiro-Pérez, Elena Díaz-Peláez, Antonio Sánchez-Puente, Jesús Sampedro-Gómez, Alberto García-Galindo, P. Ignacio Dorado-Díaz & Pedro L. Sánchez
CIBERCV, Instituto de Salud Carlos III, Madrid, Spain
Manuel Barreiro-Pérez, Elena Díaz-Peláez, Antonio Sánchez-Puente, Jesús Sampedro-Gómez, P. Ignacio Dorado-Díaz & Pedro L. Sánchez
Philips Ibérica, Madrid, Spain
Víctor Vicente-Palacios

Corresponding author

Correspondence to Víctor Vicente-Palacios.

Editor information

Editors and Affiliations

University of Granada, Granada, Spain
Ignacio Rojas
University of Granada, Granada, Spain
Daniel Castillo-Secilla
University of Granada, Granada, Spain
Luis Javier Herrera
University of Granada, Granada, Spain
Héctor Pomares

About this paper

Cite this paper

Pérez-Sánchez, P. et al. (2021). Automatic Classification of Valve Diseases Through Natural Language Processing in Spanish and Active Learning. In: Rojas, I., Castillo-Secilla, D., Herrera, L.J., Pomares, H. (eds) Bioengineering and Biomedical Signal and Image Processing. BIOMESIP 2021. Lecture Notes in Computer Science, vol 12940. Springer, Cham. https://doi.org/10.1007/978-3-030-88163-4_4

DOI: https://doi.org/10.1007/978-3-030-88163-4_4
Published: 09 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88162-7
Online ISBN: 978-3-030-88163-4
eBook Packages: Computer Science, Computer Science (R0)
