Abstract
This paper proposes a method for constructing a classifier for discharge summaries. First, morphological analysis and correspondence analysis generate a term matrix from the text data. Then, machine learning methods are applied to the term matrix. Several machine learning methods were compared using discharge summaries stored in a hospital information system. The experimental results show that random forest is the best classifier, compared with deep learning, SVM, and decision tree.
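As an illustration of the first step only, the following minimal sketch builds a document-by-term count matrix from text that has already been split into tokens. It is not the authors' implementation: the function name `build_term_matrix` and the sample tokens are hypothetical, the token lists stand in for the output of a morphological analyzer, and the correspondence analysis step is omitted.

```python
from collections import Counter


def build_term_matrix(tokenized_docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    # Fix a stable column order by sorting the set of all observed terms.
    vocabulary = sorted({term for doc in tokenized_docs for term in doc})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = []
    for doc in tokenized_docs:
        row = [0] * len(vocabulary)
        for term, count in Counter(doc).items():
            row[column[term]] = count
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical, already-tokenized summaries (assumption: a real pipeline
# would obtain these tokens from a morphological analyzer).
docs = [
    ["chest", "pain", "admission", "pain"],
    ["fever", "cough", "admission"],
]
vocabulary, matrix = build_term_matrix(docs)
```

Each row of `matrix` can then serve as the feature vector of one summary for whichever classifier is being compared.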
This research is supported by Grant-in-Aid for Scientific Research (B) 18H03289 from the Japan Society for the Promotion of Science (JSPS).
Notes
1. Outpatient clinics are still based on the action-based payment system, even in large hospitals.
2. The method can also generate \(p\)-dimensional (\(p \ge 3\)) coordinates. However, higher-dimensional coordinates did not give better performance in the experiments below.
3. Darch is no longer distributed as an R package; see the GitHub repository: https://github.com/maddin79/darch.
4. Two-fold cross-validation was selected because its estimator gives the lowest estimate of parameters such as accuracy, and the estimation bias is minimized.
5. DPC codes form a three-level hierarchical system, and each DPC code is defined as a tree. The first level denotes the type of disease, the second level gives the primary selected therapy, and the third level shows the additional therapy. Thus, in the tables, characteristics of codes are used to represent similarities.
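The 2-fold cross-validation mentioned in the notes can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the paper: the helper names `two_fold_accuracy`, `fit_majority`, and `predict_majority` are hypothetical, and the majority-label classifier is only a placeholder for a real model such as random forest.

```python
import random


def two_fold_accuracy(features, labels, fit, predict, seed=0):
    """Estimate accuracy by 2-fold cross-validation.

    The data are shuffled once and split into two halves; each half is
    used once for testing while the other half is used for training.
    """
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    half = len(indices) // 2
    folds = [indices[:half], indices[half:]]
    scores = []
    for k in (0, 1):
        test_idx, train_idx = folds[k], folds[1 - k]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i]
                      for i in test_idx)
        scores.append(correct / len(test_idx))
    # Average the two fold accuracies.
    return sum(scores) / len(scores)


# Placeholder classifier (assumption): always predicts the most frequent
# training label, ignoring the features.
def fit_majority(train_x, train_y):
    return max(sorted(set(train_y)), key=train_y.count)


def predict_majority(model, x):
    return model


# Hypothetical toy data: eight samples that all share one label.
features = [[0], [1], [2], [3], [4], [5], [6], [7]]
labels = ["a"] * 8
accuracy = two_fold_accuracy(features, labels, fit_majority, predict_majority)
```

Passing `fit` and `predict` as arguments keeps the splitting logic independent of the classifier, so the same routine could be reused for each method under comparison.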
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tsumoto, S., Kimura, T., Hirano, S. (2021). Mining in Discharge Summaries. In: Yada, K., et al. Advances in Artificial Intelligence. JSAI 2020. Advances in Intelligent Systems and Computing, vol 1357. Springer, Cham. https://doi.org/10.1007/978-3-030-73113-7_6
DOI: https://doi.org/10.1007/978-3-030-73113-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73112-0
Online ISBN: 978-3-030-73113-7
eBook Packages: Intelligent Technologies and Robotics (R0)