Abstract
Value-based payment is becoming common in healthcare. In rehabilitation medicine, payment for medical services increasingly depends on an outcome computed from the hospitalization period and a dependency score, the Functional Independence Measure (FIM). The therapies that maximize the outcome differ with each patient’s age, sex, disease, handicap, FIM score and therapies. Inexperienced hospitals have difficulty improving the outcome, so there is a need to maximize it by optimizing therapies. We developed a rehabilitation XAI system that predicts the outcome together with optimal therapies. The system stacks medical records into vectors and predicts the outcome with optimal therapies using machine learning based on vector distance, which can explain the basis of a prediction in the same way that doctors suggest therapies to patients based on similar past cases. The interface displays both the optimal therapies and the predicted outcome for each patient. We used data from multiple hospitals to evaluate the adaptability of the system. With data from one hospital, the pattern achieving high outcome, which is the most important because it is used to suggest optimal therapies, accounted for 31.1% of the actual records, while its precision and recall were 64.5% and 73.4%. With data from another hospital, they were 64.4% and 66.1% against an actual proportion of 35.7%. With data from both hospitals, they were 63.6% and 71.0% against an actual proportion of 33.3%. The system thus performed similarly across the two hospitals. The correlation coefficient between actual and predicted outcome was 0.681 for the records of 203 patients. We also compared the outcome predictions of our XAI with those of humans: the average outcomes of the top 70% of patients, as predicted at hospitalization by our XAI and by humans, were 43.0 and 42.4, respectively. Our XAI predicted the outcome more accurately than humans.
1 Background and Purpose
Value-based programs [9] and payment [10] are becoming common among healthcare insurers and governmental institutions [11]. In rehabilitation medicine, payment for medical services increasingly depends on an outcome determined by the combination of the hospitalization period and the gain in a dependency score, the Functional Independence Measure (FIM) [11], which measures how dependent a patient is when performing a given task. The more patients complete rehabilitation with a higher FIM gain in a shorter period, the higher the outcome and the payment a hospital obtains. The outcome therefore affects the revenue of rehabilitation hospitals.
Rehabilitation therapies consist of physical, occupational and speech therapies (PT, OT and ST). The combination that maximizes the outcome varies with many parameters, such as each patient’s age, sex, disease, handicap, time-series FIM score and therapies. Inexperienced hospitals, therapists and doctors do not know the optimal combination well and have difficulty improving the outcome. There is therefore a need to maximize the outcome by predicting the optimal combination of rehabilitation therapies.
We have developed and improved a rehabilitation explainable AI (XAI) system that predicts the outcome, obtained from the hospitalization period and FIM gain, together with the optimal combination of rehabilitation therapies [8, 17]. The system stacks actual time-series medical records into vectors and predicts the outcome pattern with optimal therapies for each patient using machine learning based on vector distances, which can explain the basis of a prediction in the same way that doctors suggest therapies to patients based on similar past cases. The interface displays the combination of therapies that maximizes the outcome and also the predicted outcome, calculated from FIM gain and hospitalization period, for each patient.
In this research, we added new data from another hospital and evaluated how well our AI system adapts to multiple hospitals. We also compared the accuracy of outcome prediction between our XAI and humans, using actual records from a customer hospital.
2 Related Works
Medical diagnosis using correlation [1] is the most popular technique. If the spread of the data is small and the correlation coefficient is large, correlation-based diagnosis generally has high accuracy. In rehabilitation medicine, however, the spread of the outcome against each parameter is large, and the correlation coefficient between the outcome and many parameters often becomes small because of large noise: humans have emotions and do not respond quantitatively as machines do.
Medical diagnosis using a Bayesian network [2] is also well known. If each conditional branch node has a conditional probability table defined under optimal conditions, diagnosis using a Bayesian network has high accuracy. In rehabilitation, however, the outcome differs widely with a difference of only a few percent in the allocation of therapies among PT, OT and ST, so it is often difficult to find the optimal conditions.
In rehabilitation medicine, correlation-based techniques to predict the FIM score [3, 4] or FIM gain [5,6,7] have been reported. For the prediction of FIM gain, a correlation coefficient of 0.653 was reported [5]. However, we have not found any report on how to accurately predict the outcome obtained by dividing FIM gain by the hospitalization period. Predicting the rehabilitation outcome is challenging because it requires accurate prediction of both FIM gain and the hospitalization period, which fluctuate greatly with the patient’s motivation, the family’s emotions, the preparation of the home and the hospital’s policy, even when the patient’s condition is the same.
3 Proposed Rehabilitation XAI System
3.1 System Configuration
Figure 1 shows the configuration of our proposed rehabilitation XAI system, which predicts the outcome together with optimal therapies. The system works as a cloud service of the SaaS (Software as a Service) type [17], together with other services such as NaaS (Network as a Service) [16]. Users send electronic medical records to the XAI application in the cloud and then receive the predicted outcomes with optimal therapies.
The XAI system includes two programs: a learning data generator and a judgement program.
The learning data generator creates learning data, stacked as vectors, from archives of past electronic medical records composed of patients’ personal attributes, disease, handicap, time-series FIM scores and therapies. It also classifies the vectors into multiple patterns depending on FIM gain per week.
The judgement program recognizes the outcome pattern of each patient by machine learning on the learning data. It predicts the outcome with optimal therapies, FIM gain, FIM score and hospitalization period, and it also shows statistical information about similar cases and patients as the basis of the prediction. It uses the k-nearest neighbor (K-NN) algorithm based on vector distance, which can explain the basis of a prediction in the same way that doctors suggest probable therapies to patients based on similar past cases.
3.2 Medical Record Data Used for Learning
In collaboration with two rehabilitation hospitals, A [12] and B [13], we analyzed actual electronic medical records covering eighteen thousand patients between 2006 and 2018. Our system classified diseases into eight categories (stroke, heart, kidney, diabetes, cancer, dementia, bone fracture, depression) and handicaps into five categories (physical, speaking, occupation, cognition, higher brain dysfunction).
Statistical analysis of the records shows that, as the initial FIM score or age becomes larger, the highly recovered patient group receives more PT/OT and less ST. However, too much PT/OT or too little ST leads to a worse outcome. The type of disease also influences the outcome. For stroke, highly recovered patients tend to receive more ST. For heart disease or cancer, on the other hand, highly recovered patients tend to receive more PT/OT.
3.3 Learning Data Generation
The learning data generator creates learning data stacked as vectors composed of sex, age, disease, handicap, FIM score (motor, cognition and speed) and therapies (the number of PT, OT and ST), one for each combination of patient ID and hospitalized day. FIM scores and therapies are smoothed day by day between each pair of days on which FIM scores were measured. One row becomes one vector (see Fig. 2).
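The day-by-day smoothing can be illustrated with a minimal sketch. It assumes linear interpolation between two measurement days, which the text does not specify, and the function name `interpolate_daily` is hypothetical.

```python
def interpolate_daily(measurements):
    """Fill in a FIM score for every hospitalized day.

    measurements: list of (day, fim_score) pairs sorted by strictly increasing day.
    Returns {day: score}. Linear interpolation is an assumption of this sketch.
    """
    daily = {}
    for (d0, s0), (d1, s1) in zip(measurements, measurements[1:]):
        for day in range(d0, d1):
            # Move proportionally from s0 toward s1 between the two measurement days.
            daily[day] = s0 + (s1 - s0) * (day - d0) / (d1 - d0)
    last_day, last_score = measurements[-1]
    daily[last_day] = last_score
    return daily


# Example: FIM measured on day 0 and day 4 (illustrative values only).
scores = interpolate_daily([(0, 50), (4, 58)])
print(scores)
```

Each resulting (patient ID, day) pair would then supply the FIM part of one row, that is, one learning vector.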
Each vector is classified into one of three patterns depending on FIM gain per week. Classifying patterns by FIM gain per week enables direct prediction of the outcome, which is obtained by dividing FIM gain by the hospitalization period. The thresholds for the classification were determined from the trend line between hospitalization period and average FIM gain shown in Fig. 3. In the first ten weeks, the average FIM gain per week was 2. We therefore used a threshold of 2 to separate the first pattern from the others.
An FIM gain of more than two per week is defined as pattern 1. An FIM gain of more than zero and at most two per week is defined as pattern 2. An FIM gain of zero or less per week is defined as pattern 3. Pattern 1, which is the most important because it is used to suggest optimal therapies, accounts for 31.1%, 35.7% and 33.3% of the records in hospital A, hospital B and both hospitals combined, respectively (see Table 1).
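The three-way classification can be written down directly from the thresholds above. This is only a sketch; the function names are hypothetical and the sample gains are illustrative, not data from the paper.

```python
from collections import Counter


def classify_pattern(fim_gain_per_week):
    """Map FIM gain per week to pattern 1, 2 or 3 using the thresholds 2 and 0."""
    if fim_gain_per_week > 2:
        return 1
    if fim_gain_per_week > 0:
        return 2
    return 3


def pattern_shares(gains):
    """Return the share of each pattern among a list of weekly FIM gains."""
    counts = Counter(classify_pattern(g) for g in gains)
    return {pattern: counts[pattern] / len(gains) for pattern in (1, 2, 3)}


# Illustrative gains only.
print(pattern_shares([5.0, 2.0, 0.5, 0.0, -1.0, 3.1]))
```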
3.4 Explainable Pattern Recognition Using K-NN with Tuned Weight
The judgement program recognizes the outcome pattern of each patient using the K-NN algorithm based on Euclidean vector distance, with the weights of some features tuned in specific ranges. The program extracts the 700–1500 most similar vectors from all learned vectors, using the Euclidean distance between the patient’s vector and every learned vector, where each vector is composed of sex, age, disease, handicap, FIM score and therapies. It then calculates the proportions of patterns 1–3 among the extracted vectors and recognizes, as the expected pattern, the one whose proportion deviates significantly from its average proportion in the entire learned data. Finally, it calculates the predicted values of outcome, FIM gain, FIM score and hospitalization period from the extracted similar cases of the recognized pattern (see Fig. 4).
The K-NN algorithm based on Euclidean vector distance can explain the basis of a prediction in the same way that doctors suggest probable therapies to patients based on similar past cases or the literature. It is therefore suited to medicine, where the basis of a prediction must be explained to patients and doctors.
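The recognition step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the per-feature `weights` are applied uniformly rather than in specific ranges, and "significantly deviating" is approximated by picking the pattern whose local share exceeds its overall share by the largest margin, with no significance test. All names are hypothetical.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def recognize_pattern(query, learned, weights, k):
    """learned: list of (vector, pattern). Returns (pattern, neighbours).

    The returned neighbours are the similar past cases that serve as the
    explanation of the prediction.
    """
    ranked = sorted(learned, key=lambda item: weighted_distance(query, item[0], weights))
    neighbours = ranked[:k]
    overall = Counter(pattern for _, pattern in learned)
    local = Counter(pattern for _, pattern in neighbours)

    def lift(pattern):
        # How much more frequent the pattern is near the query than in all data.
        return local[pattern] / len(neighbours) - overall[pattern] / len(learned)

    return max(sorted(overall), key=lift), neighbours


# Toy data: two features per vector, patterns 1 to 3 (illustrative only).
learned = [([0, 0], 1), ([0, 1], 1), ([1, 0], 1),
           ([10, 10], 2), ([10, 11], 2), ([11, 10], 3)]
pattern, cases = recognize_pattern([0.2, 0.1], learned, weights=[1, 1], k=3)
print(pattern, cases)
```

Predicted values such as FIM gain could then be summarized from the neighbours that belong to the recognized pattern.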
3.5 Suggestion of Optimal Therapies
Figure 5 shows our method for suggesting optimal therapies. The judgement program extracts the top N (= 00, 300 or 700) most similar vectors from the learned vectors of pattern 1, using the Euclidean distance between the patient’s vector and each learned vector with the therapy elements excluded. The program then creates N new vectors by overwriting the non-therapy part of each extracted vector with the patient’s own values, so that each new vector combines the patient’s attributes with the therapies of a similar past case. The patterns of the N new vectors are recognized, N times in total, using the K-NN algorithm. The vectors recognized as pattern 1 are sorted in descending order of the number of similar cases, and the therapies contained in the top 3 vectors are output as optimal therapies. If no vector is recognized as pattern 1, the vectors recognized as pattern 2 or 3 are used. The vectors recognized as pattern 3 are sorted in ascending order of the number of similar cases.
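A compact sketch of this suggestion procedure is given below. It makes several assumptions: plain (unweighted) Euclidean distance, the same simplified "largest deviation" recognition rule as a stand-in for the significance criterion, and no fallback to patterns 2 or 3. The names `recognize` and `suggest_therapies` and the toy data are hypothetical.

```python
import math
from collections import Counter


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(vector, learned, k):
    """Return (pattern, support): the pattern whose share among the k nearest
    full vectors most exceeds its overall share, and its neighbour count."""
    nearest = sorted(learned, key=lambda row: distance(vector, row[0] + row[1]))[:k]
    overall = Counter(row[2] for row in learned)
    local = Counter(row[2] for row in nearest)
    pattern = max(sorted(overall),
                  key=lambda p: local[p] / len(nearest) - overall[p] / len(learned))
    return pattern, local[pattern]


def suggest_therapies(patient, learned, n, k, top=3):
    """patient: non-therapy features; learned: list of (features, therapies, pattern)."""
    # Step 1: the n pattern-1 cases most similar to the patient, ignoring therapies.
    pattern1 = [row for row in learned if row[2] == 1]
    similar = sorted(pattern1, key=lambda row: distance(patient, row[0]))[:n]
    # Step 2: combine the patient's features with each case's therapies and re-recognize.
    scored = []
    for _, therapies, _ in similar:
        pattern, support = recognize(patient + therapies, learned, k)
        scored.append((pattern, support, therapies))
    # Step 3: keep candidates recognized as pattern 1, most similar cases first.
    hits = sorted((s for s in scored if s[0] == 1), key=lambda s: s[1], reverse=True)
    return [therapies for _, _, therapies in hits[:top]]


# Toy data: features = [age, FIM], therapies = [PT, OT, ST] (illustrative only).
learned = [([70, 40], [3, 3, 1], 1), ([72, 42], [3, 3, 1], 1), ([71, 41], [4, 2, 1], 1),
           ([70, 40], [1, 1, 4], 3), ([72, 41], [1, 1, 5], 3), ([71, 39], [1, 2, 4], 2)]
print(suggest_therapies([71, 40], learned, n=3, k=2))
```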
4 Evaluation
We divided the vectors of past electronic medical records into two groups: 99.9% for learning and 0.1% for evaluation. In addition, for each evaluation, the vectors related to the evaluated vector were eliminated from the learning data. We evaluated the execution time, precision, recall and accuracy of recognizing the pattern or predicting the outcome.
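The split can be sketched as below. The sketch assumes that "related" vectors are those of the same patient, which the text does not state explicitly; the field name `patient_id` and both function names are hypothetical.

```python
import random


def split_for_evaluation(vectors, eval_fraction=0.001, seed=0):
    """Randomly hold out a small fraction (0.1% by default) of vectors for evaluation."""
    rng = random.Random(seed)
    n_eval = max(1, round(len(vectors) * eval_fraction))
    return rng.sample(vectors, n_eval)


def learning_set_for(evaluated, vectors):
    """Drop every vector of the evaluated patient (assumed meaning of 'related')."""
    return [v for v in vectors if v["patient_id"] != evaluated["patient_id"]]


# Toy data: 200 patients with 10 daily vectors each.
vectors = [{"patient_id": i // 10, "day": i % 10} for i in range(2000)]
for evaluated in split_for_evaluation(vectors):
    learning = learning_set_for(evaluated, vectors)
    print(evaluated["patient_id"], len(learning))
```

Removing the patient's other days avoids evaluating a vector against near-duplicates of itself.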
4.1 Comparison of Execution Time Between CPU and GPU
We evaluated the improvement in execution time obtained with a GPU [14]. The results are shown in Fig. 6 and Table 2.
With the GPU, the execution time from input to output improved from 28 s to 8 s, a speedup of about 3.5 times (see Fig. 6). On the CPU [15], the brute-force distance calculation accounted for most of the execution time. The GPU drastically reduced the distance calculation time, leaving sorting as the largest remaining share of the execution time.
The execution time increased in proportion to N, the number of top similar vectors extracted for recognition and prediction. With the top 300 or fewer, the system met the 10 s response time required by our customers.
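The two phases discussed above, brute-force distance calculation followed by selection of the top N, can be sketched on the CPU as follows. Using `heapq.nsmallest` for a partial selection instead of a full sort is a substitution made for this illustration and is not described in the paper; the function name is hypothetical.

```python
import heapq
import math


def top_n_similar(query, learned, n):
    """Return the indices of the n learned vectors closest to the query."""
    # Phase 1: brute-force distance to every learned vector.
    distances = [(math.dist(query, vector), index) for index, vector in enumerate(learned)]
    # Phase 2: keep only the n smallest distances (partial selection, closest first).
    return [index for _, index in heapq.nsmallest(n, distances)]


# Toy vectors (illustrative only).
learned = [[0, 0], [5, 5], [1, 1], [9, 9]]
print(top_n_similar([0, 0], learned, n=2))
```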
4.2 Precision and Recall of Each Pattern
We evaluated the precision and recall of each pattern using the electronic medical records from hospital A [12]. The results are shown in Table 3 and Table 4.
The precision and recall of pattern 1 were larger than those of pattern 2 or 3 (see Table 3). Pattern 1 often includes patients with a single major disease such as stroke, whose outcome pattern is easy to recognize. Patterns 2 and 3, on the other hand, often include patients with multiple major diseases, whose outcome is more difficult to predict. In particular, the outcome of patients who have depression together with another major disease changes drastically with their daily symptoms. We consider this to be the reason for the lower precision and recall of patterns 2 and 3.
Pattern 1, which achieves a high FIM gain per week and is the most important because it is used to predict optimal therapies, accounted for 31.1% of the actual records, while its precision was 64.5% and its recall was 73.4% (see Table 3). Our XAI system could thus correctly extract 73.4% of the pattern 1 cases used as candidates for optimal therapies. Users can increase the proportion of pattern 1 cases, and so enhance their outcome, by preferentially hospitalizing patients who are predicted as pattern 1 under the optimal therapies. Achieving an accuracy close to 100% is challenging because patients have emotions and, unlike machines, do not respond quantitatively.
The average FIM gains per week of patterns 1, 2 and 3 were about 5, 1 and 0, respectively. Because the difference between patterns 2 and 3 was very small, the important task was to separate pattern 1 from the other patterns. The precision and recall obtained after merging patterns 2 and 3 are shown in Table 4. In this case, the accuracy improved from 59.3% to 79.1%. The accuracy thus depends on the number of recognized patterns.
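The metrics used here, and the effect of merging patterns 2 and 3, can be illustrated with a short sketch. The labels below are made-up examples, not results from the paper, and the function names are hypothetical.

```python
def precision_recall(actual, predicted, pattern):
    """Precision and recall of one pattern from paired actual/predicted labels."""
    true_pos = sum(a == pattern and p == pattern for a, p in zip(actual, predicted))
    predicted_n = sum(p == pattern for p in predicted)
    actual_n = sum(a == pattern for a in actual)
    precision = true_pos / predicted_n if predicted_n else 0.0
    recall = true_pos / actual_n if actual_n else 0.0
    return precision, recall


def accuracy(actual, predicted):
    """Share of cases whose predicted pattern equals the actual pattern."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def merge_2_and_3(labels):
    """Collapse patterns 2 and 3 into a single 'not pattern 1' class (labelled 2)."""
    return [1 if label == 1 else 2 for label in labels]


# Made-up labels for illustration only.
actual = [1, 1, 2, 3, 2, 3]
predicted = [1, 2, 3, 2, 2, 1]
print(precision_recall(actual, predicted, pattern=1))
print(accuracy(actual, predicted))
print(accuracy(merge_2_and_3(actual), merge_2_and_3(predicted)))
```

Confusions between patterns 2 and 3 no longer count as errors after merging, which is why accuracy rises when fewer patterns are recognized.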
4.3 Dependency of Precision and Recall on Hospitals
We evaluated how precision and recall depend on the hospital, using data from hospital A [12], hospital B [13] or both. The results are shown in Table 5.
With data from hospital B, the precision and recall of pattern 1 were 64.4% and 66.1% against an actual proportion of 35.7%. With data from both hospitals, they were 63.6% and 71.0% against an actual proportion of 33.3%. Across all combinations of hospitals, including hospital A alone, the recall of pattern 1 ranged from 66.1% to 73.4% and the precision from 63.6% to 64.5%. There was no large difference between single-hospital and mixed data. Our system thus achieved similar performance and adaptability across the two hospitals.
4.4 Dependency of Precision and Recall on Amount of Data
We evaluated how precision and recall depend on the amount of data by deleting the two earliest years, 2006 and 2007, from the records of hospitals A and B. The results are shown in Table 6.
The precision and recall of pattern 1 decreased as the learning data decreased. Gathering a large amount of data is therefore important for achieving high precision and recall.
4.5 Prediction of Outcome
The outcome that determines payment for medical services is calculated by multiplying the standard hospitalization period by the value obtained from dividing FIM gain by the actual hospitalization period. Moreover, hospitals may exclude up to 30% of patients from the population used to calculate the outcome, but only at the time of hospitalization [11]. It is therefore important to predict the outcome accurately at the time of hospitalization.
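The per-patient calculation described above can be written as a one-line function. This is a sketch of the formula as stated in the text; the function name, the parameter names and the example values (including the standard period of 150 days) are hypothetical.

```python
def outcome(fim_gain, actual_days, standard_days):
    """Outcome = standard hospitalization period * (FIM gain / actual period)."""
    if actual_days <= 0:
        raise ValueError("actual hospitalization period must be positive")
    return standard_days * fim_gain / actual_days


# Illustrative values: FIM gain of 20 over 75 days, with a standard period of 150 days.
print(outcome(fim_gain=20, actual_days=75, standard_days=150))  # 40.0
```

A patient who reaches the same FIM gain in half the standard period thus scores twice the gain, which is why both the gain and the length of stay must be predicted.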
We evaluated the correlation coefficient between predicted and actual outcome values using the records of 203 patients who were newly hospitalized at hospital A between May and July 2018. The result is shown in Fig. 7. The correlation coefficient between actual and predicted outcome was 0.681, which is similar in magnitude to the precision and recall of 64.5–73.4% (see Fig. 7).
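For reference, the correlation coefficient can be computed as in the following sketch, which assumes the Pearson coefficient (the text does not name the variant). The sample values are illustrative and unrelated to the 203 patients.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Illustrative predicted and actual outcomes.
predicted = [30.0, 45.0, 50.0, 20.0, 60.0]
actual = [28.0, 40.0, 55.0, 25.0, 52.0]
print(round(pearson(predicted, actual), 3))
```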
Predicting the outcome is more difficult than predicting FIM gain, because the outcome is calculated from both FIM gain and hospitalization period, and both must be predicted accurately. In particular, even when the patient’s condition is the same, the hospitalization period fluctuates greatly with the patient’s motivation, the family’s consensus, the hospital’s policy and the preparation of the home to receive the patient. We enabled direct prediction of the outcome by classifying patterns based on FIM gain per week. Our outcome prediction achieved a correlation coefficient of 0.681, higher than the 0.653 reported for conventional prediction of FIM gain [5].
We also compared the accuracy of outcome prediction between our XAI and humans. The average outcomes of the top 70% of patients, as predicted at hospitalization by our XAI and by humans, were 43.0 and 42.4, respectively (see Table 7). Our XAI predicted the outcome more accurately than humans.
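The comparison can be sketched as follows, under the assumption that each predictor ranks patients by predicted outcome at hospitalization, the top 70% are kept, and the average of their actual outcomes is reported. This reading of the procedure, the function name and the data are assumptions for illustration.

```python
def average_outcome_of_selected(predicted, actual, keep_fraction=0.7):
    """Average actual outcome of the patients ranked highest by predicted outcome."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    keep = order[:round(len(order) * keep_fraction)]
    return sum(actual[i] for i in keep) / len(keep)


# Illustrative data: ten patients with actual outcomes 10..100.
actual = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
good_predictor = list(actual)            # ranks patients perfectly
poor_predictor = [-a for a in actual]    # ranks patients in reverse
print(average_outcome_of_selected(good_predictor, actual))
print(average_outcome_of_selected(poor_predictor, actual))
```

Under this reading, a better predictor selects patients whose actual outcomes are higher on average.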
5 Conclusion
We have developed and improved a rehabilitation XAI system that predicts the outcome, obtained by dividing FIM gain by the hospitalization period, together with the optimal combination of rehabilitation therapies.
Our XAI system works as a cloud service. Users send electronic medical records to the XAI application in the cloud. The records include the patient’s personal attributes together with disease, handicap, therapies and FIM score. Users then receive the predicted outcome with optimal therapies. The interface displays the combination of therapies that maximizes the outcome and also the predicted outcome for each patient.
The system creates learning data by stacking past medical records into vectors. Each vector is classified into one of three patterns depending on FIM gain per week, which enables direct prediction of the outcome determined by dividing FIM gain by the hospitalization period. Our XAI predicts the outcome pattern with optimal therapies for each patient using the K-NN machine learning algorithm based on vector distances, which can explain the basis of a prediction in the same way that doctors suggest therapies to patients based on similar past cases.
We used data from multiple hospitals to evaluate the adaptability of our system. With data from one hospital, the pattern achieving a high FIM gain per week, which is the most important because it is used to suggest the optimal combination of therapies, accounted for 31.1% of the actual medical records, while its precision was 64.5% and its recall was 73.4%. Users can increase the proportion of pattern 1 cases, and so enhance their outcome, by preferentially hospitalizing patients who are predicted as pattern 1 under the suggested therapies.
With data from another hospital, the precision and recall were 64.4% and 66.1% against an actual proportion of 35.7%. With data from both hospitals, they were 63.6% and 71.0% against an actual proportion of 33.3%. Our system achieved similar performance and adaptability across the two hospitals.
We also evaluated the correlation coefficient between predicted and actual outcome values using the records of 203 patients newly obtained from our customer hospital. The correlation coefficient between actual and predicted outcome was 0.681. Predicting the outcome is more difficult than predicting FIM gain, because both FIM gain and the hospitalization period must be predicted accurately. We enabled direct prediction of the outcome by classifying patterns based on FIM gain per week. Our outcome prediction achieved a correlation coefficient of 0.681, higher than the 0.653 reported for conventional prediction of FIM gain [5].
We also compared the accuracy of outcome prediction between our XAI and humans. The average outcomes of the top 70% of patients, as predicted at hospitalization by our XAI and by humans, were 43.0 and 42.4, respectively. Our XAI predicted the outcome more accurately than humans.
As future work, we are improving the accuracy and adaptability of our rehabilitation XAI system using records obtained from more hospitals.
References
1. Mukaka, M.M.: A guide to appropriate use of correlation coefficient in medical research. Malawi Med. J. 24(3), 69–71 (2012). Medical Association of Malawi
2. Nikovski, D.: Constructing Bayesian networks for medical diagnosis from incomplete and partially correct statistics. IEEE Trans. Knowl. Data Eng. 12(4), 509–516 (2000)
3. Sonoda, S., Saitoh, E., Nagai, S., Okuyama, Y., Suzuki, T., Suzuki, M.: Stroke outcome prediction using reciprocal number of initial activities of daily living status. J. Stroke 14(1), 8–11 (2005)
4. Chumney, D., Nollinger, K., Shesko, K., Skop, K., Spencer, M., Newton, R.A.: Ability of functional independence measure to accurately predict functional outcome of stroke-specific population: systematic review. J. Rehabil. Res. Dev. 47(1), 17–29 (2010)
5. Tokunaga, M., et al.: The stratification of motor FIM and cognitive FIM and the creation of four prediction formulas to enable higher prediction accuracy of multiple linear regression analysis with motor FIM gain as the objective variable: an analysis of the Japan Rehabilitation Database. Jpn. J. Compr. Rehabil. Sci. 8, 21–29 (2017). Kaifukuki Rehabilitation Ward Association
6. Tokunaga, M., Mori, Y., Ogata, Y., Tanaka, Y., Uchino, K., Maeda, Y., et al.: Predicting FIM gain in stroke patients by adding median FIM gain stratified by FIM score at hospital admission to the explanatory variables in multiple regression analysis. Jpn. J. Compr. Rehabil. Sci. 7, 13–18 (2016). Kaifukuki Rehabilitation Ward Association
7. Tokunaga, M., Sannomiya, K., Nakashima, Y., Nojiri, S., Tokisato, K., Katsura, K., et al.: Formula for predicting FIM gain and discharge FIM: methods using median values of FIM gain stratified by admission FIM, age, cognitive function, and transfer interval. Jpn. J. Compr. Rehabil. Sci. 6, 6–13 (2015)
8. Isobe, T., Okada, Y.: Medical AI system to assist rehabilitation therapy. In: Perner, P. (ed.) ICDM 2018. LNCS (LNAI), vol. 10933, pp. 266–271. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-95786-9_20
9. Value-based programs of CMS (Centers for Medicare & Medicaid Services). https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/Value-Based-Programs/Value-Based-Programs.html. Accessed 05 Dec 2019
10. Value-based payment of Medicaid. https://www.medicaid.gov/state-resource-center/innovation-accelerator-program/iap-functional-areas/value-based-payment/index.html. Accessed 05 Dec 2019
11. Ministry of Health, Labor and Welfare. http://www.mhlw.go.jp/file/05-Shingikai-12404000-Hokenkyoku-Iryouka/0000169318.pdf. Accessed 05 Dec 2019
12. Hatsudai rehabilitation hospital. http://www.hatsudai-reha.or.jp/. Accessed 05 Dec 2019
13. Funabashi municipal rehabilitation hospital. http://www.funabashi-reha.com/. Accessed 05 Dec 2019
14. Nvidia GPU Tesla P100. http://www.nvidia.com/object/tesla-p100.html. Accessed 05 Dec 2019
15. Intel CPU E5-1620 v3. https://ark.intel.com/products/82763/Intel-Xeon-Processor-E5-1620-v3-10M-Cache-3_50-GHz. Accessed 05 Dec 2019
16. Isobe, T., Tanida, N., Oishi, Y., Yoshida, K.: TCP acceleration technology for cloud computing: algorithm, performance evaluation in real network. In: 2014 International Conference on Advanced Technologies for Communications (ATC 2014), pp. 714–719. IEEE (2014)
17. Hitachi High-Tech Solutions Corporation rehabilitation AI service. https://www.hitachi-hightech.com/hsl/special/cloud/awina/english/about/. Accessed 10 Jan 2020
© 2020 Springer Nature Switzerland AG
Cite this paper
Isobe, T., Okada, Y. (2020). Rehabilitation XAI to Predict Outcome with Optimal Therapies. In: Xu, R., De, W., Zhong, W., Tian, L., Bai, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2020. AIMS 2020. Lecture Notes in Computer Science(), vol 12401. Springer, Cham. https://doi.org/10.1007/978-3-030-59605-7_10
DOI: https://doi.org/10.1007/978-3-030-59605-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59604-0
Online ISBN: 978-3-030-59605-7
eBook Packages: Computer Science (R0)