Abstract
A sequence variant is a data structure that represents partially ordered elements; it can be regarded as a sequence with branches. Comparing sequence variants is an important problem in practical applications. Such a comparison requires detecting the common and uncommon parts of the sequence variants, but no suitable method has been available. In this study, we developed a method for comparing sequence variants by providing appropriate definitions and algorithms. We define the longest common subsequence variant based on the longest common subsequence, and we propose a merged sequence variant for comparison. We also propose algorithms for calculating these sequence variants. As an example, we applied the methods to real data from electronic medical records (EMRs) to determine the diversity of treatment patterns in different hospitals, before presenting the results to medical workers to help them recognize the differences and improve their medical actions. Frequent treatment patterns were extracted from a real treatment order database of EMRs by sequential pattern mining, and the differences between the treatment patterns were calculated and visualized as longest common subsequence variants and merged sequence variants.
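The longest common subsequence variant described above builds on the classic longest common subsequence (LCS). As a minimal sketch of that underlying notion only, the standard dynamic program for two plain sequences without branches might look like the following. This is not the chapter's algorithm for sequence variants; the function name `lcs` and the treatment-order labels are hypothetical and chosen purely for illustration.

```python
from typing import Hashable, List, Sequence


def lcs(a: Sequence[Hashable], b: Sequence[Hashable]) -> List[Hashable]:
    """Return one longest common subsequence of a and b (classic DP)."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table to recover one optimal subsequence.
    result: List[Hashable] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical treatment sequences from two hospitals (illustrative labels).
hospital_a = ["exam", "injection", "surgery", "rehab"]
hospital_b = ["exam", "surgery", "medication", "rehab"]

common = lcs(hospital_a, hospital_b)
print(common)  # ['exam', 'surgery', 'rehab']
```

In this toy case the common part is the shared backbone of the two sequences, and the remaining elements ("injection", "medication") are the uncommon parts that a comparison would need to surface.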
Acknowledgement
This research is supported by Grants-in-Aid for Scientific Research (B) (#20H04192) and Grants-in-Aid for Early-Career Scientists (#21K17746) from the Japan Society for the Promotion of Science.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y., Le, H.H., Matsuo, R., Yamazaki, T., Araki, K., Yokota, H. (2022). Comparison of Sequence Variants and the Application in Electronic Medical Records. In: Strauss, C., Cuzzocrea, A., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2022. Lecture Notes in Computer Science, vol 13427. Springer, Cham. https://doi.org/10.1007/978-3-031-12426-6_10
DOI: https://doi.org/10.1007/978-3-031-12426-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12425-9
Online ISBN: 978-3-031-12426-6
eBook Packages: Computer Science, Computer Science (R0)