Abstract
Electronic health records (EHRs) contain important temporal information about patients and disease progression. However, mining temporal representations from discrete EHR data (e.g., diagnosis, medication, or procedure codes) for use in standard machine learning is challenging. We propose a transitive sequential pattern mining approach (tSPM) to address the temporal irregularities involved in recording discrete records in EHRs. We perform experiments across multiple diseases to compare the classification performance of traditional sequential pattern mining (SPM) and the proposed tSPM algorithms in predicting the “true” diagnosis. We demonstrate that the transitive approach is superior to traditional SPM in mining temporal representations for diagnosis prediction.
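The abstract does not spell out how the transitive sequences are built, so the following is only an illustrative sketch under a stated assumption: that "transitive" means pairing every record with every later record in a patient's history, not only with the immediately following one. The function names, the use of first occurrences only, and the consecutive-pair baseline are hypothetical simplifications for illustration; the baseline is not a full SPM implementation, and none of this should be read as the authors' algorithm.

```python
from itertools import combinations


def first_occurrences(records):
    """Return (code, time) for each code's earliest record, ordered by time.

    `records` is an iterable of (time, code) tuples for one patient.
    Keeping only first occurrences is an assumption made for this sketch.
    """
    first = {}
    for time, code in records:
        if code not in first or time < first[code]:
            first[code] = time
    return sorted(first.items(), key=lambda item: (item[1], item[0]))


def consecutive_pairs(records):
    """Simplified baseline: pair each code only with the next distinct code."""
    ordered = first_occurrences(records)
    return {(a, b) for (a, ta), (b, tb) in zip(ordered, ordered[1:]) if ta < tb}


def transitive_pairs(records):
    """Assumed transitive variant: pair each code with every later code."""
    ordered = first_occurrences(records)
    return {(a, b) for (a, ta), (b, tb) in combinations(ordered, 2) if ta < tb}


# Hypothetical patient history: (day, code)
patient = [(1, "A"), (3, "B"), (7, "C"), (9, "A")]
print(sorted(consecutive_pairs(patient)))
print(sorted(transitive_pairs(patient)))
```

Under this assumption, the transitive variant also records that A preceded C even though B was recorded in between, which is one way irregular recording gaps could be tolerated. Each resulting pair could then serve as a binary feature for a downstream classifier.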
Acknowledgements
The work in this paper was supported by NIH grant R01-HG009174. The content of the paper is solely the responsibility of the authors and does not necessarily represent the official views of NIH.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Estiri, H., Vasey, S., Murphy, S.N. (2020). Transitive Sequential Pattern Mining for Discrete Clinical Data. In: Michalowski, M., Moskovitch, R. (eds) Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, vol 12299. Springer, Cham. https://doi.org/10.1007/978-3-030-59137-3_37
DOI: https://doi.org/10.1007/978-3-030-59137-3_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59136-6
Online ISBN: 978-3-030-59137-3
eBook Packages: Computer Science, Computer Science (R0)