Transitive Sequential Pattern Mining for Discrete Clinical Data

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12299)

Abstract

Electronic health records (EHRs) contain important temporal information about patients and disease progression. However, mining temporal representations from discrete EHR data (e.g., diagnosis, medication, or procedure codes) for use in standard machine learning is challenging. We propose a transitive sequential pattern mining approach (tSPM) to address the temporal irregularities involved in recording discrete records in EHRs. In experiments across multiple diseases, we compare the classification performance of traditional sequential pattern mining (SPM) and the proposed tSPM algorithms in predicting the “true” diagnosis. We demonstrate that the transitive approach is superior to traditional SPM in mining temporal representations for diagnosis prediction.
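The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "transitive": each record is paired with every later record of the same patient, not just its immediate successor. The function names, the `(day, code)` record layout, and the immediate-successor baseline are assumptions made for illustration. The baseline is included only for contrast and is not meant to represent traditional SPM as evaluated in the paper.

```python
from itertools import combinations


def adjacent_pairs(records):
    """Hypothetical baseline: ordered pairs of immediately consecutive codes.

    `records` is an iterable of (day, code) tuples for one patient.
    """
    ordered = [code for _, code in sorted(records)]
    return {(a, b) for a, b in zip(ordered, ordered[1:]) if a != b}


def transitive_pairs(records):
    """Assumed transitive variant: pair each code with every later code."""
    ordered = [code for _, code in sorted(records)]
    # combinations() keeps positional order, so a always precedes b in time.
    return {(a, b) for a, b in combinations(ordered, 2) if a != b}


# One patient with three coded events on days 1, 5, and 9.
patient = [(9, "C"), (1, "A"), (5, "B")]
print(sorted(adjacent_pairs(patient)))
print(sorted(transitive_pairs(patient)))
```

Under this reading, the transitive variant also yields the pair ("A", "C"), which the immediate-successor baseline misses. Pairs of this kind would remain stable even if an unrelated record were entered between two clinically related ones, which is one way to interpret the "temporal irregularities" the abstract mentions.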

Acknowledgements

The work in this paper was supported by NIH grant R01-HG009174. The content of the paper is solely the responsibility of the authors and does not necessarily represent the official views of NIH.

Author information

Corresponding author

Correspondence to Hossein Estiri.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Estiri, H., Vasey, S., Murphy, S.N. (2020). Transitive Sequential Pattern Mining for Discrete Clinical Data. In: Michalowski, M., Moskovitch, R. (eds) Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, vol 12299. Springer, Cham. https://doi.org/10.1007/978-3-030-59137-3_37

  • DOI: https://doi.org/10.1007/978-3-030-59137-3_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59136-6

  • Online ISBN: 978-3-030-59137-3

  • eBook Packages: Computer Science, Computer Science (R0)
