
Machine Learning for Biomedical Time Series Classification: From Shapelets to Deep Learning

  • Protocol
Artificial Neural Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 2190)

Abstract

With the biomedical field generating large quantities of time series data, there has been a growing interest in developing and refining machine learning methods that allow its mining and exploitation. Classification is one of the most important and challenging machine learning tasks related to time series. Many biomedical phenomena, such as the brain’s activity or blood pressure, change over time. The objective of this chapter is to provide a gentle introduction to time series classification. In the first part we describe the characteristics of time series data and challenges in its analysis. The second part provides an overview of common machine learning methods used for time series classification. A real-world use case, the early recognition of sepsis, demonstrates the applicability of the methods discussed.

Notes

  1. While we try to use notation consistently, we occasionally diverge slightly from the notation introduced initially to keep equations readable. Where this happens, we say so in the text.

  2. We refer to subsequences that are statistically significantly associated with a class label as shapelets. Note that the term is typically used for subsequences that maximize information gain.
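
The second note contrasts two ways of scoring a candidate subsequence. The sketch below illustrates the difference on invented toy data; it is not the chapter's implementation. It assumes that a series "contains" a candidate when the minimum Euclidean distance over all windows is below a threshold, and it uses a two-sided Fisher exact test as one possible association test. All function names, the toy series, and the threshold are hypothetical.

```python
import math
from collections import Counter

def min_distance(series, candidate):
    """Smallest Euclidean distance between the candidate and any window of the series."""
    m = len(candidate)
    return min(
        math.sqrt(sum((series[i + j] - candidate[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values()) if n else 0.0

def information_gain(distances, labels, threshold):
    """Entropy reduction obtained by splitting the data set at a distance threshold."""
    near = [y for d, y in zip(distances, labels) if d <= threshold]
    far = [y for d, y in zip(distances, labels) if d > threshold]
    n = len(labels)
    return entropy(labels) - len(near) / n * entropy(near) - len(far) / n * entropy(far)

def fisher_p_value(distances, labels, threshold, positive=1):
    """Two-sided Fisher exact p-value for the 2x2 table (near/far) x (positive/other)."""
    n = len(labels)
    n_pos = sum(1 for y in labels if y == positive)
    n_near = sum(1 for d in distances if d <= threshold)
    a = sum(1 for d, y in zip(distances, labels) if d <= threshold and y == positive)

    def prob(k):
        # Hypergeometric probability of k positive series among the near ones.
        return math.comb(n_pos, k) * math.comb(n - n_pos, n_near - k) / math.comb(n, n_near)

    lo, hi = max(0, n_near - (n - n_pos)), min(n_pos, n_near)
    observed = prob(a)
    # Sum all tables at most as likely as the observed one (small tolerance for float ties).
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= observed * (1 + 1e-9))

# Invented toy data: class 1 contains the bump [1, 3, 1], class 0 does not.
series = [
    [0, 0, 1, 3, 1, 0, 0, 0],
    [0, 1, 3, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 3, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 0],
]
labels = [1, 1, 1, 0, 0, 0]
candidate = [1, 3, 1]

distances = [min_distance(s, candidate) for s in series]
gain = information_gain(distances, labels, threshold=1.0)
p = fisher_p_value(distances, labels, threshold=1.0)
print(f"information gain = {gain:.3f} bits, Fisher p-value = {p:.3f}")
```

On this toy set the candidate separates the classes perfectly, so the information gain is maximal, yet with only six series the p-value stays at 0.1. This shows how the two criteria can disagree: information gain ranks candidates, whereas a significance test also accounts for sample size and, in practice, would need a multiple-testing correction over all candidates.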

Acknowledgments

The authors would like to thank Dr. Bastian Rieck, Dr. Damian Roqueiro, and Max Horn for their valuable input and discussion.

Author information

Corresponding author

Correspondence to Karsten Borgwardt.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Bock, C., Moor, M., Jutzeler, C.R., Borgwardt, K. (2021). Machine Learning for Biomedical Time Series Classification: From Shapelets to Deep Learning. In: Cartwright, H. (eds) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_2

  • DOI: https://doi.org/10.1007/978-1-0716-0826-5_2

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0825-8

  • Online ISBN: 978-1-0716-0826-5

  • eBook Packages: Springer Protocols
