Impact of time series discretization on intensive care burn unit survival classification

  • Regular Paper
  • Progress in Artificial Intelligence

Abstract

In the preprocessing step of a knowledge discovery process, the choice of discretization method can have a marked impact on the performance and accuracy of classification algorithms. In this article, we analyze and compare expert discretization and automatic discretization algorithms. In particular, we study their impact on predicting the survival of patients in intensive care burn units. We assess the quality of the different discretization algorithms by analyzing the number of intervals generated, the number of patterns produced, and the classification performance on a specific clinical problem. Our results show that many of the algorithms underperform expert discretization and that it is necessary to take into account the correlation among continuous features to obtain the best accuracy.
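As a rough illustration of the contrast the abstract draws, the sketch below discretizes one numeric time series in three ways: with fixed cut points of the kind a clinician might supply, and with two simple automatic schemes (equal width and equal frequency). Everything in it is an assumption made for illustration: the pH readings, the expert cut points, and the helper names `discretize`, `equal_width_cuts`, and `equal_frequency_cuts` are hypothetical, and these two automatic schemes are generic textbook examples that may not match the algorithms evaluated in the article.

```python
from bisect import bisect_right


def discretize(values, cut_points):
    """Map each value to the index of its interval, given sorted cut points."""
    return [bisect_right(cut_points, v) for v in values]


def equal_width_cuts(values, k):
    """Return k-1 cut points that split the observed range into k equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so each interval holds roughly len(values)/k points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


# Hypothetical pH readings for one patient (illustrative values only).
ph_series = [7.21, 7.30, 7.36, 7.41, 7.44, 7.48, 7.52, 7.38]

# Hypothetical expert cut points (low / normal / high); not taken from the article.
expert_cuts = [7.35, 7.45]

expert_symbols = discretize(ph_series, expert_cuts)
width_symbols = discretize(ph_series, equal_width_cuts(ph_series, 3))
freq_symbols = discretize(ph_series, equal_frequency_cuts(ph_series, 3))

print("expert:         ", expert_symbols)
print("equal width:    ", width_symbols)
print("equal frequency:", freq_symbols)
```

The symbol sequences produced this way are what a downstream pattern miner or classifier would consume, so a shift in a single cut point can change both the number of intervals in use and the patterns that are found.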

Acknowledgements

This work was partially funded by the Spanish Ministry of Economy and Competitiveness under project TIN2013-45491-R, the European Fund for Regional Development (EFRD), and Instituto de Salud Carlos III (Ref: FIS PI 12/2898).

Author information

Corresponding author

Correspondence to Manuel Campos.

About this article

Cite this article

Casanova, I.J., Campos, M., Juarez, J.M. et al. Impact of time series discretization on intensive care burn unit survival classification. Prog Artif Intell 7, 41–53 (2018). https://doi.org/10.1007/s13748-017-0130-8

  • DOI: https://doi.org/10.1007/s13748-017-0130-8
