A Fuzzy-Based Approach to Survival Data Mining

  • Chapter
Fifty Years of Fuzzy Logic and its Applications

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 326)

Abstract

Traditional data mining algorithms assume that all data on a given object become available simultaneously (e.g., by accessing the object's record in a database). However, certain real-world applications, studied under the name of survival analysis or event history analysis (EHA), monitor specific objects, such as medical patients, over the course of their lifetime. The data streams produced by such applications contain various events related to the monitored objects. When we observe an infinite stream of events, at each point in time (the “cut-off point”) some of the monitored entities are “right-censored”: they have not yet experienced the event of interest, and we do not know when it will occur. In snapshot monitoring, the data stream is observed as a sequence of periodic snapshots. Given each snapshot, we are interested in estimating the probability of a critical event (e.g., patient death or equipment failure) as a function of time for every monitored object. In this research, we use fuzzy class label adjustment so that standard classification algorithms can seamlessly handle a snapshot stream containing both censored and non-censored data. The objective is to provide reasonably accurate predictions after observing relatively few snapshots of the data stream and to improve classification performance with the additional information obtained from each incoming snapshot. The proposed fuzzy-based methodology is evaluated on real-world snapshot streams from two different domains of survival analysis.
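To make the notions of right-censoring and a fuzzy class label concrete, the following is a minimal Python sketch. It is not the chapter's method, which the abstract does not spell out. As an assumption for illustration only, a censored object receives a membership grade in the class "event occurs by the horizon" equal to the conditional event probability taken from a Kaplan-Meier survival curve. The function names (`kaplan_meier`, `survival_at`, `fuzzy_event_label`), the toy data, and the horizon are all hypothetical.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]


def kaplan_meier(durations: List[float], observed: List[bool]) -> Curve:
    """Product-limit estimate: (t, S(t)) at each distinct event time."""
    event_times = sorted({d for d, o in zip(durations, observed) if o})
    surv = 1.0
    curve: Curve = []
    for t in event_times:
        # Objects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Objects whose event was observed exactly at time t.
        events = sum(1 for d, o in zip(durations, observed) if o and d == t)
        surv *= 1.0 - events / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve: Curve, t: float) -> float:
    """Step-function lookup of S(t); S(t) = 1 before the first event."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


def fuzzy_event_label(duration: float, was_observed: bool,
                      horizon: float, curve: Curve) -> float:
    """Membership in the class 'event occurs by the horizon'.

    Assumed rule (illustrative, not from the chapter):
    crisp 0/1 when the outcome at the horizon is known,
    otherwise 1 - S(horizon) / S(censoring time).
    """
    if was_observed:
        return 1.0 if duration <= horizon else 0.0
    if duration >= horizon:
        return 0.0  # known to be event-free up to the horizon
    s_c = survival_at(curve, duration)
    if s_c == 0.0:
        return 1.0
    return 1.0 - survival_at(curve, horizon) / s_c


# Toy snapshot: time on study and whether the event was seen
# before the cut-off point (False means right-censored).
durations = [2, 3, 3, 5, 6, 8]
observed = [True, True, False, True, False, False]
horizon = 6

curve = kaplan_meier(durations, observed)
labels = [fuzzy_event_label(d, o, horizon, curve)
          for d, o in zip(durations, observed)]
print([round(x, 3) for x in labels])
```

Under this assumed rule, objects whose outcome at the horizon is already known keep crisp labels of 0 or 1, while the object censored at time 3 gets a grade strictly between 0 and 1. A classifier that accepts instance weights or soft targets could then be trained on each snapshot using these grades.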

Acknowledgments.

This work was supported in part by the General Motors Global Research & Development - India Science Lab.

Corresponding author

Correspondence to Mark Last.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Last, M., Halpert, H. (2015). A Fuzzy-Based Approach to Survival Data Mining. In: Tamir, D., Rishe, N., Kandel, A. (eds) Fifty Years of Fuzzy Logic and its Applications. Studies in Fuzziness and Soft Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-19683-1_18

  • DOI: https://doi.org/10.1007/978-3-319-19683-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19682-4

  • Online ISBN: 978-3-319-19683-1

  • eBook Packages: Engineering, Engineering (R0)
