A Fuzzy-Based Approach to Survival Data Mining

Last, Mark; Halpert, Hezi

doi:10.1007/978-3-319-19683-1_18

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 326)

Abstract

Traditional data mining algorithms assume that all data on a given object becomes available simultaneously (e.g., by accessing the object record in a database). However, certain real-world applications, studied in survival analysis or event history analysis (EHA), monitor specific objects, such as medical patients, over the course of their lifetime. The data streams produced by such applications contain various events related to the monitored objects. When we observe an infinite stream of events, at each point in time (the “cut-off point”) some of the monitored entities are “right-censored”: they have not yet experienced the event of interest, and we do not know when it will occur. In snapshot monitoring, the data stream is observed as a sequence of periodic snapshots. Given each snapshot, we want to estimate, for every monitored object, the probability of a critical event (e.g., patient death or equipment failure) as a function of time. In this research, we use fuzzy class label adjustment so that standard classification algorithms can seamlessly handle a snapshot stream of both censored and non-censored data. The objective is to provide reasonably accurate predictions after observing relatively few snapshots of the data stream and to improve classification performance with the additional information obtained from each incoming snapshot. The proposed fuzzy-based methodology is evaluated on real-world snapshot streams from two different domains of survival analysis.
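To make the notions of a cut-off point, right-censoring, and a fuzzy class label concrete, the following is a minimal Python sketch. It is not the procedure used in the chapter, whose details the abstract does not give. The names `MonitoredObject`, `fuzzy_labels`, and `horizon` are hypothetical, and the membership rule for censored objects (the fraction of already-resolved objects that were still event-free at the same age and later had the event within the horizon) is an illustrative assumption only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MonitoredObject:
    start: float            # time the object entered monitoring
    event: Optional[float]  # time of the critical event; None if it never occurs


def fuzzy_labels(objects: List[MonitoredObject], cutoff: float,
                 horizon: float) -> List[Tuple[float, bool]]:
    """For each object visible at `cutoff`, return (membership, censored).

    `membership` is the degree to which the object belongs to the class
    "event occurs within `horizon` time units of start".
    """
    visible = [o for o in objects if o.start <= cutoff]

    def status(o: MonitoredObject):
        age = cutoff - o.start
        seen = o.event is not None and o.event <= cutoff
        return age, seen

    # Objects whose class is already known at the cut-off point:
    # (time to event, crisp label). Survivors past the horizon get inf.
    resolved = []
    for o in visible:
        age, seen = status(o)
        if seen:
            duration = o.event - o.start
            resolved.append((duration, 1.0 if duration <= horizon else 0.0))
        elif age >= horizon:
            resolved.append((float("inf"), 0.0))

    labels = []
    for o in visible:
        age, seen = status(o)
        if seen:
            duration = o.event - o.start
            labels.append((1.0 if duration <= horizon else 0.0, False))
        elif age >= horizon:
            labels.append((0.0, False))  # survived the whole horizon
        else:
            # Right-censored: outcome unknown at this snapshot.
            # ASSUMPTION: soft label = event rate among resolved peers
            # that were still event-free at this object's current age.
            peers = [lab for dur, lab in resolved if dur > age]
            membership = sum(peers) / len(peers) if peers else 0.5
            labels.append((membership, True))
    return labels
```

Calling `fuzzy_labels` again at a later cut-off shows how a censored object's soft label is replaced by a crisp one once its outcome is observed, which is the sense in which each incoming snapshot adds information.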

Acknowledgments.

This work was supported in part by the General Motors Global Research & Development - India Science Lab.

Author information

Authors and Affiliations

Department of Information Systems Engineering, Ben-Gurion University of the Negev, 84105, Beer-Sheva, Israel
Mark Last & Hezi Halpert

Corresponding author

Correspondence to Mark Last.

Editor information

Editors and Affiliations

Department of Computer Science, Texas State University, San Marcos, Texas, USA
Dan E. Tamir
School of Computing and Information Sciences, Florida International University, Miami, Florida, USA
Naphtali D. Rishe
School of Computing and Information Sciences, The University of South Florida, Tampa, Florida, USA, and Florida International University, Miami, Florida, USA
Abraham Kandel

About this chapter

Cite this chapter

Last, M., Halpert, H. (2015). A Fuzzy-Based Approach to Survival Data Mining. In: Tamir, D., Rishe, N., Kandel, A. (eds) Fifty Years of Fuzzy Logic and its Applications. Studies in Fuzziness and Soft Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-19683-1_18

DOI: https://doi.org/10.1007/978-3-319-19683-1_18
Published: 24 May 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19682-4
Online ISBN: 978-3-319-19683-1
eBook Packages: Engineering, Engineering (R0)
