Abstract
Divers Alert Network (DAN) has built a database (DB) containing a large amount of dive-related data, collected since 1994 within the scope of the Dive Safety Laboratory (DSL) project. The aim of this study is to analyze the DB using data mining techniques. The main objectives are to cluster divers by their health and demographic information and to reveal significant differences between diver groups.
To eliminate the time effect of age, only divers who participated in a single dive were included in the study. The number of one-dive divers is 874. Before the clustering methods were applied, the data were cleaned to eliminate potential mistakes resulting from inconsistencies, inaccuracies and missing information. TwoStep, Gower distance and K-means clustering methods were applied to the DB to find naturally associated clusters. Conventional statistical analyses were performed to understand the differences between clusters and between male and female divers.
As a result of these analyses, divers were separated into 3 clusters, and the distinguishing variables of these clusters were revealed. Because TwoStep and Gower distances are suitable for categorical variables, age and dive activity years were each divided into 3 categories. For K-means clustering, the original numerical values of these variables were used. The most distinct clusters were formed by TwoStep clustering. Middle-aged male divers without any health problems are in Cluster 1, male and female divers with health problems and a high rate of cigarette smoking are in Cluster 2, and older divers with many dive activity years are in Cluster 3. The search for significant differences in dive-related variables was based on the TwoStep clustering results and was carried out separately for male and female divers.
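As an illustration of how a Gower-type distance treats mixed health and demographic records, the following minimal sketch averages a per-variable dissimilarity: simple matching for categorical fields and a range-scaled absolute difference for numeric fields. The field names, the three example records and the helper `gower_distance` are hypothetical and are not taken from the DSL database or from the study's own implementation.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower-type distance between two records given as dicts.

    numeric_ranges maps each numeric field to its (min, max) over the data;
    every field not listed there is treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            low, high = numeric_ranges[key]
            span = high - low
            # Numeric field: absolute difference scaled by the observed range.
            total += abs(a[key] - b[key]) / span if span else 0.0
        else:
            # Categorical field: 0 if the categories match, 1 otherwise.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


# Hypothetical diver records (assumed fields, not the DSL schema).
divers = [
    {"age": 25, "dive_years": 2, "sex": "F", "smoker": "no"},
    {"age": 45, "dive_years": 12, "sex": "M", "smoker": "no"},
    {"age": 65, "dive_years": 22, "sex": "M", "smoker": "yes"},
]

numeric_ranges = {
    field: (min(d[field] for d in divers), max(d[field] for d in divers))
    for field in ("age", "dive_years")
}

d01 = gower_distance(divers[0], divers[1], numeric_ranges)
d02 = gower_distance(divers[0], divers[2], numeric_ranges)
d00 = gower_distance(divers[0], divers[0], numeric_ranges)
```

The resulting pairwise distances lie between 0 and 1 and could be passed to any clustering routine that accepts a precomputed distance matrix; which routine the study used with these distances is not stated here.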
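The K-means step on the original numerical values can be sketched with Lloyd's iteration for 3 clusters. The (age, dive activity years) pairs below are invented for illustration, the initial centroids are picked by hand, and the variables are left unscaled; none of these choices is claimed to match the study's settings.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(float(v) for v in c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(pt, centroids[k]))
            for pt in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return labels, centroids


# Hypothetical (age, dive activity years) pairs, not DSL data.
points = [
    (22, 1), (25, 3), (24, 2),
    (44, 10), (46, 12), (45, 11),
    (63, 30), (66, 33), (65, 31),
]

labels, centroids = kmeans(points, [points[0], points[3], points[6]])
```

In practice the two variables would usually be standardized first, since age and dive activity years are on different scales and the larger-ranged variable otherwise dominates the Euclidean distance.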
Acknowledgement
This project has been financed by Galatasaray University, Scientific Research Project Commission - Project No. 15.401.001.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ozyigit, T. et al. (2016). Data Mining on Divers Alert Network DSL Database: Classification of Divers. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2016. Lecture Notes in Computer Science, vol 9728. Springer, Cham. https://doi.org/10.1007/978-3-319-41561-1_8
DOI: https://doi.org/10.1007/978-3-319-41561-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41560-4
Online ISBN: 978-3-319-41561-1
eBook Packages: Computer Science, Computer Science (R0)