Abstract
This study adopted an integrated procedure that combines the clustering and classification features of data mining technology to determine the differences between the symptoms recorded in past cases in which patients died from or survived oral cancer. Two data mining tools, decision tree and artificial neural network, were used to analyze historical cases of oral cancer, and their performance was compared with that of logistic regression, a popular statistical analysis tool. Both the decision tree and the artificial neural network models outperformed the traditional statistical model. For clinicians, however, the trees produced by the decision tree models are easier to interpret than the artificial neural network models. Cluster analysis also found that stage 4 patients who have the following four characteristics have an extremely low survival rate: pN is N2b, level of RLNM is level I-III, AJCC-T is T4, and cell mutation status (G) is moderate.
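The subgroup finding in the abstract can be illustrated with a minimal sketch that compares the survival rate of patients matching the four-characteristic profile against everyone else. This is not the study's clustering procedure, only a plain filter applied afterwards; the records are hand-made placeholders rather than data from the study, and the field names (`stage`, `pN`, `RLNM`, `AJCC_T`, `G`, `survived`) and helper functions are assumptions introduced for illustration.

```python
# Illustrative only: the records below are invented placeholders, NOT study data.
# Field names and helpers are hypothetical; the study's actual schema is not given.

def make(stage, pn, rlnm, ajcc_t, g, survived):
    """Build one hypothetical patient record."""
    return {"stage": stage, "pN": pn, "RLNM": rlnm,
            "AJCC_T": ajcc_t, "G": g, "survived": survived}

records = [
    make(4, "N2b", "I-III", "T4", "moderate", False),
    make(4, "N2b", "I-III", "T4", "moderate", False),
    make(4, "N2b", "I-III", "T4", "moderate", True),
    make(4, "N2b", "I-III", "T4", "moderate", False),
    make(4, "N1", "I-III", "T4", "moderate", True),
    make(3, "N0", "none", "T2", "well", True),
    make(4, "N2b", "I-III", "T3", "poor", False),
    make(2, "N0", "none", "T1", "well", True),
]

# The profile named in the abstract: stage 4 plus the four characteristics.
PROFILE = {"stage": 4, "pN": "N2b", "RLNM": "I-III",
           "AJCC_T": "T4", "G": "moderate"}

def matches(record, profile):
    """True when the record agrees with the profile on every listed field."""
    return all(record.get(key) == value for key, value in profile.items())

def survival_rate(rows):
    """Fraction of rows with survived == True, or None for an empty group."""
    rows = list(rows)
    if not rows:
        return None
    return sum(1 for r in rows if r["survived"]) / len(rows)

in_profile = [r for r in records if matches(r, PROFILE)]
others = [r for r in records if not matches(r, PROFILE)]

print("profile group:", len(in_profile), "patients, survival rate", survival_rate(in_profile))
print("other patients:", len(others), "patients, survival rate", survival_rate(others))
```

With real data, the same comparison would be run on the clusters produced by the clustering step, and the gap between the two rates would need a significance check before any clinical reading.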
Additional information
This article is part of the Topical Collection on Transactional Processing Systems.
Wei-Fan Chiang holds a Ph.D., Chi-Mei Medical Center.
Shyun-Yeu Liu holds a Ph.D., Chi-Mei Medical Center.
Jinsheng Roan holds a Ph.D., National Chung Cheng University.
Chun-Nan Lin holds a Ph.D., Shu-Te University.
About this article
Cite this article
Tseng, WT., Chiang, WF., Liu, SY. et al. The Application of Data Mining Techniques to Oral Cancer Prognosis. J Med Syst 39, 59 (2015). https://doi.org/10.1007/s10916-015-0241-3
DOI: https://doi.org/10.1007/s10916-015-0241-3