ABSTRACT
Datasets grow daily, and with them the opportunity to extract knowledge from large collections of data. Extracting useful information from a dataset is called data mining, and it is one of the major techniques for obtaining diagnostic results, especially in medical fields such as breast cancer care. Breast cancer is one of the most widespread cancers among women worldwide. Classification is widely used for important tasks in real-life applications across many fields. It can detect subtle similarities and differences that a human analyst may not notice, and can therefore produce more accurate and useful categories. This study compares decision tree classification algorithms on a breast cancer dataset. The algorithms applied are J48, Functional Tree, Random Forest, Alternating Decision Tree (ADTree), Decision Stump, and Best-First Tree. The classifiers are built and evaluated with the Waikato Environment for Knowledge Analysis (WEKA), a software workbench that includes a collection of machine learning algorithms. The dataset comprises 569 breast masses, 357 benign and 212 malignant, described by 32 attributes, and is used to test and demonstrate the differences among the classification algorithms. The results were obtained by reserving a sample of the medical dataset on which the model is not trained and evaluating on that sample. The decision tree classifiers predict breast tumors with a low error rate and high precision, with the best model correctly classifying 97.7% of cases. The Decision Stump model has the lowest accuracy of all, at 88.0% correctly classified instances.
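The evaluation idea in the abstract, training on one part of the data and measuring accuracy on a reserved part the model never sees (commonly called holdout evaluation), can be illustrated with the simplest of the listed classifiers, a decision stump (a one-level decision tree). The following is a minimal pure-Python sketch, not the WEKA implementation used in the study. The toy data generator, the 34% test fraction, and all function names are assumptions made for illustration; the synthetic data does not reproduce the paper's dataset or its accuracy figures.

```python
import random


def fit_stump(rows, labels):
    """Pick the single (feature, threshold) split with the fewest training errors.

    Assumes binary labels 0/1 and at least one feature with two distinct values.
    """
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            # Try both ways of assigning the two classes to the two branches.
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if r[f] <= t else right) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return f, t, left, right


def predict(stump, row):
    """Classify one row with a fitted stump."""
    f, t, left, right = stump
    return left if row[f] <= t else right


def holdout_accuracy(rows, labels, test_fraction=0.34, seed=0):
    """Shuffle, reserve a test sample, train on the rest, score on the reserve."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    stump = fit_stump([rows[i] for i in train_idx],
                      [labels[i] for i in train_idx])
    correct = sum(predict(stump, rows[i]) == labels[i] for i in test_idx)
    return correct / n_test


def make_toy_data(n=200, seed=1):
    """Hypothetical two-feature data: label 0 = benign, label 1 = malignant."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Feature 0 is informative; feature 1 is pure noise.
        rows.append([rng.gauss(10 + 8 * y, 3), rng.gauss(0, 1)])
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    print(f"holdout accuracy: {holdout_accuracy(rows, labels):.3f}")
```

Because a stump can use only one attribute and one threshold, it is a natural lower baseline against which deeper trees and ensembles such as Random Forest are compared.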