Abstract
Hematopoietic cancer is a malignant transformation of immune system cells. It usually arises in hematopoietic organs such as the bone marrow and lymph nodes, and it is a serious disease that breaks down the immune system through its characteristic mobility. Hematopoietic cancer is characterized by the cells that are expressed, which are usually difficult to detect during hematopoiesis. For this reason, we focused on five subtypes of hematopoietic cancer and studied their classification using machine learning algorithms with both contextual and non-contextual approaches. First, we applied principal component analysis (PCA) to extract features suited to building a subtype classification model. We then built models using four machine learning classifiers (support vector machine, k-nearest neighbor, random forest, and neural network) together with the synthetic minority oversampling technique (SMOTE). Most classifiers performed better when oversampling was applied, and the best result, a classification performance of 95.24%, was obtained by the random forest with oversampling.
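The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority-class neighbors. This is not the authors' implementation. The function name `smote`, the neighbor count `k`, the seed, and the toy two-dimensional data (standing in for PCA scores) are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class feature vectors.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbors than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first (Euclidean distance).
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbor = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy data (made up): two feature values per sample, e.g. two PCA scores.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 10  # hypothetical size of the largest class

# Generate enough synthetic samples to match the majority class size.
new_points = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority))  # 10
```

In a pipeline like the one described, oversampling of this kind would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation data.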
Acknowledgement
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (No. 2019K2A9A2A06020672 and No. 2020R1A2B5B02001717).
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Park, K.H., Pham, V.H., Davagdorj, K., Munkhdalai, L., Ryu, K.H. (2021). A Subtype Classification of Hematopoietic Cancer Using Machine Learning Approach. In: Hong, TP., Wojtkiewicz, K., Chawuthai, R., Sitek, P. (eds) Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2021. Communications in Computer and Information Science, vol 1371. Springer, Singapore. https://doi.org/10.1007/978-981-16-1685-3_10
DOI: https://doi.org/10.1007/978-981-16-1685-3_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1684-6
Online ISBN: 978-981-16-1685-3
eBook Packages: Computer Science, Computer Science (R0)