Abstract
The deep neural network (DNN) has emerged as one of the standard methods for building classification models. The most common issue affecting DNN performance is a class-imbalanced dataset. This research designed two workflows for generating synthetic datasets with the SMOTE algorithm, yielding the SDS-1 and SDS-2 datasets. We further investigated the DNN parameters that give the best performance on those datasets. We used the Indian Liver Patient Dataset (ILPD) from its oldest source, the UCI Machine Learning Repository, with a total of 583 records consisting of 416 positive and 167 negative records. We measured DNN performance using the sensitivity and F-score metrics, in keeping with the medical domain's focus on identifying a particular disease. The experimental results revealed that a DNN model with a learning rate of 1E-1, the TanH activation function, Xavier weight initialization, 40 epochs, and 10 hidden layers delivered the best sensitivity and F-score values, 98.40% and 99.18%, respectively. The results suggest that the workflow for generating a class-balanced dataset improves DNN performance.
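As a rough illustration of the ideas named in the abstract, the sketch below shows the core SMOTE interpolation step and the two evaluation metrics. It is not the authors' SDS-1 or SDS-2 workflow, whose details are not given here. The function names, the toy points, the neighbour count `k`, and the assumption that balancing means matching the majority class size are all hypothetical.

```python
import random


def smote_sample(minority, k=5, n_new=1, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of `base`, excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def sensitivity(tp, fn):
    """Sensitivity (recall): share of actual positives that are detected."""
    return tp / (tp + fn)


def f_score(tp, fp, fn):
    """F-score (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Class counts stated in the abstract: 416 positive, 167 negative records.
n_pos, n_neg = 416, 167
# Assumption: balancing means raising the minority class to the majority size.
n_needed = n_pos - n_neg

# Hypothetical 2-D minority points, used only to exercise the function.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(toy_minority, k=2, n_new=3)
```

Under that balancing assumption, the minority (negative) class would need 249 synthetic records. Each synthetic point lies on the line segment between two existing minority points, which is why SMOTE stays inside the region the minority class already occupies.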
Acknowledgments
The authors wish to thank Universitas YARSI for funding this research (No. 183/INT/UM/WRII/UY/VIII/2016).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Diana, N.E., Ahmad, A.B., Mahardika, Z.P. (2020). Investigating the Optimal Parameterization of Deep Neural Network and Synthetic Data Workflow for Imbalance Liver Disorder Dataset Classification. In: Ghazali, R., Nawi, N., Deris, M., Abawajy, J. (eds) Recent Advances on Soft Computing and Data Mining. SCDM 2020. Advances in Intelligent Systems and Computing, vol 978. Springer, Cham. https://doi.org/10.1007/978-3-030-36056-6_9
DOI: https://doi.org/10.1007/978-3-030-36056-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36055-9
Online ISBN: 978-3-030-36056-6
eBook Packages: Intelligent Technologies and Robotics (R0)