Abstract
In remote areas where doctors are scarce, health kiosks are deployed to collect primary health records of patients, such as blood pressure and pulse rate. However, the symptoms in these records are often imprecise due to measurement error and contain missing values for various reasons. Moreover, the medical records contain multivariate symptoms of different data types, and a particular symptom may be the cause of more than one disease. The records collected in health kiosks are not adequate, so imputing missing values by analyzing such a dataset is a challenging task. In this paper, the imprecise medical datasets are fuzzified, and the fuzzy c-means clustering algorithm is applied to group the symptoms into different disease classes. Missing symptom values are then imputed using linear regression models corresponding to each disease, using the fuzzified health-related data of 1000 patients obtained from the kiosk. With the imputed symptom values, new patients are diagnosed into appropriate disease classes with 97% accuracy. The results are verified against ground truth provided by experts.
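The pipeline outlined in the abstract (cluster the records with fuzzy c-means, then fit one linear regression model per cluster to fill in a missing symptom) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fuzzy_c_means` and `impute_by_cluster_regression` are hypothetical, the fuzzification of raw measurements is omitted, only one column is assumed to contain missing values, and each record is assigned to the cluster in which it has the highest membership.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy c-means. Returns (centers, U) where U[i, k] is the
    membership of record i in cluster k and each row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the records.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every record to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def impute_by_cluster_regression(X, target, c=2):
    """Fill NaNs in column `target` with a least-squares linear model fitted
    separately inside each cluster. Assumes (hypothetically) that all other
    columns are complete."""
    X = np.array(X, dtype=float)  # work on a copy
    missing = np.isnan(X[:, target])
    feats = [j for j in range(X.shape[1]) if j != target]
    # Cluster on the complete columns only, then harden the memberships.
    _, U = fuzzy_c_means(X[:, feats], c)
    labels = U.argmax(axis=1)
    for k in range(c):
        train = (labels == k) & ~missing
        fill = (labels == k) & missing
        # Skip clusters with nothing to fill or too few complete records.
        if not fill.any() or train.sum() < len(feats) + 1:
            continue
        A = np.column_stack([X[train][:, feats], np.ones(train.sum())])
        coef, *_ = np.linalg.lstsq(A, X[train, target], rcond=None)
        B = np.column_stack([X[fill][:, feats], np.ones(fill.sum())])
        X[fill, target] = B @ coef
    return X
```

Calling `impute_by_cluster_regression(records, target=j, c=n_classes)` on a numeric array returns a copy in which the missing entries of column `j` are replaced by the per-cluster regression estimates. Records in clusters with too few complete rows are left as NaN, so the caller can decide how to handle them.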
Acknowledgements
This work is supported by the Information Technology Research Academy (ITRA), Government of India, under ITRA-Mobile grant [ITRA/15(59)/Mobile/Remote Health/01].
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Das, S., Sil, J. (2019). Missing Value Imputation in Medical Records for Remote Health Care. In: Mishra, D., Yang, XS., Unal, A. (eds) Data Science and Big Data Analytics. Lecture Notes on Data Engineering and Communications Technologies, vol 16. Springer, Singapore. https://doi.org/10.1007/978-981-10-7641-1_28
DOI: https://doi.org/10.1007/978-981-10-7641-1_28
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7640-4
Online ISBN: 978-981-10-7641-1
eBook Packages: Intelligent Technologies and Robotics (R0)