Missing Value Imputation in Medical Records for Remote Health Care

  • Conference paper

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 16)

Abstract

In remote areas where doctors are scarce, health kiosks are deployed to collect patients' primary health records, such as blood pressure and pulse rate. However, the symptoms in these records are often imprecise because of measurement error, and values are missing for various reasons. Moreover, the records contain multivariate symptoms of different data types, and a particular symptom may be associated with more than one disease. Because the records collected at health kiosks are limited, imputing missing values from such a dataset is a challenging task. In this paper, the imprecise medical datasets are fuzzified, and the fuzzy c-means clustering algorithm is applied to group the symptoms into disease classes. Missing symptom values are then imputed using a linear regression model for each disease, built from the fuzzified health-related data of 1000 patients obtained from the kiosks. With the imputed symptom values, new patients are assigned to the appropriate disease classes with 97% accuracy. The results are verified against ground truth provided by experts.
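
The pipeline summarized in the abstract (cluster the records with fuzzy c-means, then impute a missing symptom with a per-cluster linear regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the fuzzification step is omitted, the data are synthetic, and the function names (`fuzzy_c_means`, `fit_cluster_regressions`, `impute`), the fuzzifier `m = 2`, and the rule of assigning an incomplete record to the nearest center on its observed features are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=300, tol=1e-8, seed=0):
    """Textbook fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # guard against zero distance
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def fit_cluster_regressions(X, labels, target, c):
    """One least-squares model per cluster: target column ~ other columns."""
    others = [j for j in range(X.shape[1]) if j != target]
    models = {}
    for k in range(c):
        rows = X[labels == k]
        if len(rows) < len(others) + 1:          # too few records to fit
            continue
        A = np.column_stack([rows[:, others], np.ones(len(rows))])
        coef, *_ = np.linalg.lstsq(A, rows[:, target], rcond=None)
        models[k] = coef
    return models

def impute(record, target, centers, models):
    """Assumed rule: pick the nearest center using observed features only."""
    others = [j for j in range(len(record)) if j != target]
    gaps = np.linalg.norm(centers[:, others] - record[others], axis=1)
    k = int(np.argmin(gaps))
    return float(np.append(record[others], 1.0) @ models[k]), k

# Synthetic stand-in for kiosk records (NOT the paper's dataset):
# two groups whose third feature depends linearly on the first two.
rng = np.random.default_rng(42)
a = rng.normal(0.0, 0.5, size=(100, 2))
b = rng.normal(10.0, 0.5, size=(100, 2))
group_a = np.column_stack([a, 2 * a[:, 0] + a[:, 1]])
group_b = np.column_stack([b, -b[:, 0] + 3 * b[:, 1] + 5])
X = np.vstack([group_a, group_b])

centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)                        # harden memberships
models = fit_cluster_regressions(X, labels, target=2, c=2)

record = np.array([10.2, 9.7, np.nan])           # third feature is missing
value, cluster = impute(record, 2, centers, models)
print(f"imputed value: {value:.3f} (cluster {cluster})")
```

Hardening the memberships with `argmax` before fitting is a simplification; a membership-weighted regression would use the fuzzy degrees more fully.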

Acknowledgements

This work is supported by the Information Technology Research Academy (ITRA), Government of India, under ITRA-Mobile grant [ITRA/15(59)/Mobile/Remote Health/01].

Author information

Corresponding author

Correspondence to Sayan Das.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Das, S., Sil, J. (2019). Missing Value Imputation in Medical Records for Remote Health Care. In: Mishra, D., Yang, XS., Unal, A. (eds) Data Science and Big Data Analytics. Lecture Notes on Data Engineering and Communications Technologies, vol 16. Springer, Singapore. https://doi.org/10.1007/978-981-10-7641-1_28
