An Efficient Framework for Prediction in Healthcare Data Using Soft Computing Techniques

Bhat, Veena H.; Rao, Prasanth G.; Krishna, S.; Shenoy, P. Deepa; Venugopal, K. R.; Patnaik, L. M.

doi:10.1007/978-3-642-22720-2_55

Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 192)

Abstract

Healthcare organizations aim to derive valuable insights by applying data mining and soft computing techniques to the vast data stores accumulated over the years. This data, however, may contain missing, incorrect and, most often, incomplete instances that can have a detrimental effect on predictive analytics of healthcare data. Preprocessing this data, specifically imputing missing values, is a challenge for reliable modeling. This work presents a novel preprocessing phase with missing value imputation for both numerical and categorical data. A hybrid combination of Classification and Regression Trees (CART) and Genetic Algorithms is adopted to impute missing continuous values, and Self Organizing Feature Maps (SOFM) are adopted to impute categorical values. Further, Artificial Neural Networks (ANN) are used to validate the improved prediction accuracy after imputation. To evaluate this model, we use the PIMA Indians Diabetes Data set (PIDD) and the Mammographic Mass Data (MMD). The accuracy of the proposed model, which emphasizes the preprocessing phase, is shown to be superior to that of existing techniques. This approach is simple, easy to implement and practically reliable.
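
To make the idea of tree-based imputation of continuous values concrete, the following is a minimal sketch and not the authors' implementation. It grows a small CART-style regression tree (splits chosen to minimise the sum of squared errors) on the rows where the target is observed, then fills each missing target with the mean of the leaf the row falls into. The Genetic Algorithm, SOFM and ANN stages named in the abstract are omitted. The function names, the column names glucose and insulin, and all data values are hypothetical, and the sketch assumes that only the target column has missing entries.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (regression impurity)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target):
    """Return the (feature, threshold) that most reduces SSE, or None."""
    best = None
    best_score = sse([r[target] for r in rows])
    for feature in (k for k in rows[0] if k != target):
        for threshold in sorted({r[feature] for r in rows}):
            left = [r[target] for r in rows if r[feature] <= threshold]
            right = [r[target] for r in rows if r[feature] > threshold]
            if not left or not right:
                continue
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best


def build_tree(rows, target, depth=0, max_depth=3, min_rows=3):
    """Grow a regression tree; a leaf is the mean target of its rows."""
    split = None
    if depth < max_depth and len(rows) >= min_rows:
        split = best_split(rows, target)
    if split is None:
        return mean(r[target] for r in rows)
    feature, threshold = split
    left = [r for r in rows if r[feature] <= threshold]
    right = [r for r in rows if r[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree(left, target, depth + 1, max_depth, min_rows),
        build_tree(right, target, depth + 1, max_depth, min_rows),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree


def impute(rows, target):
    """Return copies of rows with missing (None) targets filled by the tree."""
    known = [r for r in rows if r[target] is not None]
    tree = build_tree(known, target)
    return [
        dict(r, **{target: predict(tree, r)}) if r[target] is None else dict(r)
        for r in rows
    ]


# Hypothetical toy records; the values are made up for illustration only.
rows = [
    {"glucose": 85.0, "insulin": 60.0},
    {"glucose": 90.0, "insulin": 64.0},
    {"glucose": 150.0, "insulin": 180.0},
    {"glucose": 160.0, "insulin": 200.0},
    {"glucose": 88.0, "insulin": None},
    {"glucose": 155.0, "insulin": None},
]
imputed = impute(rows, "insulin")
for record in imputed:
    print(record)
```

On this toy input the tree makes a single split on glucose, so each missing insulin value is replaced by the mean of the observed values on the same side of that split. Observed values are left unchanged.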

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University Visvesvaraya College of Engineering, Bangalore, India
Veena H. Bhat, S. Krishna, P. Deepa Shenoy & K. R. Venugopal
IBS-Bangalore, Bangalore, India
Veena H. Bhat
Bangalore, India
Prasanth G. Rao
Defense Institute of Advanced Technology, Pune, India
L. M. Patnaik

Editor information

Editors and Affiliations

Machine Intelligence Research Labs (MIR Labs), Auburn, 98071-2259, Washington, USA
Ajith Abraham
Departamento de Comunicaciones, Universidad Politécnica de Valencia, 46071, Valencia, Spain
Jaime Lloret Mauri
Avaya Labs Research, Basking Ridge, NJ, USA
John F. Buford
University of Massachusetts, 100 Morrissey Blvd., 02125-3393, Boston, MA, USA
Junichi Suzuki
Rajagiri School of Engineering and Technology, Rajagiri Valley, Kakkanad, 682 039, Kochi, India
Sabu M. Thampi

About this paper

Cite this paper

Bhat, V.H., Rao, P.G., Krishna, S., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M. (2011). An Efficient Framework for Prediction in Healthcare Data Using Soft Computing Techniques. In: Abraham, A., Mauri, J.L., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22720-2_55

DOI: https://doi.org/10.1007/978-3-642-22720-2_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22719-6
Online ISBN: 978-3-642-22720-2
eBook Packages: Computer Science, Computer Science (R0)
