Automated disease diagnosis and precaution recommender system using supervised machine learning

Multimedia Tools and Applications

Abstract

Like many other fields, medicine has undergone extensive automation over the past decade. The growing volume and complexity of healthcare data have led to a surge in artificial intelligence applications. Despite this increased automation, such applications still lack the accuracy and efficiency that healthcare problems demand. To address this issue, this study presents an automated healthcare system that can stand in for a doctor at the initial stage of diagnosis and save time by recommending the necessary precautions. The proposed approach comprises two modules. Module-1 trains machine learning models on a dataset of diseases and their corresponding symptoms and precautions, with preprocessing and feature extraction performed as prerequisite steps. Several algorithms are applied to the disease dataset in Module-1, including support vector machine, random forest, extra trees classifier, logistic regression, multinomial naive Bayes, and decision tree. Module-2 interacts with the user (patient), who describes the symptoms of the illness through a microphone. The voice data are converted to text using the Google speech recognizer. The transcribed text is then passed to the trained model, which predicts the disease and recommends the corresponding precautions. The proposed approach achieves an accuracy of 99.9% during real-time evaluation.
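To make the two-module flow concrete, the following is a minimal sketch and not the authors' implementation. It trains one of the listed algorithms, multinomial naive Bayes, on bag-of-words counts and then looks up precautions for the predicted disease. The training rows, the precaution strings, and the names `TRAIN`, `PRECAUTIONS`, `tokenize`, `train`, `predict`, and `diagnose` are all hypothetical and chosen for illustration only. The speech-to-text step is assumed to have already produced a plain string.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data (symptom description, disease); not the paper's dataset.
TRAIN = [
    ("itching skin rash nodal skin eruptions", "Fungal infection"),
    ("skin rash itching dischromic patches", "Fungal infection"),
    ("continuous sneezing shivering chills watering from eyes", "Allergy"),
    ("sneezing chills watering from eyes", "Allergy"),
    ("stomach pain acidity vomiting cough chest pain", "GERD"),
    ("acidity ulcers on tongue vomiting chest pain", "GERD"),
]

# Illustrative placeholder precautions; not medical advice and not from the paper.
PRECAUTIONS = {
    "Fungal infection": ["keep the affected area dry", "consult a doctor"],
    "Allergy": ["avoid known triggers", "consult a doctor"],
    "GERD": ["avoid lying down right after eating", "consult a doctor"],
}


def tokenize(text):
    """Preprocessing: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(samples):
    """Module-1 (sketch): collect per-class document and word counts."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Multinomial naive Bayes with add-one smoothing, scored in log space."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    # Words never seen in training carry no information and are skipped.
    tokens = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):  # sorted so ties resolve deterministically
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for tok in tokens:
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def diagnose(model, transcript):
    """Module-2 (sketch): predict a disease from text, then look up precautions."""
    disease = predict(model, transcript)
    return disease, PRECAUTIONS.get(disease, [])


MODEL = train(TRAIN)
# The transcript stands in for the output of a speech recognizer.
print(diagnose(MODEL, "I have itching and a skin rash"))
```

A toy corpus of this size says nothing about real-world accuracy. The sketch only shows how a transcript flows from preprocessing to prediction to a precaution lookup.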

Funding

This research receives no external funding.

Author information

Corresponding authors

Correspondence to Sadia Din or Imran Ashraf.

Ethics declarations

Conflict of Interests

The authors declare that there is no conflict of interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rustam, F., Imtiaz, Z., Mehmood, A. et al. Automated disease diagnosis and precaution recommender system using supervised machine learning. Multimed Tools Appl 81, 31929–31952 (2022). https://doi.org/10.1007/s11042-022-12897-x

  • DOI: https://doi.org/10.1007/s11042-022-12897-x
