Abstract
Like many other professions, the medical field has undergone extensive automation over the past decade. The growing volume and complexity of healthcare data have led to a surge in artificial intelligence applications. Despite this increased automation, such applications still lack the accuracy and efficiency that healthcare problems require. To address this issue, this study presents an automated healthcare system that can effectively substitute for a doctor at the initial stage of diagnosis and save time by recommending the necessary precautions. The proposed approach comprises two modules. Module-1 trains machine learning models on a dataset of diseases with their corresponding symptoms and precautions, with preprocessing and feature extraction performed as prerequisite steps. Several algorithms are applied to the disease dataset in Module-1: support vector machine, random forest, extra trees classifier, logistic regression, multinomial naive Bayes, and decision tree. Module-2 interacts with the user (patient), who describes the symptoms of the illness through a microphone. The voice data are transcribed into text using the Google speech recognizer. The transcribed text is then passed to the trained model to predict the disease and recommend precautions. The proposed approach achieves an accuracy of 99.9% during real-time evaluation.
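The two-module flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the training samples, the `PRECAUTIONS` table, and the names `MultinomialNB`, `tokenize`, and `diagnose` are hypothetical, and the classifier is a small hand-written multinomial naive Bayes (one of the six algorithms listed) over whitespace tokens. The speech-to-text step is assumed to have already produced a transcript string, so no speech recognizer is called here.

```python
import math
from collections import Counter, defaultdict

# Toy, made-up samples (NOT taken from the paper's dataset): symptom text -> disease.
TRAIN = [
    ("itching skin rash nodal skin eruptions", "fungal infection"),
    ("skin rash itching dischromic patches", "fungal infection"),
    ("continuous sneezing shivering chills watering eyes", "allergy"),
    ("sneezing watering eyes chills", "allergy"),
    ("high fever headache vomiting muscle pain chills", "malaria"),
    ("fever sweating headache nausea muscle pain", "malaria"),
]

# Hypothetical placeholder precautions, for illustration only.
PRECAUTIONS = {
    "fungal infection": ["keep the affected area dry", "consult a doctor"],
    "allergy": ["avoid the suspected trigger", "consult a doctor"],
    "malaria": ["seek medical care promptly", "use a mosquito net"],
}


def tokenize(text):
    """Lowercase and split on whitespace (stand-in for real preprocessing)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, samples):
        self.class_counts = Counter()            # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in samples:
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        self.total_docs = sum(self.class_counts.values())
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information and are dropped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / self.total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def diagnose(model, transcript):
    """Module-2 step: map a transcript to a predicted disease and its precautions."""
    disease = model.predict(transcript)
    return disease, PRECAUTIONS.get(disease, [])


# Module-1: train once. Module-2: classify a transcript assumed to come from speech-to-text.
model = MultinomialNB().fit(TRAIN)
disease, precautions = diagnose(model, "I have a headache and high fever with muscle pain")
print(disease, precautions)
```

A real system would replace the toy data with the full symptom dataset, compare the listed classifiers on held-out data, and obtain the transcript from a speech recognizer before calling `diagnose`.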
Funding
This research received no external funding.
Ethics declarations
Conflict of Interests
The authors declare that there is no conflict of interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Rustam, F., Imtiaz, Z., Mehmood, A. et al. Automated disease diagnosis and precaution recommender system using supervised machine learning. Multimed Tools Appl 81, 31929–31952 (2022). https://doi.org/10.1007/s11042-022-12897-x
DOI: https://doi.org/10.1007/s11042-022-12897-x