Abstract
Health-related social media data, particularly patients’ opinions about drugs, have recently become a source of knowledge for research on adverse reactions, the allergies that patients experience, and drug efficacy and safety. We develop a method for analyzing the efficiency of medicines and their condition-specific prescription from patient reviews in the Drug Review Dataset (Drugs.com). Our approach relies on Natural Language Processing (NLP) and a word-embedding vectorization method that preserves semantics. We conducted experiments with two sampling techniques: random sampling and balanced random sampling. We applied several statistical models: Logistic Regression, Decision Tree, Random Forests, K-Nearest Neighbors (KNN), and neural network models (simple perceptron, multilayer perceptron, and convolutional neural network). We varied the size of the training and test sets to study the effect of the sampling techniques on model efficiency. The results show that the models proposed in this paper (KNN, Embedding-100, and CNN-Maxpooling) outperform models proposed by several other researchers. In particular, Embedding-100 achieved better training and test accuracy. We also conclude that several factors influence the effectiveness of the models, mainly the text preprocessing method, the size and type of sampling, the text vectorization method, and the machine learning model.
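To illustrate the difference between the two sampling techniques named in the abstract, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy reviews, and the "positive"/"negative" labels are all hypothetical, and it assumes that balanced random sampling means drawing the same number of reviews from each class.

```python
import random
from collections import defaultdict


def random_sample(rows, n, seed=0):
    """Draw n rows uniformly at random, ignoring class labels."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


def balanced_random_sample(rows, n_per_class, seed=0):
    """Draw the same number of rows at random from every class label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))

    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < n_per_class:
            raise ValueError(f"class {label!r} has only {len(group)} rows")
        sample.extend(rng.sample(group, n_per_class))

    rng.shuffle(sample)  # avoid returning the rows grouped by class
    return sample


# Hypothetical, deliberately imbalanced toy data: 8 positive, 2 negative.
reviews = [(f"review {i}", "positive") for i in range(8)]
reviews += [(f"review {i}", "negative") for i in range(8, 10)]

plain = random_sample(reviews, n=4)
balanced = balanced_random_sample(reviews, n_per_class=2)
```

With imbalanced data, the plain random sample tends to mirror the class imbalance, while the balanced sample always contains two reviews per class, which is one plausible reason the choice of sampling technique could affect model accuracy.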
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Tiaiba, H., Sabri, L., Chibani, A., Kazar, O. (2023). Machine Learning for Drug Efficiency Prediction. In: Cunha, A., M. Garcia, N., Marx Gómez, J., Pereira, S. (eds) Wireless Mobile Communication and Healthcare. MobiHealth 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 484. Springer, Cham. https://doi.org/10.1007/978-3-031-32029-3_27
DOI: https://doi.org/10.1007/978-3-031-32029-3_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-32028-6
Online ISBN: 978-3-031-32029-3
eBook Packages: Computer Science (R0)