Abstract
Patients’ everyday narratives on social media can reveal crucial public health issues. Mining these online narratives, which have so far received little attention, may uncover otherwise hidden aspects of patients’ health status. Deep learning-based sentiment analysis (SA) approaches mostly focus on linguistic cues such as semantic orientation, or concentrate solely on extracting sentiment words. They offer richer representation capabilities and better performance, but they do not consider the related medication concepts. As a result, inaccurate recognition of related drug entities may cause the relevant expressed sentiment to be missed, leading to lower recall than desired. The frequent use of informal medical language, non-standard formats, misspellings, abbreviations, and typos in social media messages therefore has to be taken into account. In other words, efficiently quantifying the sentiment aspects of medication-related texts necessarily involves a degree of medical language comprehension. In this paper, we emphasize the importance of considering related drug entities, which keep appearing in new Unicode Versions and range from drug names and disease symptoms to drug misuse and potential adverse effects. We propose an N-gram-based convolution vocabulary scheme dedicated mainly to featurizing text in a medical setting and clarifying the related sentiment at the same level. This vectorization yields effective sentiment extraction and produces medical concept normalization under distributed dependency. The layers of this architecture form a neural network shared between the medical featuring channel and the bidirectional sentiment information detector channel.
Few approaches have been proposed for this problem. We evaluate the effectiveness and transferability of the proposed approach on five benchmark datasets and on various online medication-related posts (Twitter posts and Parkinson’s disease forum discussions), where it performed significantly better than all other baselines.
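To make the role of N-grams in handling misspelled drug mentions more concrete, the following is a minimal illustrative sketch, not the architecture proposed in the paper. It assumes character trigrams and a Jaccard overlap score; the vocabulary `DRUG_VOCAB`, the helper names, and the threshold are hypothetical choices made only for this example.

```python
# Illustrative sketch only: character n-gram matching of noisy drug mentions.
# DRUG_VOCAB, the function names, and the 0.4 threshold are assumptions,
# not values or components taken from the paper.

DRUG_VOCAB = ["levodopa", "ibuprofen", "paracetamol"]  # hypothetical vocabulary


def char_ngrams(token, n=3):
    """Return the character n-grams of a token, with boundary markers."""
    padded = f"<{token.lower()}>"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def jaccard(a, b):
    """Jaccard overlap between two collections of n-grams."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize_mention(token, vocab=DRUG_VOCAB, threshold=0.4):
    """Map a possibly misspelled token to the closest vocabulary term.

    Returns None when no term is similar enough.
    """
    grams = char_ngrams(token)
    best = max(vocab, key=lambda term: jaccard(grams, char_ngrams(term)))
    score = jaccard(grams, char_ngrams(best))
    return best if score >= threshold else None


if __name__ == "__main__":
    print(normalize_mention("levadopa"))  # prints: levodopa
    print(normalize_mention("happy"))     # prints: None
```

In this sketch, a misspelling such as "levadopa" still shares most of its trigrams with "levodopa", so it can be linked to the intended drug term before any sentiment is attributed to it, whereas an ordinary word such as "happy" shares none and is left unmapped.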












Cite this article
Grissette, H., Nfaoui, E.H. Enhancing convolution-based sentiment extractor via dubbed N-gram embedding-related drug vocabulary. Netw Model Anal Health Inform Bioinforma 9, 42 (2020). https://doi.org/10.1007/s13721-020-00248-5
DOI: https://doi.org/10.1007/s13721-020-00248-5