Automatic discovery of adverse reactions through Chinese social media

Data Mining and Knowledge Discovery

Abstract

Despite tremendous efforts made before the release of every drug, some adverse drug reactions (ADRs) may go undetected and thus cause harm to both users and pharmaceutical companies. One plausible venue for collecting evidence of such ADRs is online social media, where patients and doctors discuss medical conditions and their treatments. There is substantial previous research on ADR extraction from English online forums, but very little has been done on Chinese data. In this paper, we use posts from two popular Chinese social media platforms as our dataset. We propose a semi-supervised learning framework that detects mentions of medications and colloquial ADR terms and extracts lexico-syntactic features from natural language text to recognize positive associations between drug use and ADRs. The key contribution is an automatic label generation algorithm that requires very little manual annotation. This bootstrapping algorithm could also be applied to English data. Experimental results indicate that our algorithm outperforms the hidden Markov model and conditional random fields. With this approach, we discovered a large number of side effects for a variety of popular medicines in real-world scenarios.
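To make the idea of bootstrapped label generation concrete, the following is a minimal sketch of a generic pattern-based bootstrapping loop. It is not the algorithm proposed in the paper, whose details are not given in the abstract. The toy lexicons, the function names `weak_label` and `bootstrap_adr_terms`, and the single-word "cue" heuristic are all assumptions made for illustration, and the input is assumed to be already tokenized (Chinese text would first need word segmentation).

```python
# Hypothetical toy lexicons (assumptions for illustration only).
DRUGS = {"aspirin", "ibuprofen"}
SEED_ADRS = {"nausea", "headache"}


def weak_label(tokens, drugs=DRUGS, adrs=SEED_ADRS):
    """Return 1 if the token list mentions both a drug and an ADR term, else 0."""
    has_drug = any(t in drugs for t in tokens)
    has_adr = any(t in adrs for t in tokens)
    return int(has_drug and has_adr)


def bootstrap_adr_terms(sentences, drugs=DRUGS, seeds=SEED_ADRS, rounds=2):
    """Grow an ADR lexicon from a few seeds.

    Each round (1) collects "cue" words that directly precede a known ADR
    term, then (2) adds any word that follows a cue in a sentence that
    also mentions a drug. Stops early when no new term is found.
    """
    adrs = set(seeds)
    for _ in range(rounds):
        cues = {
            s[i - 1]
            for s in sentences
            for i, t in enumerate(s)
            if t in adrs and i > 0
        }
        new_terms = {
            s[i + 1]
            for s in sentences
            if any(t in drugs for t in s)
            for i, t in enumerate(s[:-1])
            if t in cues and s[i + 1] not in drugs
        }
        if new_terms <= adrs:
            break
        adrs |= new_terms
    return adrs
```

Under these assumptions, the expanded lexicon can be passed back to `weak_label` to generate labels for sentences that the seed lexicon alone would miss, which is the general sense in which bootstrapping reduces manual annotation.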

Notes

  1. https://www.cuttingedgeinfo.com/2011/us-phase-iv-budgets/.

  2. http://club.xywy.com/.

  3. http://www.haodf.com/.

  4. http://weibo.com.

  5. Sogou Pinyin is a Chinese input method with many available lexicons, one of which is an ADR lexicon: http://pinyin.sogou.com/dict/detail/index/644.

  6. https://translate.google.com/.

  7. AveP is defined at https://en.wikipedia.org/wiki/Information_retrieval.
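For readers unfamiliar with AveP (average precision), a small sketch of the standard information-retrieval definition may help. The function name `average_precision` is hypothetical, and the sketch assumes that every relevant item appears somewhere in the ranked list, so the number of hits equals the total number of relevant items. How the paper applies the metric is not shown here.

```python
def average_precision(ranked_relevance):
    """Average precision of a ranked list of 0/1 relevance flags.

    Precision is computed at each rank where a relevant item occurs,
    and those precisions are averaged over the relevant items found.
    Assumes all relevant items are present in the list.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0
```

For example, a ranking with relevant items at positions 1 and 3 gives (1/1 + 2/3) / 2, roughly 0.83.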

Acknowledgements

This work has been partially supported by AstraZeneca and NSFC grant 91646205.

Author information

Corresponding authors

Correspondence to Jia Wei or Kenny Q. Zhu.

Additional information

Responsible editor: Fei Wang

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A List of 79 drugs studied

Cite this article

Zhang, M., Zhang, M., Ge, C. et al. Automatic discovery of adverse reactions through Chinese social media. Data Min Knowl Disc 33, 848–870 (2019). https://doi.org/10.1007/s10618-018-00610-2

  • DOI: https://doi.org/10.1007/s10618-018-00610-2
