Abstract
Drug abuse and addiction are a serious healthcare problem and social phenomenon that has not received the scientific attention it deserves, owing to a lack of information. Today, social media have become a ubiquitous source of information in this field, since they are where addicted individuals talk about their dependencies. However, extracting salient information from social media is difficult because of their noisy, dynamic and unstructured character. In addition, natural language processing (NLP) tools are not designed to handle social data and cannot extract semantic, domain-specific entities.
In this paper, we propose a framework for real-time collection and analysis of Twitter data, at the heart of which is a personalized NLP process for extracting drug abuse information. We extend the Stanford CoreNLP pipeline with a customized annotator based on fuzzy matching against drug abuse and addiction lexicons stored in a dictionary. Run on 86 041 tweets, our system achieved 82 % accuracy.
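As a rough illustration of lexicon-based fuzzy matching, the sketch below tags tweet tokens whose spelling is close to a dictionary entry. It is not the authors' CoreNLP annotator: the lexicon entries, the labels `DRUG` and `ABUSE_EFFECT`, the function names and the 0.8 threshold are all hypothetical, and the similarity measure is Python's standard `difflib.SequenceMatcher` ratio, swapped in here because the abstract does not name the metric used.

```python
from difflib import SequenceMatcher

# Hypothetical mini-lexicon mapping terms to entity labels.
# The paper's actual dictionaries are not reproduced here.
LEXICON = {
    "heroin": "DRUG",
    "cocaine": "DRUG",
    "oxycodone": "DRUG",
    "overdose": "ABUSE_EFFECT",
    "withdrawal": "ABUSE_EFFECT",
}


def similarity(a, b):
    """Return a similarity score in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def annotate(tweet, threshold=0.8):
    """Return (token, matched_term, label) for tokens close to a lexicon term.

    The threshold is an assumed value chosen for illustration only.
    """
    annotations = []
    for raw in tweet.lower().split():
        # Strip common punctuation and Twitter markers around the token.
        token = raw.strip(".,!?#@:;\"'()")
        if not token:
            continue
        # Find the lexicon term most similar to this token.
        best_term, best_score = None, 0.0
        for term in LEXICON:
            score = similarity(token, term)
            if score > best_score:
                best_term, best_score = term, score
        if best_score >= threshold:
            annotations.append((token, best_term, LEXICON[best_term]))
    return annotations


if __name__ == "__main__":
    print(annotate("Took too much herion last night, almost an overdoze #scared"))
```

Fuzzy rather than exact lookup is what lets misspellings and informal variants, which are common in tweets, still map to a lexicon entry. The threshold trades recall against false matches and would need tuning on real data.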
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Jenhani, F., Gouider, M.S., Said, L.B. (2016). Lexicon-Based System for Drug Abuse Entity Extraction from Twitter. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures and Structures. Advanced Technologies for Data Mining and Knowledge Discovery. BDAS BDAS 2015 2016. Communications in Computer and Information Science, vol 613. Springer, Cham. https://doi.org/10.1007/978-3-319-34099-9_54
DOI: https://doi.org/10.1007/978-3-319-34099-9_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-34098-2
Online ISBN: 978-3-319-34099-9
eBook Packages: Computer Science, Computer Science (R0)