Mining officially unrecognized side effects of drugs by combining web search and machine learning

Published: 31 October 2005

Abstract

We consider the problem of finding officially unrecognized side effects of drugs. Submitting Web queries that contain a given drug name retrieves pages about the drug, but many retrieved pages are irrelevant and some relevant pages are missed. Adding the drug's active ingredient to the query retrieves more relevant pages. To eliminate irrelevant pages, we propose a machine learning process that filters them out; experiments show the process to be very effective. Because obtaining training data for the machine learning process can be time-consuming and expensive, we provide an automatic method for generating it, which is also shown to be very accurate. For three drugs, side effects not recognized by the FDA are validated by an expert. We believe that the same approach can be applied to many real-life problems and will yield high precision, and thus could lead to a new way of performing retrieval with high accuracy.
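
The two steps described in the abstract, expanding the query with the active ingredient and filtering retrieved pages with a learned classifier, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the query templates, the helper names `build_queries` and `NaiveBayesFilter`, the choice of a multinomial naive Bayes classifier, and the hand-written toy pages. The abstract does not say which classifier the authors used, and the toy labels below do not reproduce their automatic training-data method.

```python
import math
import re
from collections import Counter


def build_queries(drug, active_ingredient):
    """Return one query per name (hypothetical query templates)."""
    return [f'"{drug}" side effects', f'"{active_ingredient}" side effects']


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    """Multinomial naive Bayes with add-one smoothing.

    A stand-in classifier for illustration, not the paper's method.
    """

    def fit(self, pages, labels):
        self.classes = sorted(set(labels))
        self.docs = Counter(labels)                     # pages per class
        self.counts = {c: Counter() for c in self.classes}
        for page, label in zip(pages, labels):
            self.counts[label].update(tokenize(page))   # token counts per class
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict(self, page):
        total_docs = sum(self.docs.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.counts[c].values())
            score = math.log(self.docs[c] / total_docs)  # log prior
            for w in tokenize(page):
                if w in self.vocab:                      # skip unseen words
                    score += math.log(
                        (self.counts[c][w] + 1) / (total + len(self.vocab))
                    )
            if score > best_score:
                best, best_score = c, score
        return best


# Toy training pages, labeled by hand for this sketch only.
pages = [
    "I took the drug and developed a rash and nausea",
    "patients reported dizziness after taking the drug",
    "buy cheap pills online free shipping",
    "pharmacy discount order pills online now",
]
labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

clf = NaiveBayesFilter().fit(pages, labels)
queries = build_queries("Vioxx", "rofecoxib")  # illustrative drug names
kept = [p for p in ["nausea and dizziness after taking it",
                    "order cheap pills online"]
        if clf.predict(p) == "relevant"]
```

In this sketch `kept` retains only the pages the classifier labels as relevant, which mirrors the filtering role the abstract assigns to the machine learning process. A real system would train on far more pages and would obtain them from a search engine rather than from a hard-coded list.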

Published In

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN:1595931406
DOI:10.1145/1099554

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accurate retrieval
  2. machine learning
  3. mining side effects of drugs
  4. precision

Qualifiers

  • Article

Conference

CIKM05: Conference on Information and Knowledge Management
October 31 - November 5, 2005
Bremen, Germany

Acceptance Rates

CIKM '05 Paper Acceptance Rate: 77 of 425 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

Cited By

  • (2021) Artificial intelligence in the diagnosis and detection of heart failure: the past, present, and future. Reviews in Cardiovascular Medicine 22:4. DOI: 10.31083/j.rcm2204121. Online publication date: 22-Dec-2021
  • (2020) Use of Social Media for Pharmacovigilance Activities: Key Findings and Recommendations from the Vigi4Med Project. Drug Safety 43:9 (835-851). DOI: 10.1007/s40264-020-00951-2. Online publication date: 16-Jun-2020
  • (2018) Exploiting Online Discussions to Discover Unrecognized Drug Side Effects. Methods of Information in Medicine 52:02 (152-159). DOI: 10.3414/ME12-02-0004. Online publication date: 20-Jan-2018
  • (2016) Veri Madenciliği İle Yazılım Hata Tespiti. El-Cezeri Fen ve Mühendislik Dergisi 3:2. DOI: 10.31202/ecjse.264197. Online publication date: 31-May-2016
  • (2011) Neural Network application in diagnosis of patient: A case study. International Conference on Computer Networks and Information Technology (245-249). DOI: 10.1109/ICCNIT.2011.6020937. Online publication date: Jul-2011
