DOI: 10.1145/2110363.2110395
research-article

Robust discovery of local patterns: subsets and stratification in adverse drug reaction surveillance

Published: 28 January 2012

Abstract

The identification of unanticipated statistical associations is a core activity in exploratory analysis of high-dimensional biomedical data. In particular, post-marketing surveillance for harmful effects of medicines relies on effective algorithms to detect associations between drugs and suspected adverse drug reactions. VigiBase, the WHO global database of individual case safety reports, holds over six million reports and covers more than ten thousand medicinal products and thousands of distinct medical concepts. It collects data from more than 100 countries, and its first reports date back to the late 1960s. Local patterns may not show in database-wide analyses, and many others will vary substantially in strength or direction across data subsets. Still, routine screening of this and similar databases relies on global measures of association. In this paper, we propose a framework to detect local associations and characterise subset variability in high-dimensional data. We use shrinkage observed-to-expected ratios and employ multiple stratification by one or two covariates at a time. We consider subset-specific measures, stratified-then-pooled adjusted measures, and a novel measure to detect associations that hold in all but one subset. We use covariate permutation to select stratification covariates and to gauge the vulnerability to spurious associations. Chance findings are a major concern: a naive subgroup analysis yielded more than 50% spurious local associations in VigiBase. To improve on this, we enforce conservative credibility intervals and also look for subset-specific associations that reproduce in at least one additional subset (e.g. two time periods). In addition to 119,500 global associations between drugs and medical events in VigiBase, this robust subgroup analysis uncovered 14,600 local associations, of which an estimated 2.2% are spurious.
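
The measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the shrinkage ratio, so the sketch assumes the form log2((O + 0.5) / (E + 0.5)) with expected counts computed under independence of drug and event. The report layout, the function names, and the toy counts are all hypothetical, and the credibility intervals and the all-but-one-subset measure are left out.

```python
import math
import random
from collections import defaultdict

# A report is a tuple (has_drug, has_event, stratum); this layout is hypothetical.

def counts(reports):
    """Return (observed, n_drug, n_event, n_total) for one drug-event pair."""
    n_drug = sum(1 for d, _, _ in reports if d)
    n_event = sum(1 for _, e, _ in reports if e)
    observed = sum(1 for d, e, _ in reports if d and e)
    return observed, n_drug, n_event, len(reports)

def expected(reports):
    """Expected co-reporting count if drug and event were independent."""
    _, n_drug, n_event, n = counts(reports)
    return n_drug * n_event / n if n else 0.0

def shrunk_log_oe(obs, exp, shrink=0.5):
    """Assumed shrinkage form: log2 of (O + shrink) / (E + shrink)."""
    return math.log2((obs + shrink) / (exp + shrink))

def by_stratum(reports):
    """Group reports by their stratum label."""
    groups = defaultdict(list)
    for report in reports:
        groups[report[2]].append(report)
    return dict(groups)

def global_measure(reports):
    """Measure computed on the given reports as a whole."""
    return shrunk_log_oe(counts(reports)[0], expected(reports))

def subset_measures(reports):
    """Subset-specific measure for each stratum."""
    return {s: global_measure(g) for s, g in by_stratum(reports).items()}

def pooled_adjusted_measure(reports):
    """Stratified-then-pooled: sum per-stratum expected counts, then compare."""
    pooled_expected = sum(expected(g) for g in by_stratum(reports).values())
    return shrunk_log_oe(counts(reports)[0], pooled_expected)

def permute_covariate(reports, rng):
    """Shuffle stratum labels across reports, keeping drug and event fixed."""
    labels = [s for _, _, s in reports]
    rng.shuffle(labels)
    return [(d, e, s) for (d, e, _), s in zip(reports, labels)]

def make_stratum(name, both, drug_only, event_only, neither):
    """Build toy reports for one stratum from its 2x2 cell counts."""
    return ([(True, True, name)] * both + [(True, False, name)] * drug_only
            + [(False, True, name)] * event_only + [(False, False, name)] * neither)

# Toy data: a strong association in the small stratum "A", none in "B".
reports = make_stratum("A", 10, 10, 10, 170) + make_stratum("B", 60, 340, 340, 1060)

overall = global_measure(reports)            # database-wide measure
local = subset_measures(reports)             # one value per stratum
adjusted = pooled_adjusted_measure(reports)  # stratified-then-pooled
permuted = subset_measures(permute_covariate(reports, random.Random(0)))

print(overall, local, adjusted, permuted)
```

In this toy data the database-wide measure is below zero while stratum "A" shows a clearly positive value, which is the kind of local pattern a global screen would miss. Repeating the subset analysis on label-permuted data gives a rough sense of how many subset-specific associations chance alone would produce.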

Published In

IHI '12: Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
January 2012
914 pages
ISBN: 9781450307819
DOI: 10.1145/2110363

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adverse drug reaction surveillance
  2. associations
  3. frequent patterns
  4. multiple comparisons
  5. permutation test
  6. stratification
  7. subset analysis

Qualifiers

  • Research-article

Conference

IHI '12: ACM International Health Informatics Symposium
January 28-30, 2012
Miami, Florida, USA
