These proceedings are being made available in advance of the conference on April 2, 2020, the original date of ACM CHIL 2020 that was impacted by COVID-19. Research in machine learning and health requires a cross-disciplinary representation of clinicians and researchers in machine learning, health policy, causality, fairness, and other related areas. The goal of the ACM CHIL conference is to foster excellent research that addresses the unique challenges and opportunities that arise at the intersection of machine learning and health.
Proceeding Downloads
Defining admissible rewards for high-confidence policy evaluation in batch reinforcement learning
A key impediment to reinforcement learning (RL) in real applications with limited, batch data is in defining a reward function that reflects what we implicitly know about reasonable behaviour for a task and allows for robust off-policy evaluation. In ...
Variational learning of individual survival distributions
The abundance of modern health data provides many opportunities for the use of machine learning techniques to build better statistical models to improve clinical decision making. Predicting time-to-event distributions, also known as survival analysis, ...
Interpretable subgroup discovery in treatment effect estimation with application to opioid prescribing guidelines
- Chirag Nagpal
- Dennis Wei
- Bhanukiran Vinzamuri
- Monica Shekhar
- Sara E. Berger
- Subhro Das
- Kush R. Varshney
The dearth of prescribing guidelines for physicians is one key driver of the current opioid epidemic in the United States. In this work, we analyze medical and pharmaceutical claims data to draw insights on characteristics of patients who are more prone ...
Adverse drug reaction discovery from electronic health records with deep neural networks
Adverse drug reactions (ADRs) are detrimental and unexpected clinical incidents caused by drug intake. The increasing availability of massive quantities of longitudinal event data such as electronic health records (EHRs) has redefined ADR discovery as a ...
CaliForest: calibrated random forest for health data
Real-world predictive models in healthcare should be evaluated in terms of discrimination, the ability to differentiate between high and low risk events, and calibration, or the accuracy of the risk estimates. Unfortunately, calibration is often ...
BMM-Net: automatic segmentation of edema in optical coherence tomography based on boundary detection and multi-scale network
Retinal effusions and cysts caused by the leakage of damaged macular vessels and choroid neovascularization are symptoms of many ophthalmic diseases. Optical coherence tomography (OCT), which provides clear 10-layer cross-sectional images of the retina, ...
Survival cluster analysis
Conventional survival analysis approaches estimate risk scores or individualized time-to-event distributions conditioned on covariates. In practice, there is often great population-level phenotypic heterogeneity, resulting from (unknown) subpopulations ...
An adversarial approach for the robust classification of pneumonia from chest radiographs
While deep learning has shown promise in the domain of disease classification from medical images, models based on state-of-the-art convolutional neural network architectures often exhibit performance loss due to dataset shift. Models trained using data ...
Explaining an increase in predicted risk for clinical alerts
- Michaela Hardt
- Alvin Rajkomar
- Gerardo Flores
- Andrew Dai
- Michael Howell
- Greg Corrado
- Claire Cui
- Moritz Hardt
Much work aims to explain a model's prediction on a static input. We consider explanations in a temporal setting where a stateful dynamical model produces a sequence of risk estimates given an input at each time step. When the estimated risk increases, ...
Fast learning-based registration of sparse 3D clinical images
We introduce SparseVM, a method that registers clinical-quality 3D MR scans both faster and more accurately than previously possible. Deformable alignment, or registration, of clinical scans is a fundamental task for many clinical neuroscience studies. ...
Multiple instance learning for predicting necrotizing enterocolitis in premature infants using microbiome data
Necrotizing enterocolitis (NEC) is a life-threatening intestinal disease that primarily affects preterm infants during their first weeks after birth. Mortality rates associated with NEC are 15-30%, and surviving infants are susceptible to multiple ...
Hurtful words: quantifying biases in clinical contextual word embeddings
In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks. We pretrain deep embedding models (BERT) on medical ...
Disease state prediction from single-cell data using graph attention networks
Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into health and disease, it has not been ...
Using SNOMED to automate clinical concept mapping
The International Classification of Disease (ICD) is a widely used diagnostic ontology for the classification of health disorders and a valuable resource for healthcare analytics. However, ICD is an evolving ontology and subject to periodic revisions (...
MMiDaS-AE: multi-modal missing data aware stacked autoencoder for biomedical abstract screening
Systematic review (SR) is an essential process to identify, evaluate, and summarize the findings of all relevant individual studies concerning health-related questions. However, conducting a SR is labor-intensive, as identifying relevant studies is a ...
Hidden stratification causes clinically meaningful failures in machine learning for medical imaging
Machine learning models for medical image analysis often suffer from poor performance on important subsets of a population that are not identified during training or testing. For example, overall performance of a cancer detection model may be high, but ...
Interactive hybrid approach to combine machine and human intelligence for personalized rehabilitation assessment
Automated assessment of rehabilitation exercises using machine learning has a potential to improve current rehabilitation practices. However, it is challenging to completely replicate therapist's decision making on the assessment of patients with ...
Extracting medical entities from social media
Accurately extracting medical entities from social media is challenging because people use informal language with different expressions for the same concept, and they also make spelling mistakes. Previous work either focused on specific diseases (e.g., ...
Population-aware hierarchical bayesian domain adaptation via multi-component invariant learning
While machine learning is rapidly being developed and deployed in health settings such as influenza prediction, there are critical challenges in using data from one environment to predict in another due to variability in features. Even within disease ...
TASTE: temporal and static tensor factorization for phenotyping electronic health records
- Ardavan Afshar
- Ioakeim Perros
- Haesun Park
- Christopher deFilippi
- Xiaowei Yan
- Walter Stewart
- Joyce Ho
- Jimeng Sun
Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., heart failure group and diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool ...
Analyzing the role of model uncertainty for electronic health records
- Michael W. Dusenberry
- Dustin Tran
- Edward Choi
- Jonas Kemp
- Jeremy Nixon
- Ghassen Jerfel
- Katherine Heller
- Andrew M. Dai
In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise ...
Deidentification of free-text medical records using pre-trained bidirectional transformers
The ability of caregivers and investigators to share patient data is fundamental to many areas of clinical practice and biomedical research. Prior to sharing, it is often necessary to remove identifiers such as names, contact details, and dates in order ...
MIMIC-Extract: a data extraction, preprocessing, and representation pipeline for MIMIC-III
- Shirly Wang
- Matthew B. A. McDermott
- Geeticka Chauhan
- Marzyeh Ghassemi
- Michael C. Hughes
- Tristan Naumann
Machine learning for healthcare researchers face challenges to progress and reproducibility due to a lack of standardized processing frameworks for public datasets. We present MIMIC-Extract, an open source pipeline for transforming the raw electronic ...
Index Terms
- Proceedings of the ACM Conference on Health, Inference, and Learning
Acceptance Rates
| Year | Submitted | Accepted | Rate |
|---|---|---|---|
| CHIL '21 | 110 | 27 | 25% |
| Overall | 110 | 27 | 25% |