CAMIL: Clustering and Assembly with Multiple Instance Learning for phenotype prediction

Abstract:

The recent advent of Metagenome-Wide Association Studies (MGWAS) has allowed for increased accuracy in the prediction of patient phenotype (disease), but has also presented big data challenges. Meanwhile, Multiple Instance Learning (MIL) is useful in the domain of bioinformatics because, in addition to classifying patient phenotype, it can also identify individual parts of the microbiome that are indicative of that phenotype, leading to a better understanding of the disease. We demonstrate a novel, efficient, and effective MIL-based computational pipeline to predict patient phenotype from MGWAS data. Specifically, we use a Bag of Words method, which has been shown to be one of the most effective and efficient MIL methods. This involves assembling the metagenomic sequence data, clustering the assembled contigs, extracting features from the contigs, and using an SVM classifier to predict patient labels and identify the most relevant read clusters. With the exception of the given labels for the patients, this entire process is de novo (unsupervised). We use data from a well-known MGWAS of patients with Type-2 Diabetes and show that our pipeline significantly outperforms the classifier used in that study, as well as other common MIL methods. We call our pipeline “CAMIL”, which stands for Clustering and Assembly with Multiple Instance Learning.
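
To make the Bag of Words idea in the abstract concrete, the following is a minimal sketch of a generic bag-of-words MIL classifier, not the CAMIL implementation itself. It assumes each patient is a "bag" of instance feature vectors (standing in for per-contig features), uses scikit-learn's `KMeans` and a linear `SVC` as stand-ins for the clustering and SVM steps, and runs on synthetic data. The helper names (`make_toy_bags`, `bag_of_words_features`), the cluster count, and all parameter values are illustrative assumptions; metagenomic assembly is not modeled.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def make_toy_bags(n_bags=40, seed=0):
    """Hypothetical synthetic data: one bag of 5-D instance vectors per patient."""
    rng = np.random.default_rng(seed)
    bags, labels = [], []
    for i in range(n_bags):
        label = i % 2
        n_instances = int(rng.integers(20, 40))
        bag = rng.normal(0.0, 1.0, size=(n_instances, 5))
        if label == 1:
            # In positive bags, shift a third of the instances so that
            # some clusters become indicative of the label.
            bag[: n_instances // 3] += 4.0
        bags.append(bag)
        labels.append(label)
    return bags, np.array(labels)


def bag_of_words_features(bags, kmeans):
    """Represent each bag as a normalized histogram over cluster assignments."""
    k = kmeans.n_clusters
    features = np.zeros((len(bags), k))
    for i, bag in enumerate(bags):
        counts = np.bincount(kmeans.predict(bag), minlength=k)
        features[i] = counts / counts.sum()
    return features


bags, y = make_toy_bags()

# Cluster all instances from all bags without using the labels (unsupervised).
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(np.vstack(bags))

# One fixed-length feature vector per bag.
X = bag_of_words_features(bags, kmeans)

# A linear SVM predicts bag labels; its weights give one score per cluster.
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Rank clusters by absolute weight as a rough proxy for relevance.
ranked_clusters = np.argsort(-np.abs(clf.coef_[0]))
print("clusters ranked by |weight|:", ranked_clusters)
```

In this sketch the only supervised step is the final SVM fit, which mirrors the abstract's statement that everything apart from the patient labels is unsupervised. Ranking clusters by the magnitude of the linear SVM weights is one simple way to surface candidate clusters, and is an assumption here rather than the method the paper describes.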

Published in: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Date of Conference: 15-18 December 2016

Date Added to IEEE Xplore: 19 January 2017

DOI: 10.1109/BIBM.2016.7822489

Conference Location: Shenzhen, China
