Deep Learning Approach to Biogeographical Ancestry Inference

https://doi.org/10.1016/j.procs.2019.09.210
Under a Creative Commons license
open access

Abstract

Biogeographical ancestry (BGA) inference is based on an understanding of how genetic diversity is distributed among population groups. BGA inference is used to detect and measure the population structure that represents the natural assignment in genetic terms, to identify genetic patterns found in individuals’ genotypes, and to estimate an individual’s BGAs. In forensics, BGA inference at the individual level makes possible a more complete identification of a missing person or suspect. Current machine learning approaches to BGA inference, based on Bayesian theory and principal component analysis, cannot operate on the sequence data directly and require predefined features extracted from the sequence using prior knowledge. In this paper, we survey the state of the art in BGA inference and propose a new deep learning approach that requires no prior feature extraction, with the aim of finding hidden genetic structure and providing more accurate predictions. Our experiments on the Human Genome Diversity Project (HGDP) dataset show better results for the proposed approach.

Keywords

Biogeographical ancestry inference
machine learning
