DeepGenBind: a novel deep learning model for predicting transcription factor binding sites | IEEE Conference Publication | IEEE Xplore


Abstract:

Transcription factors are a class of proteins that bind directly or indirectly to RNA polymerases and regulate the initiation of transcription by recognizing cis-acting elements in the DNA sequence. Predicting transcription factor binding sites (TFBS) is an important part of studying gene transcriptional regulation: accurate TFBS prediction helps in understanding the spatiotemporal nature of the transcriptional regulation of target genes by different transcription factors. In recent years, an increasing number of deep learning methods have been used to predict TFBS; however, existing methods still leave considerable room for improvement. In this paper, we present DeepGenBind, a deep learning framework that combines convolutional neural networks (CNNs) and recurrent neural networks to systematically identify transcription factor binding sites from DNA sequences. The novelty of the proposed approach rests on two key aspects: (1) the framework combines a three-layer parallel CNN with a two-layer LSTM to efficiently extract useful features from large-scale genomic sequences obtained by high-throughput sequencing techniques; and (2) DNA sequences are transformed with k-mer coding, with the transformed short sequences allowing the data to be read more effectively. Experimental results on 165 datasets from ENCODE show that DeepGenBind outperforms several other state-of-the-art methods in identifying transcription factor binding sites. In addition, we tested the effect of varying the k-mer vector length, showing how model performance changes under different k-mer-related parameter settings. Overall, DeepGenBind is a useful tool for the cost-effective and accurate identification of potential transcription factor binding sites in biological genomes.
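
The k-mer coding step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the value of k, the stride, or how k-mers are turned into vectors, so the function names, the defaults `k=3` and `stride=1`, and the integer vocabulary below are assumptions chosen for illustration, not the authors' implementation.

```python
from itertools import product


def kmer_tokens(seq, k=3, stride=1):
    """Split a DNA sequence into k-mers taken every `stride` bases (assumed scheme)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k=3, alphabet="ACGT"):
    """Map every possible k-mer over the alphabet to an index starting at 1.

    Index 0 is reserved for unknown k-mers, e.g. those containing 'N'.
    """
    return {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}


def encode(seq, k=3, stride=1):
    """Turn a DNA sequence into a list of integer k-mer indices."""
    vocab = build_vocab(k)
    return [vocab.get(tok, 0) for tok in kmer_tokens(seq, k, stride)]
```

For example, `kmer_tokens("ACGTAC", k=3)` yields `["ACG", "CGT", "GTA", "TAC"]`. In a model of the kind described, such integer indices would typically be fed to an embedding layer ahead of the convolutional and recurrent layers; that wiring is likewise an assumption here.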
Date of Conference: 06-08 December 2022
Date Added to IEEE Xplore: 02 January 2023
Conference Location: Las Vegas, NV, USA
