Discovering Motifs in Biological Sequences Using the Micron Automata Processor


Abstract:

Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26, 11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFAs). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFAs in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single Automata Processor board, by estimating run-times for problem instances (39, 18) and (40, 17). The paper serves as a useful guide to solving problems using this new accelerator technology.
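
To make the problem statement concrete, the following is a minimal brute-force sketch of the motif search definition given in the abstract. It is not the NFA-based algorithm the paper proposes, and it is practical only for very small l, since it enumerates every candidate string of length l. The function names (`hamming`, `occurs_in`, `motif_search`) and the DNA alphabet default are assumptions made for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_in(candidate: str, sequence: str, d: int) -> bool:
    """True if some window of `sequence` is within d substitutions of `candidate`."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def motif_search(sequences, l, d, q, alphabet="ACGT"):
    """Return every length-l string that occurs, with at most d substitutions,
    in at least q of the given sequences.

    Illustrative only: tests all len(alphabet)**l candidates, so the cost
    grows exponentially with l.
    """
    motifs = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        support = sum(occurs_in(candidate, s, d) for s in sequences)
        if support >= q:
            motifs.append(candidate)
    return motifs
```

For example, with the three toy sequences `["ACGTAC", "TTGTAA", "CCGTTT"]`, calling `motif_search(seqs, l=3, d=0, q=2)` reports the 3-mers that appear exactly in at least two sequences, while raising d to 1 lets a candidate such as `GTA` match the window `GTT` in the third sequence as well.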
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics (Volume: 13, Issue: 1, 01 Jan.-Feb. 2016)
Page(s): 99 - 111
Date of Publication: 06 May 2015

PubMed ID: 26886735
