Abstract:
Recognizing the regulatory motifs of co-regulated genes is essential for understanding their regulatory mechanisms. However, automatically extracting regulatory motifs from a data set of upstream noncoding DNA sequences of a family of co-regulated genes is difficult because regulatory motifs are often subtle and inexact. The problem is further complicated by corruption of the data sets. In this paper, a new approach called mismatch-allowed probabilistic suffix tree motif extraction (MISAE) is proposed. It combines the mismatch-allowed probabilistic suffix tree, a probabilistic model, with local prediction to extract regulatory motifs. The proposed approach is tested on 15 co-regulated gene families and compares favorably with other state-of-the-art approaches. Moreover, MISAE performs well on "corrupted" data sets: it is able to extract the motif from a "corrupted" data set in which fewer than one fourth of the sequences contain the real motif.
Published in: Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004.
Date of Conference: 19-19 August 2004
Date Added to IEEE Xplore: 08 October 2004
Print ISBN: 0-7695-2194-0
PubMed ID: 16448011