No abstract available.
The sequence of the human genome (abstract only)
A consensus sequence of the euchromatic portion of the human genome has been generated by the whole genome shotgun sequencing method that was developed while sequencing the genomes of Haemophilus influenzae and Drosophila melanogaster. The 2.9 billion bp ...
A new approach to sequence comparison: normalized sequence alignment
The Smith-Waterman algorithm for local sequence alignment is one of the most important techniques in computational molecular biology. This ingenious dynamic programming approach was designed to reveal the highly conserved fragments by discarding poorly ...
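As a point of reference for the abstract above, the following is a minimal sketch of the classic Smith-Waterman local alignment score. It is not the normalized alignment method the paper proposes. The function name `smith_waterman_score` and the scoring values (match 2, mismatch -1, linear gap -1) are illustrative assumptions, not taken from the paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Scoring values are assumptions chosen for illustration; a linear
    gap penalty is used for simplicity.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option discards poorly scoring prefixes, which is
            # what makes the alignment local rather than global.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best
```

Keeping only two rows of the table is enough when just the score is needed; recovering the aligned fragments themselves would require the full table or a traceback structure.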
Context-specific Bayesian clustering for gene expression data
The recent growth in genomic data and measurement of genome-wide expression patterns allows gene regulation by transcription factors to be examined using computational tools. In this work, we present a class of mathematical models that help in understanding ...
An optimal procedure for gap closing in whole genome shotgun sequencing
Tettelin et al. proposed a new method for closing the gaps in whole genome shotgun sequencing projects. The method uses a multiplex PCR strategy in order to minimize the time and effort required to sequence the DNA in the missing gaps. This procedure ...
Class discovery in gene expression data
Recent studies (Alizadeh et al. [1]; Bittner et al. [5]; Golub et al. [11]) demonstrate the discovery of putative disease subtypes from gene expression data. The underlying computational problem is to partition the set of sample tissues into ...
On the predictive power of sequence similarity in yeast
Perhaps the most direct way to infer functional linkage of proteins is through structural similarity. However, structure determination lags behind DNA sequencing. Here we show that sequence similarity based on nucleotide sequences alone between ORFs in ...
Algorithms for phylogenetic footprinting
Phylogenetic footprinting is a technique that identifies regulatory elements by finding unusually well conserved regions in a set of orthologous non-coding DNA sequences from multiple species. In an earlier paper, we presented an exact algorithm that ...
Predicting the β-helix fold from protein sequence data
A method is presented that uses β-strand interactions to predict the right-handed β-helix super-secondary structural motif in protein sequences. A program called BetaWrap implements this method, and is shown to score known β-helices above non-β-helices ...
Information processing by cells and biologists (abstract only)
The core agenda of post-WWII molecular biology has been defined as the molecular understanding of how genetic information was transmitted and read out (see for example Stent 1968), and, by the 1950's, the analogy between the tape in a Turing machine and ...
Finding motifs using random projections
Pevzner and Sze [23] considered a precise version of the motif discovery problem and simultaneously issued an algorithmic challenge: find a motif M of length 15, where each planted instance differs from M in 4 positions. Whereas previous algorithms all ...
Rapid significance estimation in local sequence alignment with gaps
In order to assess the significance of sequence alignments it is crucial to know the distribution of alignment scores of pairs of random sequences. For gapped local alignment it is empirically known that the shape of this distribution is of the Gumbel ...
Regulatory element detection using correlation with expression (abstract only)
We present a new computational method for discovering cis-regulatory elements which circumvents the need to cluster genes based on their expression profiles. Based on a model in which upstream motifs contribute additively to the expression level of a ...
Gene-finding via tandem mass spectrometry
We propose a new gene-finding methodology that combines high performance liquid chromatography (HPLC)-tandem mass spectrometry experiments with a fast computer algorithm to locate coding regions and introns. Proteins are first extracted from cells and ...
Algorithms for identifying protein cross-links via tandem mass spectrometry
Cross-linking technology combined with tandem mass spectrometry (MS-MS) is a powerful method that provides a rapid solution to the discovery of protein-protein interactions and protein structures. We studied the problem of detecting cross-linked ...
Hunger for new technologies, metrics, and spatiotemporal models in functional genomics (abstract only)
Functional genomics, as a field, is applying genomic self-improvement protocols (cost-effective, comprehensive, precise, accurate, and useful) to the kinetics of complex cellular systems. Radical surgery in functional biology aims to mimic the success ...
Fast recovery of evolutionary trees with thousands of nodes
We present a novel distance-based algorithm for evolutionary tree reconstruction. Our algorithm reconstructs the topology of a tree with n leaves in O(n²) time using O(n) working space. In the general Markov model of evolution the algorithm recovers the ...
Geometric algorithms for the analysis of 2D-electrophoresis gels
In proteomics 2-dimensional gel electrophoresis (2-DE) is a separation technique for proteins. The resulting protein spots can be identified by either using picking robots and subsequent mass spectrometry or by visual cross inspection of a new gel image ...
Analysis techniques for microarray time-series data
We introduce new methods for the analysis of short-term time-series data, and apply them to gene expression data in yeast. These include (1) methods for automated period detection in a predominately cycling data set and (2) phase detection between phase-...
A structural EM algorithm for phylogenetic inference
A central task in the study of evolution is the reconstruction of a phylogenetic tree from sequences of current-day taxa. A well supported approach to tree reconstruction performs maximum likelihood (ML) analysis. Unfortunately, searching for the ...
Optimal sequencing by hybridization in rounds
Sequencing by hybridization (SBH) is a method for reconstructing a sequence over a small finite alphabet from a collection of probes (substrings). Substring queries can be arranged on an array (SBH chip) and then a combinatorial method is used to ...
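To make the probe collection in the abstract above concrete, here is a small sketch that computes the set of all length-k substrings of a sequence, often called its spectrum. The abstract is truncated before the reconstruction step, so this sketch does not cover the paper's multi-round method. The function name `spectrum` and the example values are assumptions for illustration.

```python
def spectrum(seq, k):
    """Return the set of all length-k substrings (probes) of seq.

    This models the classic single-round SBH setting, in which the
    array reports which k-mers occur in the target but not how often
    or where. That setting is an assumption, not the paper's scheme.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def is_consistent(candidate, observed, k):
    """Check whether a candidate sequence yields exactly the observed spectrum."""
    return spectrum(candidate, k) == observed
```

Because the spectrum is a set, repeated k-mers collapse to one entry, which is one source of the ambiguity that reconstruction methods have to resolve.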
Efficient algorithms for lateral gene transfer problems
This paper develops a model for lateral gene transfer events (a.k.a. horizontal gene transfer events) between a set of gene trees T1, T2, …, Tk and a species tree S. To the best of our knowledge, this model possesses a higher degree of biological and ...
The greedy path-merging algorithm for sequence assembly
Two different approaches to determining the human genome are currently being pursued: one is the “clone-by-clone” approach, employed by the publicly funded Human Genome Project, and the other is the “whole genome shotgun” approach, favored by ...
Extracting structural information using time-frequency analysis of protein NMR data
High-throughput, data-directed computational protocols for Structural Genomics (or Proteomics) are required in order to evaluate the protein products of genes for structure and function at rates comparable to current gene-sequencing technology. To ...
Separating repeats in DNA sequence assembly
One of the key open problems in large-scale DNA sequence assembly is the correct reconstruction of sequences that contain repeats. A long repeat can confound a sequence assembler into falsely overlaying fragments that sample its copies, effectively ...
A NMR-spectra-based scoring function for protein docking
A well studied problem in the area of Computational Molecular Biology is the so-called Protein-Protein Docking problem (PPD) that can be formulated as follows: Given two proteins A and B that form a protein complex, compute the 3D-structure of the ...
101 optimal PDB structure alignments: a branch-and-cut algorithm for the maximum contact map overlap problem
Structure comparison is a fundamental problem for structural genomics. A variety of structure comparison methods were proposed and several protein structure classification servers (e.g., SCOP, DALI, CATH) were designed based on them, and are extensively ...
Comparative analysis of organelle genomes, a biologist's view of computational challenges (abstract only)
With genomic data (generated by classical, functional, structural, proteo- and other 'omic' approaches) accumulating at a stupendous rate, there is an ever increasing need for the development of new, more efficient and more sensitive computational ...
DNA segmentation as a model selection process
Previous divide-and-conquer segmentation analyses of DNA sequences do not provide a satisfactory stopping criterion for the recursion. This paper proposes that segmentation be considered as a model selection process. Using the tools in model selection, ...
Edit distance between two RNA structures
Arc-annotated sequences are useful in representing the structural information of RNA sequences. Typically, RNA secondary and tertiary structures could be represented by a set of nested arcs and a set of crossing arcs, respectively. As the specified RNA ...
Genetics and genomics: impact on drug discovery and development
complex diseases is limited by a still rudimentary understanding of the molecular basis of disease as well as of drug action. At the heart of this is our current inability to account for inter-individual differences in disease etiology and drug ...
Acceptance Rates
Year | Submitted | Accepted | Rate |
---|---|---|---|
RECOMB '03 | 175 | 35 | 20% |
RECOMB '02 | 118 | 35 | 30% |
RECOMB '01 | 128 | 35 | 27% |
RECOMB '97 | 117 | 43 | 37% |
Overall | 538 | 148 | 28% |