A Branch-and-Reduce Algorithm for the Contact Map Overlap Problem

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Abstract

A fundamental problem in molecular biology is the comparison of 3-dimensional protein folds in order to develop similarity measures and exploit them for protein clustering, database searches, and drug design. Contact map overlap (CMO) is one of the most reliable and robust measures of protein structure similarity. Fold comparison can be done by aligning the amino acid residues of two proteins in a way that maximizes the number of common residue contacts. CMO maximization is gaining increasing attention because it results in protein clusterings in good agreement with classification by experts. However, CMO maximization is an \({\mathcal{NP}}\)-hard problem and few exact algorithms exist for solving this problem to global optimality.
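To make the objective concrete, the following minimal sketch evaluates the contact map overlap of a given alignment and, for toy instances only, finds the maximum by exhaustive enumeration. It assumes the usual formulation in which residues are numbered along the chain, a contact map is a set of residue pairs, and an alignment is an order-preserving one-to-one matching of a subset of residues. The function names `contact_overlap` and `brute_force_cmo` and the example data are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def contact_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A whose aligned images are contacts of protein B.

    contacts_a, contacts_b: iterables of residue-index pairs (i, j).
    alignment: list of (residue_in_a, residue_in_b) pairs.
    """
    mapping = dict(alignment)
    b_set = {frozenset(c) for c in contacts_b}
    return sum(
        1
        for i, j in contacts_a
        if i in mapping
        and j in mapping
        and frozenset((mapping[i], mapping[j])) in b_set
    )


def brute_force_cmo(n_a, contacts_a, n_b, contacts_b):
    """Exhaustively search all order-preserving alignments (tiny inputs only)."""
    best_score, best_alignment = 0, []
    for k in range(1, min(n_a, n_b) + 1):
        # combinations() yields sorted tuples, so zipping two of them
        # always gives an order-preserving (non-crossing) alignment.
        for rows in combinations(range(n_a), k):
            for cols in combinations(range(n_b), k):
                alignment = list(zip(rows, cols))
                score = contact_overlap(contacts_a, contacts_b, alignment)
                if score > best_score:
                    best_score, best_alignment = score, alignment
    return best_score, best_alignment


# Hypothetical toy instance: 4 residues versus 5 residues.
toy_a = [(0, 2), (1, 3), (0, 3)]
toy_b = [(0, 3), (1, 4), (0, 4)]
score, alignment = brute_force_cmo(4, toy_a, 5, toy_b)
print(score, alignment)
```

The number of candidate alignments grows exponentially with protein length, which is why exhaustive enumeration is only usable on toy inputs and why exact methods need bounding and pruning.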

In this paper, we propose a branch-and-reduce exact algorithm for the CMO problem. Contrary to previous approaches, we do not transform CMO to other combinatorial optimization problems for solution. Instead, we address the problem directly in its natural form. By exploiting the problem’s mathematical structure, we develop bounding and reduction procedures that lead to a very efficient algorithm. We present extensive computational results for over 36000 test problems from the literature. These results demonstrate that our algorithm is significantly faster and solves many more challenging test sets than the best previous algorithms for CMO. Furthermore, the algorithm results in protein clusters that are in excellent agreement with the SCOP database.
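The abstract does not describe the bounding and reduction procedures themselves, so the sketch below is not the authors' algorithm. It only illustrates the general idea of working on CMO directly with a search tree and an upper bound: residues of the first protein are processed in chain order, each is either aligned to a later residue of the second protein or left unaligned, and a branch is pruned when a deliberately naive bound (current score plus the number of contacts not yet fully decided) cannot beat the incumbent. The name `branch_and_bound_cmo` and the bound are assumptions made for illustration.

```python
def branch_and_bound_cmo(n_a, contacts_a, n_b, contacts_b):
    """Generic branch-and-bound for contact map overlap (illustrative only)."""
    b_set = {tuple(sorted(c)) for c in contacts_b}

    # closing[q] lists every p < q such that (p, q) is a contact of protein A.
    closing = [[] for _ in range(n_a)]
    for p, q in (tuple(sorted(c)) for c in contacts_a):
        closing[q].append(p)

    # remaining[i] = number of A contacts whose larger endpoint is >= i,
    # i.e. contacts that can still be gained once residues 0..i-1 are decided.
    remaining = [0] * (n_a + 1)
    for i in range(n_a - 1, -1, -1):
        remaining[i] = remaining[i + 1] + len(closing[i])

    best = {"score": 0, "alignment": []}
    mapping = {}  # partial alignment: residue of A -> residue of B

    def search(i, next_free, score):
        if score > best["score"]:
            best["score"] = score
            best["alignment"] = sorted(mapping.items())
        # Stop at a leaf, or prune when the naive bound cannot improve.
        if i == n_a or score + remaining[i] <= best["score"]:
            return
        for j in range(next_free, n_b):
            # Contacts (p, i) are gained when p is aligned and (mapping[p], j)
            # is a contact of B; mapping[p] < j because order is preserved.
            gained = sum(
                1
                for p in closing[i]
                if p in mapping and (mapping[p], j) in b_set
            )
            mapping[i] = j
            search(i + 1, j + 1, score + gained)
            del mapping[i]
        search(i + 1, next_free, score)  # leave residue i unaligned

    search(0, 0, 0)
    return best["score"], best["alignment"]


# Hypothetical toy instance: 4 residues versus 5 residues.
result = branch_and_bound_cmo(
    4, [(0, 2), (1, 3), (0, 3)], 5, [(0, 3), (1, 4), (0, 4)]
)
print(result)
```

A bound this weak prunes little on realistic proteins; the point of problem-specific bounding and reduction, as the abstract claims, is to cut the search tree far more aggressively.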

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xie, W., Sahinidis, N.V. (2006). A Branch-and-Reduce Algorithm for the Contact Map Overlap Problem. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_43

  • DOI: https://doi.org/10.1007/11732990_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33295-4

  • Online ISBN: 978-3-540-33296-1

  • eBook Packages: Computer Science, Computer Science (R0)
