DOI:10.1145/3180382.3180405
research-article

Improvement of template-based protein structure prediction by using chimera alignment

Published: 18 January 2018

Abstract

Determining a protein's structure provides information that is useful in many practical applications in the biological sciences, such as virtual screening and function prediction. Template-based modeling can predict protein structures precisely when a good template structure is available in a database. However, such predictions sometimes fail even when a template of sufficient quality is found, because the sequence alignment used for modeling is incorrect.
In this paper, we propose a new method for improving sequence alignment in single-template-based modeling. The sequence alignments used as input to template-based modeling are normally generated by homology search tools, and they vary depending on the search algorithm used. Any single alignment is often imperfect, but most alignments contain parts, at different positions, that are suitable for template-based modeling. The proposed method therefore constructs a profile of multiple alignments to obtain a consensus among the alignments produced by multiple template search tools. Integrated alignments are generated by random sampling, and the final prediction model is selected based on model quality assessment scores and the joint probability of the profile.
We evaluated the method on template-based modeling targets from CASP11 and compared it with several major existing alignment algorithms. The results showed that the proposed method can improve the model accuracy of single-template modeling.
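
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes that each alignment is a list of (query index, template index) residue pairs, that the profile is the fraction of input alignments containing each pair, that an integrated alignment is formed by splicing two alignments at a random query position, and that candidates are ranked by a length-normalised log joint probability under the profile. All function names are hypothetical, and the model quality assessment step mentioned in the abstract is left out.

```python
import math
import random
from collections import Counter

# Assumption: an alignment is a list of (query_index, template_index) pairs
# for aligned residues, strictly increasing in both coordinates.
# Gapped positions are simply absent from the list.


def build_profile(alignments):
    """Return the fraction of input alignments that contain each residue pair."""
    counts = Counter(pair for aln in alignments for pair in set(aln))
    return {pair: n / len(alignments) for pair, n in counts.items()}


def is_monotonic(aln):
    """True if both coordinates strictly increase along the alignment."""
    return all(a[0] < b[0] and a[1] < b[1] for a, b in zip(aln, aln[1:]))


def sample_chimera(alignments, query_length, rng):
    """Splice the head of one alignment onto the tail of another.

    The cut is a random query position. Returns None when the spliced
    alignment is not a valid (monotonic) alignment.
    """
    first, second = rng.sample(alignments, 2)
    cut = rng.randrange(1, query_length)
    head = [p for p in first if p[0] < cut]
    tail = [p for p in second if p[0] >= cut]
    chimera = head + tail
    return chimera if is_monotonic(chimera) else None


def mean_log_probability(aln, profile, floor=1e-6):
    """Length-normalised log joint probability of an alignment under the profile."""
    if not aln:
        return float("-inf")
    logs = (math.log(max(profile.get(pair, 0.0), floor)) for pair in aln)
    return sum(logs) / len(aln)


def best_chimera(alignments, query_length, n_samples=1000, seed=0):
    """Sample spliced alignments and keep the one the profile scores highest."""
    rng = random.Random(seed)
    profile = build_profile(alignments)
    candidates = []
    for _ in range(n_samples):
        chimera = sample_chimera(alignments, query_length, rng)
        if chimera:
            candidates.append(chimera)
    return max(candidates,
               key=lambda c: mean_log_probability(c, profile),
               default=None)
```

In a full pipeline, each sampled alignment would presumably be passed to a modeling tool and the resulting models ranked with a quality assessment score as well. The sketch ranks by the profile alone to stay self-contained.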

Cited By

  • (2023) Yapay Sinir Ağları Kullanılarak Protein Katlanması Tanıma (Protein Folding Recognition by Artificial Neural Networks). Bilişim Teknolojileri Dergisi 16:2 (95-105). DOI:10.17671/gazibtd.1141468. Online publication date: 30-Apr-2023
  • (2020) Protein Fold Recognition by Combining Support Vector Machines and Pairwise Sequence Similarity Scores. IEEE/ACM Transactions on Computational Biology and Bioinformatics 18:5 (2008-2016). DOI:10.1109/TCBB.2020.2966450. Online publication date: 13-Jan-2020

Published In

ICBBB '18: Proceedings of the 2018 8th International Conference on Bioscience, Biochemistry and Bioinformatics
January 2018
164 pages
ISBN:9781450353410
DOI:10.1145/3180382
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

In-Cooperation

  • RIED, Tokai University, Japan

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bioinformatics
  2. Hidden Markov models
  3. Protein structure prediction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICBBB 2018
