ABSTRACT
Motivation: Scaffold proteins are crucial regulators of diverse cellular functions: they assemble multiple proteins involved in signaling and metabolic pathways. Identifying scaffold proteins and studying their molecular mechanisms can open a new perspective on systemic cellular regulation and its applications in medicine and engineering. The regulatory roles of dozens of scaffold proteins have been highlighted, but so far only one computational approach has attempted to find scaffold proteins from interactomes. That approach was limited in finding diverse types of scaffold proteins because its criteria were restricted to classical scaffold proteins.
Methods: Here we propose a systematic approach to predict scaffold proteins from interactomes on a large scale and to characterize their roles comprehensively. We gathered known and predicted scaffold proteins from articles and interactomes. First, we extracted known scaffold proteins from review articles and collected scaffold protein candidates from UniProt and PubMed using query searches. Second, we proposed criteria for identifying scaffold proteins and predicted scaffold proteins from the interactome according to these criteria. Scaffold proteins are defined as proteins that: (i) directly interact with at least two proteins, (ii) have domain-domain interactions with two partner proteins through different domain regions, and (iii) are components of the same protein complex as two partner proteins. To characterize functional roles, we assigned the localization, pathway, and enzyme information of the partner proteins to each scaffold protein and classified its functional roles based on its partners' functions.
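The three criteria above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `scaffold_evidence`, the input layouts (`partners`, `domain_contacts`, `complexes`), and the toy proteins and domain labels are all hypothetical, and the reading of "different domain regions" as "at least one pair of distinct domain identifiers" is an assumption.

```python
from itertools import combinations


def scaffold_evidence(protein, partners, domain_contacts, complexes):
    """Report which of the three abstract-level criteria a protein meets.

    Hypothetical input layouts (assumptions, not from the paper):
      partners:        dict protein -> set of direct interaction partners
      domain_contacts: dict (protein, partner) -> set of domain IDs on
                       `protein` that mediate binding to `partner`
      complexes:       list of sets of proteins, one set per complex
    """
    nbrs = partners.get(protein, set())

    # (i) direct interaction with at least two proteins
    direct = len(nbrs) >= 2

    domain = False
    in_complex = False
    for a, b in combinations(sorted(nbrs), 2):
        doms_a = domain_contacts.get((protein, a), set())
        doms_b = domain_contacts.get((protein, b), set())
        # (ii) the two partners are bound through different domains
        # (assumed reading: some domain for a differs from some domain for b)
        if any(x != y for x in doms_a for y in doms_b):
            domain = True
        # (iii) the protein and both partners share one complex
        if any({protein, a, b} <= members for members in complexes):
            in_complex = True

    return {"direct": direct, "domain": domain, "complex": in_complex}


# Toy data, purely illustrative.
partners = {
    "S": {"A", "B", "C"},
    "T": {"A", "C"},
    "A": {"S", "T"},
    "B": {"S"},
    "C": {"S", "T"},
}
domain_contacts = {
    ("S", "A"): {"PDZ"},
    ("S", "B"): {"SH3"},
    ("S", "C"): {"PDZ"},
    ("T", "A"): {"PDZ"},
    ("T", "C"): {"PDZ"},
}
complexes = [{"S", "A", "B"}]

evidence_s = scaffold_evidence("S", partners, domain_contacts, complexes)
evidence_t = scaffold_evidence("T", partners, domain_contacts, complexes)
evidence_b = scaffold_evidence("B", partners, domain_contacts, complexes)
```

In this toy network, `S` meets all three criteria, `T` meets only the direct-interaction criterion because both of its partners bind the same domain and no complex contains it, and `B` meets none. Returning the criteria separately rather than a single label leaves the mapping onto the three candidate classes open, since the abstract does not spell it out.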
Results: From a total of 10,419 basic scaffold protein candidates in protein interactomes, we classified candidates into three classes according to structural evidence for scaffolding, such as domain architectures, domain interactions, and protein complexes. We then defined 2,716 highly reliable scaffold protein candidates and characterized their functional features. To assess the accuracy of our prediction, we constructed gold standard data sets of 158 positive and 844 negative entries based on functional information from the Gene Ontology Consortium. The precision, sensitivity, and specificity of our test were 83.8%, 42.4%, and 90.1%, respectively. Function enrichment analysis of the highly reliable scaffold proteins confirmed significantly enriched binding-related functions and revealed unexpected functions such as transcription regulator activity, kinase activity, and ubiquitin protein ligase binding. Furthermore, we identified functional associations between scaffold proteins and their recruited proteins, and from these we inferred novel functional information about scaffold proteins. We also compared the disease association of scaffold proteins with that of kinases; scaffold proteins showed a higher disease association than kinases. As future studies reveal more about the roles of scaffold proteins, scaffolds could be used to generate novel and predictable pathways that program useful cellular behaviors. In conclusion, our findings can support further investigation to understand scaffold proteins and to discover targets for molecular engineering and therapy.
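For readers less familiar with the three evaluation measures, the following Python sketch shows their standard confusion-matrix definitions. The function name `classification_metrics` and the counts passed to it are hypothetical illustrations; they are not the confusion matrix behind the figures reported above, which the abstract does not provide.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, sensitivity, and specificity from raw counts.

    tp/fp: predicted scaffolds that are gold standard positives/negatives
    tn/fn: predicted non-scaffolds that are gold standard negatives/positives
    A measure is returned as None when its denominator is zero.
    """
    def ratio(num, den):
        return num / den if den else None

    precision = ratio(tp, tp + fp)    # share of predictions that are correct
    sensitivity = ratio(tp, tp + fn)  # share of true scaffolds recovered
    specificity = ratio(tn, tn + fp)  # share of non-scaffolds rejected
    return precision, sensitivity, specificity


# Hypothetical counts chosen only to make the arithmetic easy to follow.
precision, sensitivity, specificity = classification_metrics(
    tp=8, fp=2, tn=18, fn=12
)
```

A profile with high precision and specificity but moderate sensitivity, as in this toy example, describes a conservative predictor: most of its calls are correct, but it misses a sizable share of true positives.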
Index Terms
- Prediction of Scaffold Proteins based on Protein Interaction and Domain Architectures