Abstract:
Methods for estimating the accuracy of structure prediction models (EMA) are crucial in modern protein structure prediction pipelines. State-of-the-art EMA methods use Support Vector Machines as an inference engine. Convolutional Neural Networks (CNNs) are widely used in pattern recognition tasks such as image classification and speech recognition. We approach the EMA problem as a classification task and train CNNs to estimate GDT_TS and lDDT class ranges from the secondary structure and relative solvent exposure consensus, represented as one-dimensional information, using several datasets built from CASP assessment data. Our results show that CNN models can achieve accuracies near 80.0% when classifying protein structures of the same sequence, and accuracies near 30.0% for structures of different sequences. This potentially indicates a data scarcity problem and a lack of transferability of the consensus information. However, the results strongly suggest the applicability of CNNs to the EMA problem.
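To make the classification framing concrete, the following is a minimal sketch of how a continuous quality score could be turned into a class range and how per-residue labels could be laid out as one-dimensional input channels. The abstract does not state the number of classes, the bin edges, or the feature encoding, so `n_classes`, the equal-width binning, the `SS_STATES` and `RSA_STATES` alphabets, and both function names are illustrative assumptions rather than the authors' setup.

```python
# Hypothetical sketch: the class count, equal-width bins, and label
# alphabets are assumptions, not values taken from the paper.

SS_STATES = "HEC"   # assumed 3-state secondary structure: helix, strand, coil
RSA_STATES = "BE"   # assumed 2-state solvent exposure: buried, exposed


def score_to_class(score, n_classes=5, max_score=100.0):
    """Map a GDT_TS-like score in [0, max_score] to an equal-width class index."""
    if not 0.0 <= score <= max_score:
        raise ValueError("score out of range")
    width = max_score / n_classes
    # A score equal to max_score is assigned to the last class.
    return min(int(score // width), n_classes - 1)


def encode_residues(ss, rsa):
    """One-hot encode per-residue labels as a list of rows, one row per residue."""
    if len(ss) != len(rsa):
        raise ValueError("label strings must have equal length")
    n_channels = len(SS_STATES) + len(RSA_STATES)
    rows = []
    for s, r in zip(ss, rsa):
        row = [0] * n_channels
        row[SS_STATES.index(s)] = 1                     # secondary structure channel
        row[len(SS_STATES) + RSA_STATES.index(r)] = 1   # solvent exposure channel
        rows.append(row)
    return rows
```

Under these assumptions, `score_to_class(45.0)` would place a model with a GDT_TS of 45 in the third of five classes, and the rows returned by `encode_residues` could serve as the sequence-by-channel input to a one-dimensional convolution.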
Date of Conference: 08-13 July 2018
Date Added to IEEE Xplore: 14 October 2018
ISBN Information:
Electronic ISSN: 2161-4407