Machine Learning for Multi-class Protein Fold Classification Based on Neural Networks with Feature Gating

  • Conference paper
Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003 (ICANN 2003, ICONIP 2003)

Abstract

The success of a classification system depends heavily on two things: the tools used and the features considered. In bioinformatics applications, the role of appropriate features has not received adequate attention. In this investigation we use two novel ideas. First, we use neural networks in which each input node is associated with a gate. At the beginning of training all gates are almost closed, i.e., no feature is allowed to enter the network. During training, gates are opened or closed as the task requires. At the end of training, gates corresponding to good features are completely open, gates corresponding to bad features are closed more tightly, and some gates may remain partially open. Thus the network not only selects features online as learning proceeds but also performs some feature extraction. The second novel idea is a hierarchical machine learning architecture. At the first level, a network classifies the data into four major fold classes: all alpha, all beta, alpha + beta, and alpha / beta. At the next level, another set of networks further classifies the data into twenty-seven folds. This approach yields several benefits. The gating network is found to reduce the number of features drastically: at the first level, neural classifiers using just 50 features selected by the gating network achieve test accuracy comparable to that obtained with 125 features. The process also gives better insight into the folding process; for example, by tracking the evolution of the gates we can find which characteristics (features) of the data are more important for folding. Using fewer features also reduces computation time. Finally, the hierarchical architecture further improves performance.
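The gating idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the gate function, its initialisation, or the selection rule, so everything named here is an assumption made for illustration: a sigmoid gate, an initial gate parameter of -5 (so every gate starts almost closed), the class name `GatedInputLayer`, and a threshold of 0.5 for reading off the selected features. It is not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, used here as an assumed gate function."""
    return 1.0 / (1.0 + np.exp(-z))


class GatedInputLayer:
    """Hypothetical input layer with one learnable gate per feature.

    Each feature x_i is multiplied by sigmoid(g_i) before it enters the
    rest of the network. A large negative g_i keeps the gate almost closed,
    a large positive g_i opens it almost completely.
    """

    def __init__(self, n_features, init_gate=-5.0):
        # Assumption: all gates start almost closed (sigmoid(-5) is about 0.007).
        self.g = np.full(n_features, init_gate, dtype=float)

    def gate_values(self):
        """Current degree of opening of every gate, each in (0, 1)."""
        return sigmoid(self.g)

    def forward(self, x):
        """Attenuate each column (feature) of x by its gate value."""
        return x * self.gate_values()

    def backward(self, x, grad_out):
        """Gradient of the loss with respect to the gate parameters g.

        grad_out is the gradient of the loss with respect to forward(x);
        the result is summed over the batch dimension.
        """
        s = self.gate_values()
        return np.sum(grad_out * x * s * (1.0 - s), axis=0)

    def selected(self, threshold=0.5):
        """Indices of features whose gate is more open than the threshold."""
        return np.flatnonzero(self.gate_values() > threshold)


layer = GatedInputLayer(n_features=4)
x = np.ones((2, 4))
gated = layer.forward(x)  # every entry is close to zero at the start
```

In a full network, the gate parameters would be updated by gradient descent together with the weights, and tracking `gate_values()` over training epochs is one way to observe which features the network comes to rely on.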

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, CD., Chung, IF., Pal, N.R., Lin, CT. (2003). Machine Learning for Multi-class Protein Fold Classification Based on Neural Networks with Feature Gating. In: Kaynak, O., Alpaydin, E., Oja, E., Xu, L. (eds) Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003. ICANN 2003, ICONIP 2003. Lecture Notes in Computer Science, vol 2714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44989-2_139

  • DOI: https://doi.org/10.1007/3-540-44989-2_139

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40408-8

  • Online ISBN: 978-3-540-44989-8

  • eBook Packages: Springer Book Archive
