Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3177))

Abstract

Poly-transformation extends the idea of ensemble learning to the transformation step of Knowledge Discovery in Databases (KDD). In poly-transformation, multiple transformations of the data are made before learning (data mining) is applied. The theoretical basis for poly-transformation is the same as that for other combining methods: different predictors are used to remove uncorrelated errors. It is not possible to demonstrate the utility of poly-transformation using standard datasets, because no pre-transformed data exist for such datasets. We therefore demonstrate its utility by applying it to a single well-known hard problem for which we have expertise: predicting protein secondary structure from primary structure. We applied four different transformations of the data, each of which was justifiable by biological background knowledge. We then applied four different learning methods (linear discrimination, back-propagation, C5.0, and learning vector quantization) both to each of the four transformations and to the task of combining the predictions from the different transformations to form the poly-transformation predictions. Each of the learning methods produced significantly higher accuracy with poly-transformation than with only a single transformation. Poly-transformation is the basis of the secondary structure prediction method Prof, which is one of the most accurate existing methods for this problem.
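The general scheme described in the abstract (transform the data several ways, learn on each transformation, then combine the per-transformation predictions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three toy transformations (`identity`, `squares`, `pair_sums`), the nearest-centroid learner, and the use of a simple majority vote as the combiner. The paper itself uses four biologically motivated transformations and applies learning methods to the combination step; a majority vote is swapped in here only to keep the example short.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Transform = Callable[[Sequence[float]], Vector]


# Hypothetical transformations of a raw feature vector (illustrative only).
def identity(x: Sequence[float]) -> Vector:
    return list(x)


def squares(x: Sequence[float]) -> Vector:
    return [v * v for v in x]


def pair_sums(x: Sequence[float]) -> Vector:
    return [x[i] + x[i + 1] for i in range(len(x) - 1)]


class NearestCentroid:
    """Toy learner: predicts the class whose mean vector is closest."""

    def fit(self, X: Sequence[Vector], y: Sequence[str]) -> "NearestCentroid":
        counts = Counter(y)
        sums: Dict[str, Vector] = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, v in enumerate(row):
                acc[i] += v
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X: Sequence[Vector]) -> List[str]:
        def dist2(a: Vector, b: Vector) -> float:
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda c: dist2(row, self.centroids[c]))
            for row in X
        ]


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine one list of labels per transformation into a single list."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def poly_transform_predict(
    transforms: Sequence[Transform],
    X_train: Sequence[Vector],
    y_train: Sequence[str],
    X_test: Sequence[Vector],
) -> List[str]:
    """Train one model per transformation, then combine their predictions."""
    per_transform = []
    for t in transforms:
        model = NearestCentroid().fit([t(x) for x in X_train], y_train)
        per_transform.append(model.predict([t(x) for x in X_test]))
    return majority_vote(per_transform)


# Tiny made-up dataset with two classes, labelled "C" and "H".
X_train = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.1], [1.0, 0.9, 1.1], [0.9, 1.0, 1.0]]
y_train = ["C", "C", "H", "H"]
X_test = [[0.05, 0.05, 0.1], [1.0, 1.0, 1.0]]

combined = poly_transform_predict(
    [identity, squares, pair_sums], X_train, y_train, X_test
)
print(combined)
```

An odd number of transformations is used so that a two-class vote cannot tie. The sketch shows only the mechanics of the combination; it makes no claim about accuracy gains, which in the paper are reported for the protein secondary structure task.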

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

King, R.D., Ouali, M. (2004). Poly-transformation. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_15

  • DOI: https://doi.org/10.1007/978-3-540-28651-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22881-3

  • Online ISBN: 978-3-540-28651-6

  • eBook Packages: Springer Book Archive
