The Greedy Prepend Algorithm for Decision List Induction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4263)

Abstract

We describe a new decision list induction algorithm called the Greedy Prepend Algorithm (GPA). GPA improves on other decision list algorithms by introducing a new objective function for rule selection and a set of novel search algorithms that allow application to large-scale real-world problems. GPA achieves state-of-the-art classification accuracy on the protein secondary structure prediction problem in bioinformatics and the English part-of-speech tagging problem in computational linguistics. For both domains GPA produces a rule set that human experts find easy to interpret, a marked advantage in decision support environments. In addition, we compare GPA to other decision list induction algorithms as well as support vector machines, C4.5, naive Bayes, and a nearest neighbor method on a number of standard data sets from the UCI machine learning repository.
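
The abstract does not spell out the objective function or the search algorithms, so the following is only a minimal Python sketch of the general greedy-prepend idea, not the authors' implementation. It assumes a decision list is an ordered list of (conditions, class) rules ending in a default rule, that the score of a candidate rule is the net change in correctly classified training examples when the rule is placed at the front of the list, and that candidates are found by exhaustive enumeration of short attribute-value conjunctions. The names rule_matches, predict, gain, candidate_rules, greedy_prepend and the parameter max_conditions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def rule_matches(conditions, x):
    """conditions is a dict {attribute_index: required_value}; {} matches everything."""
    return all(x[i] == v for i, v in conditions.items())


def predict(decision_list, x):
    """Return the class of the first rule whose conditions match x."""
    for conditions, label in decision_list:
        if rule_matches(conditions, x):
            return label
    raise ValueError("decision list has no default rule")


def gain(conditions, label, decision_list, X, y):
    """Assumed objective: net change in correct training predictions
    if (conditions, label) were prepended to decision_list."""
    delta = 0
    for x, target in zip(X, y):
        if rule_matches(conditions, x):
            old = predict(decision_list, x)
            delta += (label == target) - (old == target)
    return delta


def candidate_rules(X, y, max_conditions=2):
    """Exhaustively enumerate rules with up to max_conditions attribute tests
    that occur in the data (a stand-in for a real search procedure)."""
    classes = sorted(set(y))
    seen = set()
    for x in X:
        for k in range(1, max_conditions + 1):
            for attrs in combinations(range(len(x)), k):
                key = tuple((i, x[i]) for i in attrs)
                if key not in seen:
                    seen.add(key)
                    for label in classes:
                        yield dict(key), label


def greedy_prepend(X, y, max_conditions=2):
    """Start from a majority-class default rule and keep prepending the
    candidate with the largest positive gain until none improves accuracy."""
    default = Counter(y).most_common(1)[0][0]
    decision_list = [({}, default)]
    while True:
        best, best_gain = None, 0
        for conditions, label in candidate_rules(X, y, max_conditions):
            g = gain(conditions, label, decision_list, X, y)
            if g > best_gain:
                best, best_gain = (conditions, label), g
        if best is None:
            return decision_list
        decision_list.insert(0, best)
```

Calling greedy_prepend(X, y) on a list of attribute tuples X and a list of class labels y returns the rule list, which predict(decision_list, x) then reads top to bottom. Because each prepended rule must raise the number of correct training predictions by at least one, the loop stops after at most len(X) iterations. The exhaustive enumeration is only practical for small data sets; the scalable search the abstract refers to is not reproduced here.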

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yuret, D., de la Maza, M. (2006). The Greedy Prepend Algorithm for Decision List Induction. In: Levi, A., Savaş, E., Yenigün, H., Balcısoy, S., Saygın, Y. (eds) Computer and Information Sciences – ISCIS 2006. ISCIS 2006. Lecture Notes in Computer Science, vol 4263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11902140_6

  • DOI: https://doi.org/10.1007/11902140_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-47242-1

  • Online ISBN: 978-3-540-47243-8

  • eBook Packages: Computer Science, Computer Science (R0)
