Heuristic Search for Model Structure: the Benefits of Restraining Greed

Part of the book series: Lecture Notes in Statistics (LNS, volume 112)

Abstract

Inductive modeling or “machine learning” algorithms are able to discover structure in high-dimensional data in a nearly automated fashion. These adaptive statistical methods — including decision trees, polynomial networks, projection pursuit models, and additive networks — repeatedly search for, and add on, the model component judged best at that state. Because of the huge model space of possible components, the choice is typically greedy; that is, optimal only in the very short term. In fact, it is usual for the analyst and algorithm to be greedy at three levels: when choosing a 1) term within a model, 2) model within a family, and 3) family within a wide collection of methods. It is better, we argue, to “take a longer view” in each stage. For the first stage (term selection) examples are presented for classification using decision trees and estimation using regression. To improve the third stage (method selection) we propose fusing information from disparate models to make a combined model more robust. (Fused models merge their output estimates but also share information on, for example, variables to employ and cases to ignore.) Benefits of fusing are demonstrated on a challenging classification dataset, where the task is to infer the species of a bat from its chirps.
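
The first-stage point about greedy term selection can be illustrated with a small sketch. It is not taken from the chapter: the dataset, the names `x1`, `x2`, `x3`, and the helpers `rss`, `greedy_forward`, and `best_subset` are all assumptions made up for illustration. It compares stepwise forward selection, which adds the single best term at each step, with an exhaustive search over all pairs, on data built so that the best single regressor does not belong to the best pair.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def rss(columns, y):
    """Residual sum of squares of y after a least-squares fit on columns
    (no intercept), computed by Gram-Schmidt orthogonalization."""
    basis = []
    for col in columns:
        v = list(col)
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        if dot(v, v) > 1e-12:  # skip columns already spanned by the basis
            basis.append(v)
    residual = list(y)
    for b in basis:
        coef = dot(residual, b) / dot(b, b)
        residual = [r - coef * bi for r, bi in zip(residual, b)]
    return dot(residual, residual)


def greedy_forward(features, y, k):
    """Add, one at a time, the term that most reduces RSS right now."""
    chosen = []
    for _ in range(k):
        remaining = [name for name in features if name not in chosen]
        best = min(
            remaining,
            key=lambda name: rss([features[n] for n in chosen + [name]], y),
        )
        chosen.append(best)
    return chosen


def best_subset(features, y, k):
    """Take the longer view: score every k-term subset and keep the best."""
    return min(
        combinations(features, k),
        key=lambda names: rss([features[n] for n in names], y),
    )


# Hypothetical data: three mutually orthogonal patterns.
z = [1, -1, 1, -1, 1, -1, 1, -1]      # the signal
e = [3, 3, -3, -3, 3, 3, -3, -3]      # a large nuisance component
n = [1, 1, 1, 1, -1, -1, -1, -1]      # a smaller nuisance component

features = {
    "x1": [a + b for a, b in zip(z, e)],  # signal buried in the large nuisance
    "x2": e,                              # the large nuisance alone
    "x3": [a + b for a, b in zip(z, n)],  # signal plus the smaller nuisance
}
y = z  # the target equals x1 - x2 exactly

greedy = greedy_forward(features, y, 2)
best = best_subset(features, y, 2)

print("greedy pair:", greedy, "RSS =", rss([features[f] for f in greedy], y))
print("best pair:  ", list(best), "RSS =", rss([features[f] for f in best], y))
```

On this constructed data the greedy search starts with `x3`, the best single term, and then adds `x1`, leaving a sizeable residual. The exhaustive search returns `x1` and `x2`, which together reproduce the target exactly even though `x2` is useless on its own. Exhaustive search is affordable here only because there are three candidate terms; in the large model spaces the abstract describes, a wider but still limited search would be the practical compromise.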

Copyright information

© 1996 Springer-Verlag New York, Inc.

About this chapter

Cite this chapter

Elder, J.F. (1996). Heuristic Search for Model Structure: the Benefits of Restraining Greed. In: Fisher, D., Lenz, HJ. (eds) Learning from Data. Lecture Notes in Statistics, vol 112. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2404-4_13

  • DOI: https://doi.org/10.1007/978-1-4612-2404-4_13

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-94736-5

  • Online ISBN: 978-1-4612-2404-4

  • eBook Packages: Springer Book Archive
