A Minimal Description Length Scheme for Polynomial Regression

Pečkov, Aleksandar; Džeroski, Sašo; Todorovski, Ljupčo

doi:10.1007/978-3-540-68125-0_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The paper addresses the task of polynomial regression, i.e., the task of inducing polynomials from numeric data that can be used to predict the value of a selected numeric variable. As in other learning tasks, we face the problem of finding an optimal trade-off between the complexity of the induced model and its predictive error. One of the approaches to finding this optimal trade-off is the minimal description length (MDL) principle. In this paper, we propose an MDL scheme for polynomial regression, which includes coding schemes for polynomials and the errors they make on data. We empirically compare this principled MDL scheme to an ad-hoc MDL scheme and show that it performs better. The improvements in performance are such that the polynomial regression approach we propose is now comparable in performance to other commonly used methods for regression, such as model trees.
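The trade-off described above can be illustrated with a minimal sketch. It is not the coding scheme proposed in the paper: it assumes a crude two-part code in which each coefficient costs (1/2) log2(n) bits and the errors cost (n/2) log2(RSS/n) bits, which is the familiar BIC-style approximation. It also handles only a single input variable, and the function names (`fit_polynomial`, `description_length`, `select_degree`) and the synthetic data are hypothetical.

```python
import math
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations; returns c[0..degree]."""
    m = degree + 1
    # Augmented normal-equation system [A^T A | A^T y] for the monomial basis.
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, m):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, m + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (rows[i][m] - tail) / rows[i][i]
    return coeffs


def description_length(xs, ys, degree):
    """Assumed two-part code length in bits: model cost plus error cost."""
    n = len(xs)
    coeffs = fit_polynomial(xs, ys, degree)
    rss = sum(
        (y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    rss = max(rss, 1e-12)  # guard against log of zero on a perfect fit
    model_bits = 0.5 * (degree + 1) * math.log2(n)
    error_bits = 0.5 * n * math.log2(rss / n)
    return model_bits + error_bits


def select_degree(xs, ys, max_degree=5):
    """Return the degree with the smallest code length, plus all scores."""
    scores = [description_length(xs, ys, d) for d in range(max_degree + 1)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Synthetic data (an assumption for illustration): a noisy quadratic.
rng = random.Random(0)
xs = [-1 + 2 * i / 199 for i in range(200)]
ys = [1 - 2 * x + 3 * x ** 2 + rng.gauss(0, 0.1) for x in xs]

best_degree, scores = select_degree(xs, ys)
print("selected degree:", best_degree)
```

Because the error term can be negative when the residuals are small, the scores are only meaningful relative to one another. Raising the degree always lowers the error cost, so the model cost is what stops the selection from drifting toward the largest polynomial.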

Author information

Authors and Affiliations

Jozef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Aleksandar Pečkov, Sašo Džeroski & Ljupčo Todorovski

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

About this paper

Cite this paper

Pečkov, A., Džeroski, S., Todorovski, L. (2008). A Minimal Description Length Scheme for Polynomial Regression. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_26

DOI: https://doi.org/10.1007/978-3-540-68125-0_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science, Computer Science (R0)
