Abstract
This paper describes a prediction problem based on direct marketing data from the Nationwide Products and Services Questionnaire (NPSQ) prepared by the Polish division of Acxiom Corporation. The problem we analyze is the prediction of access to the Internet. The unit of analysis is a group of individuals in a certain age category living in a certain building located in Poland. We used several machine learning methods to build our prediction models. In particular, we applied ensembles of weak learners and the ModLEM algorithm, which is based on the rough set approach. The paper includes a comparison of the results generated by these methods. We also report some of the problems that we encountered during the analysis.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Błaszczyński, J., Dembczyński, K., Kotłowski, W., Pawłowski, M. (2006). Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_21
DOI: https://doi.org/10.1007/11823728_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)