Abstract
Existing meta-learning-based distributed data mining approaches do not explicitly address context heterogeneity across individual sites. This limitation restricts their applicability when distributed data are not independent and identically distributed. By modeling heterogeneously distributed data with hierarchical models, this paper extends traditional meta-learning techniques so that they can be applied in distributed scenarios with context heterogeneity.
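To make the notion of context heterogeneity concrete, the following is a minimal sketch and not the method of the paper. It assumes a simple two-level (random-intercept) setting in which every site shares one slope but has its own offset. The site offsets, sample sizes, noise level, and all function names are hypothetical choices made for illustration only.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def make_site(rng, offset, slope=2.0, n=200, noise=0.5):
    """Hypothetical site data: y = offset + slope*x + Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [offset + slope * x + rng.gauss(0, noise) for x in xs]
    return xs, ys


def mse(model, xs, ys):
    """Mean squared error of the line (a, b) on one site's data."""
    a, b = model
    return statistics.fmean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])


rng = random.Random(0)
site_offsets = [-5.0, 0.0, 5.0]  # assumed context effect of each site
sites = [make_site(rng, u) for u in site_offsets]

# Each site learns its own base model from local data only.
local_models = [fit_line(xs, ys) for xs, ys in sites]

# Context-blind combiner: average the coefficients of the local models.
global_model = (
    statistics.fmean([a for a, _ in local_models]),
    statistics.fmean([b for _, b in local_models]),
)

# Two-level view: one shared slope, intercepts vary between sites.
shared_slope = global_model[1]
between_site_var = statistics.variance([a for a, _ in local_models])
hier_models = [(a, shared_slope) for a, _ in local_models]

blind_err = [mse(global_model, xs, ys) for xs, ys in sites]
hier_err = [mse(m, xs, ys) for m, (xs, ys) in zip(hier_models, sites)]

for j, (b_err, h_err) in enumerate(zip(blind_err, hier_err)):
    print(f"site {j}: context-blind MSE={b_err:.2f}, site-aware MSE={h_err:.2f}")
print(f"between-site intercept variance: {between_site_var:.2f}")
```

Under these assumptions, the context-blind combiner fits the middle site well but misses the two shifted sites by roughly their offset, while the site-aware models keep the shared slope and only let the intercept vary. The large between-site intercept variance is the signal that the sites are not identically distributed.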
The support of the Informatics Research Initiative of Enterprise Ireland is gratefully acknowledged.
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xing, Y., Madden, M.G., Duggan, J., Lyons, G.J. (2003). Distributed Regression for Heterogeneous Data Sets. In: Berthold, M.R., Lenz, H.-J., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_50
DOI: https://doi.org/10.1007/978-3-540-45231-7_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40813-0
Online ISBN: 978-3-540-45231-7
eBook Packages: Springer Book Archive