Abstract
Existing ensemble-based distributed data mining techniques assume that the contributing data are independent and identically distributed. However, this assumption does not hold in many virtual organization contexts, where contextual heterogeneity exists. Focusing on regression tasks, this paper proposes a context-based meta-learning technique for horizontally partitioned data with contextual heterogeneity. The predictive performance of the new approach is evaluated against state-of-the-art techniques on both simulated and real-world data sets.
The support of the Informatics Research Initiative of Enterprise Ireland is gratefully acknowledged.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xing, Y., Madden, M.G., Duggan, J., Lyons, G.J. (2005). Context-Sensitive Regression Analysis for Distributed Data. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_35
DOI: https://doi.org/10.1007/11527503_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science, Computer Science (R0)