Distributed Regression for Heterogeneous Data Sets

  • Conference paper
Advances in Intelligent Data Analysis V (IDA 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2810)

Abstract

Existing meta-learning-based distributed data mining approaches do not explicitly address context heterogeneity across individual sites. This limitation restricts their applicability when distributed data are not independent and identically distributed. By modeling heterogeneously distributed data with hierarchical models, this paper extends traditional meta-learning techniques so that they can be used successfully in distributed scenarios with context heterogeneity.
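The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple random-intercept hierarchical model, y = intercept + u_site + slope * x + noise, in which each site fits a local regressor and a combiner averages the local fits while keeping a per-site context effect. All function names, parameter values, and the simulated data are hypothetical.

```python
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def simulate_site(rng, offset, n=200, intercept=1.0, slope=2.0, noise=0.1):
    """Hypothetical site data: a shared trend plus a site-specific offset."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [intercept + offset + slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def combine(local_fits):
    """Combine local (intercept, slope) pairs into a two-level model.

    Assumption: only the intercept varies across sites, so slopes are
    averaged and each site's deviation from the mean intercept is kept
    as its context effect.
    """
    intercepts = [a for a, _ in local_fits]
    slopes = [b for _, b in local_fits]
    global_intercept = mean(intercepts)
    return {
        "intercept": global_intercept,
        "slope": mean(slopes),
        "site_effects": [a - global_intercept for a in intercepts],
        # Crude between-site variance; it also absorbs local estimation error.
        "between_site_var": variance(intercepts),
    }


def predict(model, x, site=None):
    """Predict for a known site, or fall back to the global model."""
    effect = 0.0 if site is None else model["site_effects"][site]
    return model["intercept"] + effect + model["slope"] * x


rng = random.Random(0)
offsets = [-1.0, 0.0, 1.0]  # hypothetical context differences between sites
sites = [simulate_site(rng, u) for u in offsets]
model = combine([fit_line(xs, ys) for xs, ys in sites])
```

Calling `predict(model, 0.5, site=2)` uses the third site's estimated context effect, whereas `predict(model, 0.5)` returns the context-free global prediction. Only the fitted coefficients leave each site, which is the usual motivation for combining local models rather than pooling raw data.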

The support of the Informatics Research Initiative of Enterprise Ireland is gratefully acknowledged.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xing, Y., Madden, M.G., Duggan, J., Lyons, G.J. (2003). Distributed Regression for Heterogeneous Data Sets. In: Berthold, M.R., Lenz, H.-J., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_50

  • DOI: https://doi.org/10.1007/978-3-540-45231-7_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40813-0

  • Online ISBN: 978-3-540-45231-7

  • eBook Packages: Springer Book Archive
