Using Domain Knowledge to Learn from Heterogeneous Distributed Databases

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3213)

Abstract

We are concerned with processing data held in distributed heterogeneous databases using domain knowledge, expressed as rules that represent high-level knowledge about the data. This processing facilitates the handling of missing, conflicting or unacceptable outlying data. In addition, by integrating the processed distributed data, we are able to extract new knowledge at a finer level of granularity than was present in the original data. Once integration has taken place, the extracted knowledge, in the form of probabilities, may be used to learn association rules or Bayesian belief networks. Issues of confidentiality and of efficient data transfer across networks, whether the Internet or intranets, are handled by aggregating the native data in situ, typically behind a firewall, and carrying out all further transportation and processing solely on multidimensional aggregate tables. Heterogeneity is resolved by using domain knowledge to harmonise and integrate the distributed data sources. Integration is carried out by minimising the Kullback-Leibler information divergence between the target integrated aggregates and the distributed data values.
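To make the integration step more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each source reports counts only for coarse cells, where each cell is a set of fine-grained categories, and it estimates the fine-grained probabilities with an EM-style multiplicative update. Under these assumptions, maximising the multinomial likelihood is equivalent to minimising a count-weighted sum of Kullback-Leibler divergences between each source's observed proportions and the proportions implied by the fine-grained estimate. The function name `integrate_aggregates`, the membership matrix `A` and the example counts are all hypothetical.

```python
import numpy as np

def integrate_aggregates(A, counts, iters=500, tol=1e-12):
    """Estimate fine-grained probabilities from coarse aggregate counts.

    A[r, j] is 1 if fine category j belongs to coarse cell r, else 0
    (hypothetical encoding of the harmonised classification schemes).
    counts[r] is the aggregate count a source reported for cell r.
    """
    A = np.asarray(A, dtype=float)
    counts = np.asarray(counts, dtype=float)
    n_fine = A.shape[1]
    p = np.full(n_fine, 1.0 / n_fine)   # start from a uniform estimate
    total = counts.sum()
    for _ in range(iters):
        cell_prob = A @ p               # implied probability of each coarse cell
        # Share each cell's count among its fine categories in proportion
        # to the current estimate, then renormalise by the total count.
        p_new = p * (A.T @ (counts / cell_prob)) / total
        converged = np.abs(p_new - p).max() < tol
        p = p_new
        if converged:
            break
    return p

# Two hypothetical sources over three fine categories (0, 1, 2):
# source 1 reports cells {0} and {1, 2}; source 2 reports {0, 1} and {2}.
A = [[1, 0, 0],
     [0, 1, 1],
     [1, 1, 0],
     [0, 0, 1]]
counts = [30, 70, 60, 40]
p = integrate_aggregates(A, counts)
print(p)  # approximately [0.3, 0.3, 0.4]
```

Neither source alone separates all three categories, yet the combined aggregates determine them, which illustrates how integration can yield knowledge at a finer granularity than any single source holds. Only the aggregate counts are needed, so the native records never leave their source.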

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McClean, S., Scotney, B., Shapcott, M. (2004). Using Domain Knowledge to Learn from Heterogeneous Distributed Databases. In: Negoita, M.G., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2004. Lecture Notes in Computer Science, vol 3213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30132-5_28

  • DOI: https://doi.org/10.1007/978-3-540-30132-5_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23318-3

  • Online ISBN: 978-3-540-30132-5

  • eBook Packages: Springer Book Archive
