Knowledge discovery in large spatial databases: Focusing techniques for efficient class identification

  • Spatial Data Mining
  • Conference paper
Advances in Spatial Databases (SSD 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 951)

Abstract

Both the number and the size of spatial databases are growing rapidly because of the large amount of data obtained from satellite images, X-ray crystallography and other scientific equipment. Automated knowledge discovery is therefore becoming increasingly important in spatial databases. So far, most methods for knowledge discovery in databases (KDD) have been based on relational database systems. In this paper, we address the task of class identification in spatial databases using clustering techniques. We put special emphasis on the integration of the discovery methods with the DB interface, which is crucial for the efficiency of KDD on large databases. The key to this integration is the use of a well-known spatial access method, the R*-tree. The focusing component of a KDD system determines which parts of the database are relevant for the knowledge discovery task. We present several strategies for focusing: selecting representatives from a spatial database, focusing on the relevant clusters, and retrieving all objects of a given cluster. We have applied the proposed techniques to real data from a large protein database used for predicting protein-protein docking. A performance evaluation on this database indicates that clustering on large spatial databases can be performed both efficiently and effectively.
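The first focusing strategy named above, selecting representatives, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes 2D points, uses square grid cells as a stand-in for the data pages of a spatial index such as the R*-tree, and picks the point nearest each page's centroid as that page's representative. All function names (`pages_by_grid`, `representative`, `select_representatives`, `assign_to_nearest`) and the `cell` parameter are hypothetical.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def pages_by_grid(points: Sequence[Point], cell: float) -> Dict[Tuple[int, int], List[Point]]:
    """Assumption: bucket points into square grid cells that play the
    role of index data pages (a real system would read R*-tree pages)."""
    pages: Dict[Tuple[int, int], List[Point]] = {}
    for p in points:
        key = (int(p[0] // cell), int(p[1] // cell))
        pages.setdefault(key, []).append(p)
    return pages


def representative(page: Sequence[Point]) -> Point:
    """Return the point of a page that lies closest to the page centroid."""
    cx = sum(p[0] for p in page) / len(page)
    cy = sum(p[1] for p in page) / len(page)
    return min(page, key=lambda p: dist(p, (cx, cy)))


def select_representatives(points: Sequence[Point], cell: float) -> List[Point]:
    """Focus on one representative per page instead of the whole database."""
    return [representative(page) for page in pages_by_grid(points, cell).values()]


def assign_to_nearest(points: Sequence[Point], reps: Sequence[Point]) -> Dict[Point, List[Point]]:
    """Map every point to its nearest representative, a simple way to
    carry a grouping found on the representatives back to all objects."""
    clusters: Dict[Point, List[Point]] = {r: [] for r in reps}
    for p in points:
        clusters[min(reps, key=lambda r: dist(p, r))].append(p)
    return clusters


if __name__ == "__main__":
    data = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.3), (5.1, 5.1), (5.2, 5.2), (5.9, 5.0)]
    reps = select_representatives(data, cell=1.0)
    print(reps)
    print(assign_to_nearest(data, reps))
```

A clustering algorithm would then run on the representatives only, which is where the efficiency gain comes from: the expensive step sees one object per page rather than every object in the database.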

This research was funded by the German Minister for Research and Technology (BMFT) under grant no. 01 IB 307 B. The authors are responsible for the contents of this paper.

Editor information

Max J. Egenhofer, John R. Herring

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ester, M., Kriegel, HP., Xu, X. (1995). Knowledge discovery in large spatial databases: Focusing techniques for efficient class identification. In: Egenhofer, M.J., Herring, J.R. (eds) Advances in Spatial Databases. SSD 1995. Lecture Notes in Computer Science, vol 951. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60159-7_5

  • DOI: https://doi.org/10.1007/3-540-60159-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60159-3

  • Online ISBN: 978-3-540-49536-9

  • eBook Packages: Springer Book Archive
