
A Systematic Database Summary Generation Using the Distributed Query Discovery System

  • Conference paper
Computational Science and Its Applications – ICCSA 2004 (ICCSA 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3046)


Abstract

This paper introduces an approach to generating a database summary systematically using the distributed query discovery system MASSON. The approach first creates an object-view, then partitions the database, based on that object-view, into clusters with similar properties, and finally generates a summary for each cluster. For this purpose, we propose a data set representation framework and introduce a suitable similarity measure framework. The paper also describes the techniques used to generalize the primitive summary descriptions generated by MASSON and to improve the performance of the system using clustered computers and CORBA.
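The partition-then-summarize pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not MASSON and not the authors' similarity framework: the Gower-style similarity, the threshold-based leader clustering, the range/value-set summaries, the function names, and the sample records are all assumptions chosen for illustration, and the object-view is simplified to a flat record per object.

```python
from typing import Dict, List, Union

Value = Union[int, float, str]
Record = Dict[str, Value]


def numeric_ranges(records: List[Record]) -> Dict[str, float]:
    """Range (max - min) of every numeric attribute, used for scaling."""
    ranges = {}
    for key, value in records[0].items():
        if not isinstance(value, str):
            column = [r[key] for r in records]
            ranges[key] = max(column) - min(column)
    return ranges


def similarity(a: Record, b: Record, ranges: Dict[str, float]) -> float:
    """Gower-style similarity in [0, 1], averaged over all attributes."""
    scores = []
    for key in a:
        if isinstance(a[key], str):
            # Categorical attribute: exact match or no match.
            scores.append(1.0 if a[key] == b[key] else 0.0)
        else:
            # Numeric attribute: distance scaled by the attribute range.
            span = ranges[key] or 1.0
            scores.append(1.0 - abs(a[key] - b[key]) / span)
    return sum(scores) / len(scores)


def partition(records: List[Record], threshold: float) -> List[List[Record]]:
    """Leader clustering: a record joins the first cluster whose leader
    (first member) is at least `threshold` similar, else starts a new one."""
    ranges = numeric_ranges(records)
    clusters: List[List[Record]] = []
    for record in records:
        for cluster in clusters:
            if similarity(record, cluster[0], ranges) >= threshold:
                cluster.append(record)
                break
        else:
            clusters.append([record])
    return clusters


def summarize(cluster: List[Record]) -> Dict[str, object]:
    """Primitive summary: (min, max) for numeric attributes,
    sorted distinct values for categorical ones."""
    summary = {}
    for key, value in cluster[0].items():
        column = [r[key] for r in cluster]
        if isinstance(value, str):
            summary[key] = sorted(set(column))
        else:
            summary[key] = (min(column), max(column))
    return summary


# Hypothetical sample data, not taken from the paper.
records = [
    {"age": 25, "dept": "sales"},
    {"age": 27, "dept": "sales"},
    {"age": 52, "dept": "research"},
    {"age": 55, "dept": "research"},
]
clusters = partition(records, threshold=0.8)
summaries = [summarize(c) for c in clusters]
```

With this sample, the records fall into two clusters, each summarized by an age range and a department set. Leader clustering is used here only because it is short; any clustering method driven by a pairwise similarity measure would fit the same outline.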




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ryu, T.W., Eick, C.F. (2004). A Systematic Database Summary Generation Using the Distributed Query Discovery System. In: Laganá, A., Gavrilova, M.L., Kumar, V., Mun, Y., Tan, C.J.K., Gervasi, O. (eds) Computational Science and Its Applications – ICCSA 2004. ICCSA 2004. Lecture Notes in Computer Science, vol 3046. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24768-5_20


  • DOI: https://doi.org/10.1007/978-3-540-24768-5_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22060-2

  • Online ISBN: 978-3-540-24768-5

  • eBook Packages: Springer Book Archive
