Optimizing Scientific Databases for Client Side Data Processing

  • Conference paper
Advances in Database Technology — EDBT 2002 (EDBT 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2287))

Abstract

Databases are nowadays one more building block in complex multi-tier architectures. In general, however, they are still designed and optimized with little regard for the applications that will run on top of them. This problem is particularly acute in scientific applications where the data is usually processed at the client and, hence, conventional server side optimizations are of limited help. In this paper we present a variety of techniques and a novel client/server architecture designed to optimize the client side processing of scientific data. The main building block in our approach is to store frequently accessed data as relatively small, wavelet encoded segments. These segments can be processed at different qualities and resolutions, thereby enabling efficient processing of very large data volumes. Experimental results demonstrate that our approach significantly reduces overhead (I/O, transfer across network, decoding and analysis), does not require changes to the analysis routines and provides all possible resolution ranges.
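
The idea of decoding one stored segment at several resolutions can be illustrated with a minimal sketch. It assumes an unnormalized Haar wavelet and a segment length that is a power of two; the abstract does not say which wavelet or segment layout the authors use, and the names `haar_encode` and `haar_decode` are hypothetical, chosen only for this illustration.

```python
def haar_encode(segment):
    """Encode a segment as (overall average, detail levels).

    details[0] is the coarsest level; each later level doubles in size.
    Assumption: unnormalized Haar transform, power-of-two segment length.
    """
    n = len(segment)
    if n == 0 or n & (n - 1):
        raise ValueError("segment length must be a power of two")
    approx = [float(x) for x in segment]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Store half-differences as detail coefficients, coarsest first.
        details.insert(0, [(a - b) / 2 for a, b in pairs])
        # Pairwise averages form the next coarser approximation.
        approx = [(a + b) / 2 for a, b in pairs]
    return approx[0], details


def haar_decode(average, details, levels=None):
    """Rebuild the segment from the first `levels` detail levels.

    Using fewer levels yields a shorter, coarser version of the data,
    so a client only needs to fetch and decode what it will analyze.
    """
    if levels is None:
        levels = len(details)
    approx = [average]
    for level in details[:levels]:
        refined = []
        for a, d in zip(approx, level):
            refined.extend((a + d, a - d))
        approx = refined
    return approx


if __name__ == "__main__":
    average, details = haar_encode([4, 6, 10, 12, 8, 6, 5, 5])
    print(haar_decode(average, details, levels=1))  # two-point overview
    print(haar_decode(average, details))            # full resolution
```

In this sketch a client that needs only an overview reads the average and the first detail levels, while the analysis routine still receives an ordinary list of values, which mirrors the abstract's point that the analysis code does not have to change.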

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Stolte, E., Alonso, G. (2002). Optimizing Scientific Databases for Client Side Data Processing. In: Jensen, C.S., et al. Advances in Database Technology — EDBT 2002. EDBT 2002. Lecture Notes in Computer Science, vol 2287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45876-X_26

  • DOI: https://doi.org/10.1007/3-540-45876-X_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43324-8

  • Online ISBN: 978-3-540-45876-0

  • eBook Packages: Springer Book Archive
