DOI: 10.1145/1376616.1376696
research-article

Querying continuous functions in a database system

Published: 09 June 2008

ABSTRACT

Many scientific, financial, data mining, and sensor network applications need to work with continuous rather than discrete data, e.g., temperature as a function of location, or stock prices or vehicle trajectories as a function of time. Querying raw or discrete data is unsatisfactory for these applications; e.g., in a sensor network, it is necessary to interpolate sensor readings to predict values at locations where sensors are not deployed. In other situations, raw data can be inaccurate owing to measurement errors, and it is useful to fit continuous functions to the raw data and query the functions rather than the raw data itself, e.g., fitting a smooth curve to noisy sensor readings, or a smooth trajectory to GPS data containing gaps or outliers. Existing databases do not support storing or querying continuous functions, short of brute-force discretization of functions into a collection of tuples. We present FunctionDB, a novel database system that treats mathematical functions as first-class citizens that can be queried like traditional relations. The key contribution of FunctionDB is an efficient and accurate algebraic query processor. For the broad class of multi-variable polynomial functions, FunctionDB executes queries directly on the algebraic representation of functions, without materializing them into discrete points, using symbolic operations: zero finding, variable substitution, and integration. Even when closed-form solutions are intractable, FunctionDB leverages symbolic approximation operations to improve performance. We evaluate FunctionDB on real data sets from a temperature sensor network, and on traffic traces from Boston roads. We show that operating in the functional domain yields substantial accuracy gains (15-30%) and performance wins of one to two orders of magnitude (10x-100x) over existing approaches that represent models as discrete collections of points.
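The three symbolic operations named in the abstract can be illustrated on a single-variable polynomial. The sketch below is not FunctionDB's implementation or API; it is a minimal pure-Python illustration under stated assumptions. The temperature model, its coefficients, the interval [0, 6], and every function name (evaluate, real_roots, intervals_above, substitute, mean_value) are hypothetical. Zero finding answers a selection such as "when is the temperature above a threshold" as exact intervals, substitution rewrites the model in another variable, and integration gives an exact average, all without sampling the function into points.

```python
def evaluate(coefs, x):
    """Evaluate a polynomial (coefficients in increasing degree) by Horner's rule."""
    acc = 0.0
    for c in reversed(coefs):
        acc = acc * x + c
    return acc

def derivative(coefs):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(coefs)][1:]

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def real_roots(coefs, lo, hi, iters=200):
    """Zero finding: sign-changing real roots in [lo, hi].

    The derivative's roots split [lo, hi] into monotone pieces, so bisection
    finds at most one root per piece. Roots where the polynomial only touches
    zero without changing sign may be missed.
    """
    coefs = list(coefs)
    while len(coefs) > 1 and coefs[-1] == 0:
        coefs.pop()                      # drop zero leading coefficients
    if len(coefs) <= 1:
        return []                        # constants have no isolated roots
    cuts = [lo] + real_roots(derivative(coefs), lo, hi, iters) + [hi]
    roots = []
    for a, b in zip(cuts, cuts[1:]):
        fa, fb = evaluate(coefs, a), evaluate(coefs, b)
        if fa * fb > 0:
            continue                     # no sign change on this piece
        for _ in range(iters):
            mid = (a + b) / 2
            fm = evaluate(coefs, mid)
            if fa * fm <= 0:
                b = mid
            else:
                a, fa = mid, fm
        root = (a + b) / 2
        if not roots or root - roots[-1] > 1e-9:
            roots.append(root)
    return roots

def intervals_above(coefs, threshold, lo, hi):
    """Selection via zero finding: sub-intervals of [lo, hi] where poly > threshold."""
    shifted = [coefs[0] - threshold] + list(coefs[1:])
    cuts = [lo] + real_roots(shifted, lo, hi) + [hi]
    # The sign is constant between consecutive roots, so one midpoint test suffices.
    return [(a, b) for a, b in zip(cuts, cuts[1:])
            if evaluate(shifted, (a + b) / 2) > 0]

def substitute(outer, inner):
    """Variable substitution: coefficients of outer(inner(s)), built by Horner's rule."""
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = multiply(result, inner)
        result[0] += c
    return result

def mean_value(coefs, lo, hi):
    """Integration: exact average of the polynomial over [lo, hi]."""
    anti = [0.0] + [c / (i + 1) for i, c in enumerate(coefs)]
    return (evaluate(anti, hi) - evaluate(anti, lo)) / (hi - lo)

# Hypothetical model: temperature (deg C) as a function of time t (hours) on [0, 6].
temp = [20.0, 2.0, -0.5]                                # 20 + 2t - 0.5t^2
hot = intervals_above(temp, 21.5, 0.0, 6.0)             # where temp > 21.5
temp_in_minutes = substitute(temp, [0.0, 1.0 / 60.0])   # substitute t = m / 60
mean_temp = mean_value(temp, 0.0, 6.0)                  # AVG-style aggregate
```

A discretized approach would instead sample the curve at some fixed step and filter or average the samples, so its answers depend on the step size; here the interval endpoints and the average come straight from the coefficients.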

Published in

SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
June 2008
1396 pages
ISBN: 9781605581026
DOI: 10.1145/1376616

Copyright © 2008 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
