Article (DOI: 10.1145/872757.872818)

Cache-and-query for wide area sensor databases

Published: 09 June 2003

ABSTRACT

Webcams, microphones, pressure gauges and other sensors provide exciting new opportunities for querying and monitoring the physical world. In this paper we focus on querying wide area sensor databases, containing (XML) data derived from sensors spread over tens to thousands of miles. We present the first scalable system for executing XPath queries on such databases. The system maintains the logical view of the data as a single XML document, while physically the data is fragmented across any number of host nodes. For scalability, sensor data is stored close to the sensors, but can be cached elsewhere as dictated by the queries. Our design enables self-starting distributed queries that jump directly to the lowest common ancestor of the query result, dramatically reducing query response times. We present a novel query-evaluate-gather technique (using XSLT) for detecting (1) which data in a local database fragment is part of the query result, and (2) how to gather the missing parts. We define partitioning and cache invariants that ensure that even partial matches on cached data are exploited and that correct answers are returned, despite our dynamic query-driven caching. Experimental results demonstrate that our techniques dramatically increase query throughput and decrease query response times in wide area sensor databases.
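To make the idea of a self-starting query more concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical document hierarchy (`usRegion`, `state`, `city`, `neighborhood`, `block`, `parkingSpace`) and a simplified query shape in which each step that pins a single node looks like `name[@id='value']`. Under those assumptions, the longest leading run of such steps identifies the lowest common ancestor of every possible result, which is the node a query could be routed to first. The function name `lca_prefix` and the naive split on `/` are illustrative choices only.

```python
import re

# Assumed step shape: name[@id='value'] selects exactly one child by id.
# Any other step (wildcards, 'or' predicates, no predicate) may match
# several nodes, so the common ancestor cannot extend past it.
_SINGLE_ID_STEP = re.compile(r"^(\w+)\[@id='([^']*)'\]$")


def lca_prefix(query: str) -> list[tuple[str, str]]:
    """Return (element, id) pairs for the longest leading run of steps
    that each pin a single node by id.

    Splitting on '/' is a simplification: it assumes predicates never
    contain a slash, which real XPath does not guarantee.
    """
    prefix = []
    for step in query.strip("/").split("/"):
        match = _SINGLE_ID_STEP.match(step)
        if match is None:
            break  # first step that can match more than one node
        prefix.append((match.group(1), match.group(2)))
    return prefix


# Hypothetical query: parking spaces in two neighborhoods of one city.
query = (
    "/usRegion[@id='NE']/state[@id='PA']/city[@id='Pittsburgh']"
    "/neighborhood[@id='Oakland' or @id='Shadyside']/block/parkingSpace"
)
print(lca_prefix(query))
```

In this example the prefix stops at the `city` step, because the `neighborhood` step names two ids. A system built along these lines could send the query straight to whichever host owns that city node instead of starting at the document root.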

Published in

SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data
June 2003
702 pages
ISBN: 158113634X
DOI: 10.1145/872757

Copyright © 2003 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGMOD '03 paper acceptance rate: 53 of 342 submissions, 15%. Overall acceptance rate: 785 of 4,003 submissions, 20%.