ABSTRACT
Webcams, microphones, pressure gauges, and other sensors provide exciting new opportunities for querying and monitoring the physical world. In this paper we focus on querying wide area sensor databases, which contain (XML) data derived from sensors spread over tens to thousands of miles. We present the first scalable system for executing XPath queries on such databases. The system maintains the logical view of the data as a single XML document, while physically the data is fragmented across any number of host nodes. For scalability, sensor data is stored close to the sensors, but can be cached elsewhere as dictated by the queries. Our design enables self-starting distributed queries that jump directly to the lowest common ancestor of the query result, dramatically reducing query response times. We present a novel query-evaluate-gather technique (using XSLT) for detecting (1) which data in a local database fragment is part of the query result, and (2) how to gather the missing parts. We define partitioning and cache invariants that ensure that even partial matches on cached data are exploited and that correct answers are returned, despite our dynamic query-driven caching. Experimental results demonstrate that our techniques dramatically increase query throughput and decrease query response times in wide area sensor databases.
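The idea of jumping directly to the lowest common ancestor of a query result can be illustrated with a minimal sketch. It assumes each node in the logical XML document is identified by its root-to-node path, so the lowest common ancestor of a set of target nodes is simply their longest shared path prefix. The function name and the element names below are hypothetical and chosen only for illustration; this is not the system's actual routing code.

```python
from typing import List, Sequence


def lowest_common_ancestor(paths: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of the given root-to-node paths."""
    if not paths:
        return []
    prefix = list(paths[0])
    for path in paths[1:]:
        common = 0
        # Count how many leading steps this path shares with the prefix so far.
        for a, b in zip(prefix, path):
            if a != b:
                break
            common += 1
        prefix = prefix[:common]
    return prefix


# Hypothetical paths of the nodes a query would touch (names are assumptions).
targets = [
    ["usRegion", "state-PA", "city-Pittsburgh", "block-1"],
    ["usRegion", "state-PA", "city-Pittsburgh", "block-3"],
]

# A query could start at this node instead of at the document root.
print("/".join(lowest_common_ancestor(targets)))
# prints "usRegion/state-PA/city-Pittsburgh"
```

Starting evaluation at the node that owns this prefix, rather than at the root, is what lets a query avoid traversing the upper levels of the hierarchy.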
Index Terms
- Cache-and-query for wide area sensor databases