ABSTRACT
Increasingly, we must deal with complex data, prime examples of which are data on the Web, XML data, and biological data. Abstractions for handling such data have already been proposed, in particular semistructured data models. A semistructured model conceives of a database essentially as a finite directed labeled graph whose nodes represent objects and whose edges represent relationships between objects. In this paper, we focus on path queries, which are considered the basic querying mechanism for semistructured data. In essence, such queries are used to navigate the graph, that is, to discover paths that conform to specifications captured by regular expressions. To make the navigation more useful, we consider generalized path queries, in which the symbols can optionally be weighted by numbers. Such numbers can express a variety of information about the data that the query could match or navigate.

Motivated by the plethora of today's applications that use Web services and peer-to-peer architectures, we present a distributed algorithm for evaluating generalized path queries. We follow a realistic model with distributed (non-shared) memory and message passing between processors. An optimal solution to the problem lies at the intersection of ideas from distributed query evaluation, distributed shortest path computation, and queueing systems.
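To make the notion of a generalized path query concrete, the following is a minimal single-process sketch, not the distributed algorithm of the paper. It assumes that weights are non-negative costs summed along a path and that the answer for each node is the cheapest conforming path from a given source; the abstract does not fix these semantics. The names `cheapest_answers`, `graph`, and `query`, and the example labels, are hypothetical. The query is given directly as a weighted automaton (the compiled form of a regular expression), and the search runs Dijkstra's algorithm over pairs of (database node, automaton state).

```python
import heapq


def cheapest_answers(graph, query, start_state, final_states, source):
    """Return {node: minimum total weight} over all paths from `source`
    whose label sequence is accepted by the weighted automaton `query`.

    graph: {node: [(label, neighbor), ...]}             labeled edges
    query: {state: [(label, weight, next_state), ...]}  weighted transitions
    Assumption: all weights are non-negative (required by Dijkstra).
    """
    dist = {(source, start_state): 0}
    heap = [(0, source, start_state)]
    best = {}
    while heap:
        d, node, state = heapq.heappop(heap)
        if d > dist.get((node, state), float("inf")):
            continue  # stale heap entry, a cheaper one was already processed
        if state in final_states and node not in best:
            best[node] = d  # first accepting visit is the cheapest one
        for label, neighbor in graph.get(node, []):
            for symbol, weight, next_state in query.get(state, []):
                if symbol != label:
                    continue
                nd = d + weight
                if nd < dist.get((neighbor, next_state), float("inf")):
                    dist[(neighbor, next_state)] = nd
                    heapq.heappush(heap, (nd, neighbor, next_state))
    return best


# Hypothetical database: four nodes connected by "road" and "rail" edges.
graph = {
    "a": [("road", "b"), ("rail", "c")],
    "b": [("road", "c")],
    "c": [("road", "d")],
    "d": [],
}

# Hypothetical query (road:1 | rail:3)*, as a one-state weighted automaton.
query = {0: [("road", 1, 0), ("rail", 3, 0)]}

result = cheapest_answers(graph, query, start_state=0, final_states={0}, source="a")
print(result)
```

In this example, node `c` is reached more cheaply by two `road` edges (total weight 2) than by the single `rail` edge (weight 3). In a distributed setting the graph would be partitioned across processors, and the relaxation step would be carried out through message passing rather than on a shared heap.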