Abstract
Among the various proposals for primitives that deconstruct XML data, two approaches seem to stem clearly from practice: path expressions, widely adopted by the database community, and regular expression patterns, mainly developed and studied in the programming language community. We think that the two approaches are complementary and should both be integrated into languages for XML, and we see in this an opportunity for collaboration between the two communities. With this aim, we present regular expression patterns and the type systems with which they are tightly coupled. Although this article advocates a construction promoted by the programming language community, we try to stress some characteristics that, we hope, the database community may find interesting.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Castagna, G. (2005). Patterns and Types for Querying XML Documents. In: Bierman, G., Koch, C. (eds) Database Programming Languages. DBPL 2005. Lecture Notes in Computer Science, vol 3774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11601524_1
DOI: https://doi.org/10.1007/11601524_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30951-2
Online ISBN: 978-3-540-31445-5
eBook Packages: Computer Science, Computer Science (R0)