Abstract
We describe an approximate path search process based on an extended edit distance designed to handle ‘don’t care’ characters (*), including variable-length ones (*i~j), in a path matching scheme that extends XPath. The structural path is bound to conditional properties through variables whose values are retrieved by backtracking through the edit distance matrix. The system provides a dedicated iterator for an XML query and processing scripting language that supports the management of large XML document collections, join operations, and extraction.
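The idea of an edit distance with variable-length wildcards, followed by backtracking to recover what each pattern item matched, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `match_path`, the `(lo, hi)` tuple encoding of a `*lo~hi` wildcard, and the unit costs are all assumptions made for illustration. A wildcard absorbs between `lo` and `hi` path tags at no cost; every other mismatch, missing tag, or extra tag costs 1.

```python
from math import inf

def match_path(pattern, path):
    """Return (cost, bindings) for a pattern matched against a tag path.

    pattern: list of tag names, or (lo, hi) tuples standing for a wildcard
             that absorbs lo..hi tags (hypothetical encoding).
    path:    list of tag names from the document.
    bindings[i] is the slice of path aligned with pattern[i].
    """
    n, m = len(pattern), len(path)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if j > 0:
                # Extra path tag that no pattern item accounts for.
                options.append((D[i][j - 1] + 1, ("skip", i, j - 1)))
            if i > 0:
                item = pattern[i - 1]
                if isinstance(item, tuple):
                    lo, hi = item
                    # Wildcard absorbs k tags; falling short of lo is penalised.
                    for k in range(0, min(j, hi) + 1):
                        cost = D[i - 1][j - k] + max(0, lo - k)
                        options.append((cost, ("wild", i - 1, j - k)))
                else:
                    # Pattern tag with no counterpart in the path.
                    options.append((D[i - 1][j] + 1, ("drop", i - 1, j)))
                    if j > 0:
                        # Match (cost 0) or substitution (cost 1).
                        cost = D[i - 1][j - 1] + (item != path[j - 1])
                        options.append((cost, ("tag", i - 1, j - 1)))
            D[i][j], back[i][j] = min(options, key=lambda o: o[0])

    # Backtrack through the matrix to recover what each pattern item matched.
    bindings = [[] for _ in pattern]
    i, j = n, m
    while (i, j) != (0, 0):
        op, pi, pj = back[i][j]
        if op in ("wild", "tag"):
            bindings[pi] = path[pj:j]
        i, j = pi, pj
    return D[n][m], bindings


cost, bindings = match_path(
    ["article", (0, 2), "title"],
    ["article", "body", "section", "title"],
)
print(cost, bindings)
# prints: 0 [['article'], ['body', 'section'], ['title']]
```

In this sketch the binding of the wildcard, `['body', 'section']`, plays the role of a variable value recovered by backtracking. Narrowing the wildcard to `(0, 1)` on the same path yields a cost of 1, since one tag is left over.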
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ménier, G., Marteau, PF., Bonnel, N. (2009). Binding Structural Properties to Node and Path Constraints in XML Path Retrieval. In: Damiani, E., Yetongnon, K., Chbeir, R., Dipanda, A. (eds) Advanced Internet Based Systems and Applications. SITIS 2006. Lecture Notes in Computer Science, vol 4879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01350-8_30
DOI: https://doi.org/10.1007/978-3-642-01350-8_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01349-2
Online ISBN: 978-3-642-01350-8
eBook Packages: Computer Science (R0)