Abstract
We use the edit distance with moves on words and trees, and say that two regular (tree) languages are ε-close if every word (tree) of one language is ε-close to the other language. A transducer model is introduced to compare tree languages (schemas) with different alphabets and attributes. Using the statistical embedding of Fischer et al. (Proceedings of 21st IEEE Symposium on Logic in Computer Science, pp. 421–430, 2006), we show that Source-Consistency and Approximate Query Answering are testable on words and trees, i.e., they can be approximately decided within ε by looking at only a constant fraction of the input.

Notes
The coordinates follow the lexicographic enumeration 00,01,10,11.
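As a minimal illustration of this ordering, the sketch below assumes the note refers to a block-statistics vector of the kind used in the statistical embedding cited in the abstract, with blocks of length k = 2 over the alphabet {0, 1}. The function names `ustat` and `l1_distance` are hypothetical and chosen only for this example; the construction in the article may differ in its details.

```python
from collections import Counter
from itertools import product


def ustat(word, k=2, alphabet="01"):
    """Frequencies of the length-k subwords of `word`.

    Coordinates are listed in lexicographic order of the subwords,
    i.e. 00, 01, 10, 11 for k = 2 over the alphabet {0, 1}.
    """
    n_windows = len(word) - k + 1
    if n_windows <= 0:
        raise ValueError("word must have at least k letters")
    # Count every contiguous window of length k.
    counts = Counter(word[i:i + k] for i in range(n_windows))
    # product() over a sorted alphabet enumerates blocks lexicographically.
    keys = ["".join(p) for p in product(sorted(alphabet), repeat=k)]
    return [counts[key] / n_windows for key in keys]


def l1_distance(u, v):
    """L1 distance between two statistics vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

For the word 0011 the windows are 00, 01 and 11, so the vector has weight 1/3 on the first, second and fourth coordinates and 0 on the third (the coordinate for 10).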
References
Alon, N., Krivelevich, M., Newman, I., & Szegedy, M. (2000). Regular languages are testable with a constant number of queries. SIAM Journal on Computing, 30(6), 1842–1862.
Apostolico, A., & Galil, Z. (1997). Chapter 14: Approximate tree pattern matching. In Pattern matching algorithms. Oxford: Oxford University Press.
Arenas, M., & Libkin, L. (2005). XML data exchange: Consistency and query answering. In Proceedings of ACM symposium on principles of database systems (pp. 13–24).
Boobna, U., & de Rougemont, M. (2004). Correctors for XML data. In International XML database symposium, XSym (pp. 97–111).
Broder, A. (1997). On the resemblance and containment of documents. In Proceedings of compression and complexity of sequences (p. 21).
Cormode, G., & Muthukrishnan, S. (2002). The string edit distance matching problem with moves. In Symposium on discrete algorithms (pp. 667–676).
Fagin, R., Kolaitis, P. G., Miller, R. J., & Popa, L. (2003). Data exchange: Semantics and query answering. In International conference on database theory (pp. 207–224).
Fischer, E., Magniez, F., & de Rougemont, M. (2006). Approximate satisfiability and equivalence. In Proceedings of 21st IEEE symposium on logic in computer science (pp. 421–430).
Goldreich, O., Goldwasser, S., & Ron, D. (1998). Property testing and its connection to learning and approximation. Journal of the ACM, 45(4), 653–750.
Magniez, F., & de Rougemont, M. (2004). Property testing of regular tree languages. In International conference on automata, languages and programming (ICALP) (pp. 932–944).
Martens, W., & Neven, F. (2004). Frontiers of tractability for typechecking simple XML transformations. In Principles of database systems (pp. 23–34).
Masek, W., & Paterson, M. (1980). A faster algorithm computing string edit distances. Journal of Computer and System Sciences, 20(1), 18–31.
Parikh, R. J. (1966). On context-free languages. Journal of the ACM (JACM), 13(4), 570–581.
Rubinfeld, R., & Sudan, M. (1996). Robust characterizations of polynomials with applications to program testing. SIAM Journal on Computing, 25(2), 252–271.
Shapira, D., & Storer, J. (2002). Edit distance with move operations. In Proceedings of symposium on combinatorial pattern matching, Lecture Notes in Computer Science (Vol. 2373, pp. 85–98). Springer-Verlag.
Tai, K. C. (1979). The tree-to-tree correction problem. Journal of the Association for Computing Machinery, 26, 422–433.
Thatcher, J. W. (1967). Characterizing derivation trees of context-free grammars through a generalization of finite automata theory. Journal of Computer and System Sciences, 1, 317–322.
Wagner, R., & Fischer, M. (1974). The string-to-string correction problem. Journal of the Association for Computing Machinery, 21, 168–173.
Cite this article
de Rougemont, M., Vieilleribière, A. Approximate schemas, source-consistency and query answering. J Intell Inf Syst 31, 127–146 (2008). https://doi.org/10.1007/s10844-008-0060-9