Correction of Invalid XML Documents with Respect to Single Type Tree Grammars

  • Conference paper
Networked Digital Technologies (NDT 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 136)

Abstract

XML documents and related technologies represent a widely accepted standard for managing semi-structured data. However, a surprisingly high number of XML documents are affected by well-formedness errors, structural invalidity, or data inconsistencies. This paper proposes a correction framework for structural repairs of elements with respect to single type tree grammars. By inspecting the state space of a finite automaton that recognises regular expressions, we are always able to find all minimal repairs with respect to a defined cost function. These repairs are compactly represented by shortest paths in recursively nested multigraphs, which can be translated into particular sequences of edit operations altering XML trees. We propose an efficient algorithm and provide a prototype implementation.
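
The idea of reading repairs off shortest paths through an automaton's state space can be illustrated with a small sketch. It is not the algorithm of the paper: it repairs only a flat sequence of child element names (no recursion into subtrees, no nested multigraphs), assumes a unit cost for every insert, delete, and rename, and returns one cheapest repair rather than all of them. The automaton `TRANSITIONS`, its content model `a, b*, c`, and the function name `minimal_repair` are hypothetical and chosen purely for illustration.

```python
import heapq
from itertools import count

# Hypothetical deterministic automaton for the content model  a, b*, c.
# TRANSITIONS[state][label] gives the next state.
TRANSITIONS = {
    0: {"a": 1},
    1: {"b": 1, "c": 2},
    2: {},
}
START, ACCEPTING = 0, {2}


def minimal_repair(children, transitions=TRANSITIONS, start=START,
                   accepting=ACCEPTING):
    """Return (cost, operations) for one cheapest repair of `children`.

    The searched graph has nodes (i, q): i input labels consumed so far,
    automaton in state q. Keeping a label costs 0; each edit costs 1
    (an assumed cost function). Dijkstra's algorithm finds a shortest path
    from (0, start) to any (len(children), accepting state).
    """
    n = len(children)
    tie = count()  # tie-breaker so the heap never compares operation lists
    heap = [(0, next(tie), 0, start, [])]
    settled = set()
    while heap:
        cost, _, i, q, ops = heapq.heappop(heap)
        if (i, q) in settled:
            continue
        settled.add((i, q))
        if i == n and q in accepting:
            return cost, ops
        for label, nxt in transitions[q].items():
            # Insert a missing element; no input label is consumed.
            heapq.heappush(heap, (cost + 1, next(tie), i, nxt,
                                  ops + [("insert", i, label)]))
            if i < n:
                if label == children[i]:
                    # Keep the existing element as it is.
                    heapq.heappush(heap, (cost, next(tie), i + 1, nxt, ops))
                else:
                    # Rename the existing element to `label`.
                    heapq.heappush(heap, (cost + 1, next(tie), i + 1, nxt,
                                          ops + [("rename", i, label)]))
        if i < n:
            # Delete the existing element; the automaton state is unchanged.
            heapq.heappush(heap, (cost + 1, next(tie), i + 1, q,
                                  ops + [("delete", i, children[i])]))
    return None  # no accepting state is reachable at all
```

Under these assumptions, `minimal_repair(["a", "b", "c"])` returns `(0, [])` because the sequence is already valid, while `minimal_repair(["a", "c", "c"])` returns a repair of cost 1. Several repairs of that cost exist (deleting one `c`, or renaming the first `c` to `b`), and the sketch reports only one of them.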

This work was partially supported by the Czech Science Foundation (GAČR), grants number 201/09/P364 and P202/10/0573.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Svoboda, M., Mlýnková, I. (2011). Correction of Invalid XML Documents with Respect to Single Type Tree Grammars. In: Fong, S. (ed.) Networked Digital Technologies. NDT 2011. Communications in Computer and Information Science, vol 136. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22185-9_16

  • DOI: https://doi.org/10.1007/978-3-642-22185-9_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22184-2

  • Online ISBN: 978-3-642-22185-9

  • eBook Packages: Computer Science, Computer Science (R0)
