Model checking XSL transformations

https://doi.org/10.1016/j.cl.2014.03.003

Highlights

  • We develop a framework to model XSL transformations and XML Schema constraints.

  • We use model checking to verify the correctness of XSL transformations.

  • We provide a tool for XSL verification, including advanced error reporting.

  • We show how to interactively check and fix XSL transformations in our framework.

Abstract

The XSLT language is a key technology for developing software that manipulates data encoded in XML, a versatile formalism widely adopted for information description and exchange. This motivates the adoption of formal techniques to certify the correctness (with respect to the expected output) and robustness (e.g., tolerance to malformed inputs) of XSLT code. Unfortunately, such code cannot be validated using only static approaches (i.e., without executing it), due to the complexity of the XSLT formalism. In this paper we show how a software verification technology, namely model checking, can be adapted to obtain an effective and easy-to-use XSLT validation framework. The core of the presented methodology is the XSLToMurphi algorithm, which builds a formal model of an XSLT transformation that can be verified with the CMurphi tool.

Introduction

The eXtensible Markup Language (XML) [1] is a flexible tagged text format derived from the Standard Generalised Markup Language (SGML) [2], proposed in 1996 by the W3C Consortium [3] as a tool to standardise the format of all the documents used on the Web and to meet the challenges of large-scale electronic publishing. The XML language is extremely simple and versatile: the information is organised in a tree-structured fashion, which is both human readable and machine processable.

This W3C proposal had immediate success, becoming a de facto standard, and it is now being widely adopted for the description and the manipulation of data. Indeed, XML documents represent data structures in a clear, cross-platform and open format, which is ideal for the Web. This trend has been further pushed by the fact that applications can easily manipulate XML-structured data using specialised tools and languages like the XSL Transformations (XSLT [4]).

XSLT is a key technology for accessing XML data: indeed, by using XSLT it is possible to

  • Perform complex data manipulation that would traditionally require server side programming.

  • Transform XML data into any text or XML-based format.

  • Achieve a strong separation between the data and its representation, which could otherwise be obtained only using methodologies like HTML templates [5], [6].

These features make XSLT suitable for many scenarios, such as data manipulation and presentation in XML databases [7], [8], [9], [10], [11], [12], publishing of XML data through XHTML [13] or XSL-FO [14], [15] and conversion between XML data structures [7], [16], [17], [18]. Moreover, the introduction of XML-based standards in many research communities led to the adoption of XSLT for data manipulation tasks that were previously performed using traditional programming languages: for example, the XMI [19] standard made XSLT transformations a powerful tool in the software engineering field [20], [21], [22].

More recently, in XML databases, new languages like XQuery [23] are being used to perform XML data manipulation and transformation. XQuery and XSLT have the same expressive power, so both formalisms can be used alternatively [24].

XSLT is a powerful and widely used XML technology, and this motivates the adoption of the highest quality standards in the design and development of XSL transformations. This, in turn, calls for considerable effort to optimise XSLT processing performance [25] and to check the correctness of XSLT transformations, i.e., to verify that the transformation output has the expected format and properties.

In general, certifying the correctness of a system requires some kind of formal proof. Indeed, manually checking a system against a set of test cases, as it is usually done in the common programming practice, does not ensure that it would perform as expected in any situation. On the other hand, a formal proof guarantees that the system will always behave as specified.

The formal checking of XSLT has already been addressed in the literature: most current works (see Section 8) validate XSL transformations using static approaches like type checking or flow analysis, where the code is analysed without executing it. In software and hardware programming practice, such approaches are always the best choice, when applicable, since they are usually very fast and effective: for instance, strictly typed programming languages prevent the user from making many common mistakes.

However, code complexity often makes such techniques unsuitable for the complete formal validation of an application's behaviour. Indeed, when applied to complex code structures, static source analysis often requires too many resources or turns out to be very approximate. These considerations also hold for XSLT validation, due to the complexity of the XSLT/XPath constructs. Indeed, XSLT is Turing complete [26], so we know that its static validation is mathematically undecidable.

An answer to the static validation problems above comes from model checking: if we cannot certify that a given code is correct by statically reasoning on its structure, we can run (a suitably simplified model of) it and dynamically analyse its behaviour.

In particular, model checking techniques are used to exhaustively verify a system by automatically checking all its possible states against a set of user-defined assertions. The result of this verification is a certification that has the same value as a formal proof of correctness.
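The exhaustive exploration described above can be illustrated with a minimal explicit-state sketch. This is not CMurphi and not the XSLToMurphi encoding: the function `check_invariant`, the toy counter system and both assertions are hypothetical names introduced here only for illustration. The sketch visits every reachable state breadth-first and, if an assertion fails, returns the shortest sequence of states leading to the violation.

```python
from collections import deque


def check_invariant(initial_states, successors, invariant):
    """Explore every reachable state breadth-first.

    Returns None if the invariant holds in all reachable states,
    otherwise the shortest list of states from an initial state
    to a state that violates the invariant.
    """
    # parent doubles as the visited set and as the back-pointer map.
    parent = {s: None for s in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            # Rebuild the counterexample by following parent links.
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# Hypothetical toy system: a counter modulo 8 that may step by 1 or by 3.
def successors(n):
    return [(n + 1) % 8, (n + 3) % 8]


# Assertion that holds in every reachable state.
ok = check_invariant([0], successors, lambda n: n < 8)
# Assertion that fails: state 6 is reachable.
bad = check_invariant([0], successors, lambda n: n != 6)
print(ok)   # None
print(bad)  # [0, 3, 6]
```

When the state space is finite and fully explored, a `None` result means the assertion holds in every reachable state, which is what gives the outcome the strength of a proof for the model; a returned trace plays the role of the error report.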

Indeed, model checking is being widely adopted in the development of critical systems, like embedded hardware [27], as well as in the verification of code written in statically typed programming languages, such as Java [28], [29], [30]. In other words, an effective framework that results in complete, in-depth code verification requires the integration of different static and dynamic techniques, such as type checking and model checking.

To this aim, in this paper we show how model checking can be applied to XSLT validation. In particular, we describe the design and implementation of a complete model checking based XSLT verification process. The core of this process is the XSLToMurphi algorithm, which is able to translate XSLT stylesheets into models that can be verified through the CMurphi tool [31].

By exploiting techniques designed to handle large-scale software systems, our model checking approach can better deal with the complexity of the XSLT formalism, thus achieving a more precise and complete validation while keeping the whole process within reasonable complexity limits.

Moreover, our approach can also check the robustness of a transformation, i.e., verify whether it correctly handles (small) deviations from the expected input structure, and generate detailed descriptions of the input sequences that lead to transformation errors, so that they can possibly be detected and blocked at runtime.

Finally, to make our approach suitable for real-world development processes, we designed it to be completely automatic and accessible to non-experts, too. Indeed, the underlying methodology is completely hidden from the user, and the validation outcome is either a success message or a set of clear and detailed “debugger-style” error messages that can be effectively used to locate and fix the transformation errors.

The paper is organised as follows. Section 2 gives a brief introduction to the main elements of XSLT and their semantics, whereas Section 3 introduces model checking techniques and, in particular, the CMurphi model checker. Section 4 describes how model checking can be adapted to validate XSLT stylesheets and Section 5 presents the XSLToMurphi algorithm that implements the core of XSLT model checking. Section 6 shows our XSLT model checking process working on a case study, whereas Section 7 illustrates and discusses the tool experimentation. Finally, Section 8 offers an overview of the XSLT validation works in the current literature and Section 9 outlines the paper's conclusions and illustrates our planned future work on XSLToMurphi.

Section snippets

The eXtensible Stylesheet Language Transformations

The eXtensible Stylesheet Language Transformations (XSLT) is a W3C recommendation [4] and, in general, is used to transform an XML document into another XML document. This is done by defining transformation rules that describe the XML markup to be generated when certain elements are found in the transformation input. XSLT not only has a functional flavour, but also contains many common programming constructs like loops, conditional expressions and parametric function calls. All these statements

Model checking techniques

Generally speaking, model checking [34], [35], [36], [37], [38], [39] can be defined as the formal process of verifying the validity of a set of assertions on the (formal) model of a (software) system. Model checking is applied in a scenario like the one depicted in Fig. 2.

Given a software or a hardware system, we derive from it a system model, i.e., a suitable abstraction of the system functionalities that we want to verify. This is the most critical step of the whole process: indeed, the

Model checking applied to XSLT

In this section we illustrate a technique and a tool that make it possible to perform model checking on XSL transformations.

Model checking is usually applied to verify that a system always behaves as specified. In our case, the system is coded in the XSLT language. We may also have a formal description of the input the transformation is supposed to work on, given by a schema that defines the class of valid input documents. In this situation, we may want to embed the schema

The XSLToMurphi algorithm

Once we have defined a suitable way to model an XSL transformation, we can describe the algorithm that, given an XSLT stylesheet, creates the corresponding abstraction and formally encodes it in order to apply model checking. Moreover, we should define a methodology to create a set of properties (to be verified) from the output constraints illustrated in Section 4.2. Since our target verifier is CMurphi, the stylesheet and constraints encoding should produce a valid program written in the

The verification process

In this section we show the complete verification process involving the XSLToMurphi algorithm and the CMurphi verifier. To this aim, we reuse the running example of Section 5, namely the transformation in Fig. 1 and the XML Schema shown in Fig. 4.

To perform the verification, the user simply runs a shell script provided with XSLToMurphi, having set the appropriate environment variables to point to the CMurphi and XSLToMurphi distribution directories. Of course both tools must have been

Experimentation

The current prototype of the XSLToMurphi tool was first tested using several “malicious” stylesheets created ad hoc, such as the running example of this paper (Fig. 1), to stress the most complex XSLT aspects and embed both typical and hard-to-catch output errors. The output languages for these stylesheets were XHTML, WSDL [45], SVG [46] as well as several custom XML languages. These experiments were mainly used to debug the automatic XSLT modelling and abstraction algorithm shown in Section 5.1

Related work

In this section we review related work in the field of XSLT validation. For information about model checking techniques, the reader may refer to Section 3.

The validation of XSL transformations is addressed by several works in the literature. This task is generally considered very complex: indeed (see [47], [48]), as non-trivial subsets of XSLT are addressed, the validation problem quickly becomes EXPTIME-hard.

Model checking and XML have, in fact, already come into contact in the literature.

Conclusions

The benefits of XSLT validation are well attested by the number of research works and tools in this field (see Section 8). In this paper we described a new technique to validate XSL transformations by means of a standard technology, i.e., model checking. In particular, our work focused on the full automation of the XSLT model checking process through an algorithm called XSLToMurphi, which completely hides the complexity of the model checking technology from the user.

Indeed, we

References (70)

  • S. Groppe et al.

    Transforming XSLT stylesheets into XQuery expressions and vice versa

    Comput Lang Syst Struct

    (2011)
  • J.R. Burch et al.

    Symbolic model checking: 10^20 states and beyond

    Inf Comput

    (1992)
  • W3C. Extensible markup language (XML), 〈http://www.w3.org/XML〉;...
  • ISO. Information processing—text and office systems—standard generalized markup language (SGML, ISO 8879:1986),...
  • W3C. World Wide Web consortium website, 〈http://www.w3.org〉;...
  • W3C. XSL transformations version 1.0. W3C recommendation, 〈http://www.w3.org/TR/xslt〉;...
  • Parr TJ. Enforcing strict model-view separation in template engines. In: WWW '04: Proceedings of the 13th international...
  • Mohr M. Smarty template engine, 〈http://www.smarty.net/〉;...
  • Bailey J. Transformation and reaction rules for data on the web. In: ADC '05: Proceedings of the 16th Australasian...
  • Groppe S, Buttcher S. Xpath query transformation based on XSLT stylesheets. In: WIDM '03: Proceedings of the fifth ACM...
  • Jain S, Mahajan R, Suciu D. Translating XSLT programs to efficient SQL queries. In: WWW '02: Proceedings of the 11th...
  • Meier W. eXist: open source native XML database, 〈http://exist.sourceforge.net/〉;...
  • Salminen A, Tompa FW. Requirements for XML document database systems. In: DocEng '01: Proceedings of the 2001 ACM...
  • Paradies M, Malaika S, Nicola M, Xie K. Comparing XML processing performance in middleware and database: a case study....
  • W3C. The extensible hypertext markup language. W3C recommendation, 〈http://www.w3.org/TR/xhtml1〉;...
  • Giannetti F, Fernandes LG, Timmers R, Nunes T, Raeder M, Castro M. High performance XSL-FO rendering for variable data...
  • W3C. Extensible stylesheet language version 1: XSL formatting objects (XSL-FO), 〈http://www.w3.org/TR/xsl/#fo-section〉;...
  • Gasevic D, Djuric D, Devedzic V, Damjanovi V. Converting UML to OWL ontologies. In: WWW Alt. '04: Proceedings of the...
  • Pimentel MG, Baldochi L, Munson EV. Modeling context information for capture and access applications. In: DocEng '06:...
  • H.S. Kim et al.

    Development of an MPEG-4 scene converter for rich media services in mobile environments

  • OMG. XML metadata interchange (XMI), 〈http://www.omg.org/spec/XMI/〉;...
  • Duran A, Ruiz-Cortes A, Corchuelo R, Toro M. Supporting requirements verification using XSLT. In: Proceedings of IEEE...
  • Gu GP, Petriu DC. Xslt transformation from UML models to LQN performance models. In: WOSP '02: Proceedings of the third...
  • Peltier M, Bézivin J, Guillaume G. Mtrans: a general framework based on XSLT for model transformations. In: WTUML'01:...
  • W3C. W3C XML query (XQuery), 〈http://www.w3.org/XML/Query〉;...
  • Kenji M, Hiroyuki S. Static optimization of XSLT stylesheets: template instantiation optimization and lazy XML parsing....
  • Kepser S. A simple proof for the turing-completeness of XSLT and XQuery. In: Proceedings of extreme markup languages;...
  • Bormann J, Lohse J, Payer M, Venzl G. Model checking in industrial hardware design. In: DAC '95: Proceedings of the...
  • W. Visser et al.

    Model checking programs

    Autom Softw Eng J

    (2003)
  • Park D, Stern U, Sakkebæk J, Dill DL. Java model checking. In: Proceedings of 15th IEEE international conference on...
  • Mehlitz P, Tkachuk O, Ujma M. JPF-AWT: model checking Gui applications. In: 2011 26th IEEE/ACM international conference...
  • Tronci E. Cached Murphi web page, 〈http://www.dsi.uniroma1.it/~tronci/cached.murphi.html〉;...
  • W3C. XML path language version 1.0. W3C recommendation, 〈http://www.w3.org/TR/xpath〉;...
  • W3C. XML schema (XSD), 〈http://www.w3.org/XML/Schema〉;...
  • Dill DL, Drexler AJ, Hu AJ, Yang CH. Protocol verification as a hardware design aid. In: Proceedings of the 1991 IEEE...