Abstract
In this paper, we develop a framework for the automated verification of Web sites. The framework can be used to specify integrity conditions for a given Web site and then automatically check whether these conditions are fulfilled. First, we provide a rewriting-based formal specification language that allows us to define both syntactic and semantic properties of the Web site. Then, we formalize a verification technique that detects both incorrect or forbidden patterns and missing information, that is, incomplete or missing Web pages within the Web site. The verification process also gathers information that can be used to repair the Web site. Our methodology is based on a novel rewriting-based technique, called partial rewriting, in which the traditional pattern-matching mechanism is replaced by tree simulation, a technique suited to recognizing patterns inside semistructured documents. The framework has been implemented in the prototype GVerdi, which is publicly available.
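To make the idea of replacing pattern matching with tree simulation more concrete, the following is a minimal sketch in Python, not the formal definition used in the paper. It assumes a simplified notion of simulation: a pattern tree is simulated by a document tree if their root labels are equal and every child of the pattern is simulated by a distinct child of the document, regardless of order and ignoring extra document children. The names `Node`, `simulates`, and `occurs_in`, as well as the sample page, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Node:
    """A labelled, unranked tree node (e.g. an XML element or a text leaf)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def simulates(pattern: Node, doc: Node) -> bool:
    """Return True if `pattern` is simulated by `doc` at the root.

    Assumption: labels must be equal, and each pattern child must be
    simulated by a different document child (order is irrelevant).
    """
    if pattern.label != doc.label:
        return False
    return _match_children(pattern.children, doc.children, frozenset())


def _match_children(pats: List[Node], docs: List[Node],
                    used: FrozenSet[int]) -> bool:
    """Backtracking search for an injective assignment of pattern children."""
    if not pats:
        return True
    first, rest = pats[0], pats[1:]
    for i, candidate in enumerate(docs):
        if i not in used and simulates(first, candidate):
            if _match_children(rest, docs, used | {i}):
                return True
    return False


def occurs_in(pattern: Node, doc: Node) -> bool:
    """Return True if `pattern` is simulated by `doc` or by any subtree of it."""
    return simulates(pattern, doc) or any(
        occurs_in(pattern, child) for child in doc.children
    )


# Hypothetical Web page fragment, represented as a tree.
page = Node("member", [
    Node("name", [Node("mario")]),
    Node("surname", [Node("rossi")]),
    Node("status", [Node("professor")]),
])

# The pattern lists fewer children, in a different order, and still matches.
pattern = Node("member", [Node("status"), Node("name")])

# A pattern that requires information the page does not contain.
missing = Node("member", [Node("email")])

site = Node("site", [page])
```

Under these assumptions, `simulates(pattern, page)` holds even though the page has an extra `surname` child and lists its children in a different order, whereas `simulates(missing, page)` fails. A check of this second kind is one way to think about detecting missing information, while `occurs_in(pattern, site)` locates a pattern anywhere inside a larger document.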
Cite this article
Alpuente, M., Ballis, D. & Falaschi, M. Rule-based verification of Web sites. Int J Softw Tools Technol Transfer 8, 565–585 (2006). https://doi.org/10.1007/s10009-006-0009-7
DOI: https://doi.org/10.1007/s10009-006-0009-7