Abstract
A draft standard for HTML, HTML5, includes the detailed specification of the parsing algorithm for HTML5 documents, including error handling. In this paper, we develop a reachability analyzer for the parsing specification of HTML5 and automatically generate HTML documents to test compatibilities of Web browsers. The set of HTML documents are extracted using our reachability analysis of the statements in the specification. This analysis is based on a translation of the specification to a conditional pushdown system and on a new algorithm for the reachability analysis of conditional pushdown systems.
In our preliminary experiments, we generated 353 HTML documents automatically from a subset of the specification and found several compatibility problems by supplying them to Web browsers.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Bouajjani, A., Esparza, J., Maler, O.: Reachability Analysis of Pushdown Automata: Application to Model-Checking. In: Mazurkiewicz, A., Winkowski, J. (eds.) CONCUR 1997. LNCS, vol. 1243, pp. 135–150. Springer, Heidelberg (1997)
Brzozowski, J.: Derivatives of regular expressions. J. ACM 11, 481–494 (1964)
World Wide Web Consortium. HTML5: Editor’s draft 22 (February 2012), http://dev.w3.org/html5/spec/Overview.html
Esparza, J., Hansel, D., Rossmanith, P., Schwoon, S.: Efficient Algorithms for Model Checking Pushdown Systems. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 232–247. Springer, Heidelberg (2000)
Esparza, J., Kucera, A., Schwoon, S.: Model checking LTL with regular valuations for pushdown systems. Information and Computation 186(2), 355–376 (2003)
Ginsburg, S., Greibach, S.A., Harrison, M.A.: Stack automata and compiling. J. ACM 14(1), 172–201 (1967)
Hague, M., Ong, C.-H.L.: Symbolic backwards-reachability analysis for higher-order pushdown systems. Logical Methods in Computer Science 4, 1–45 (2008)
html5lib, http://code.google.com/p/html5lib/
Krauss, A., Nipkow, T.: Proof pearl: Regular expression equivalence and relation algebra. J. Automated Reasoning (March 2011) (published online)
Li, X., Ogawa, M.: Conditional weighted pushdown systems and applications. In: Proceedings of the 2010 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, pp. 141–150 (2010)
Minamide, Y.: Static approximation of dynamically generated Web pages. In: Proceedings of the 14th International World Wide Web Conference, pp. 432–441. ACM Press (2005)
Owens, S., Reppy, J., Turon, A.: Regular-expression derivatives re-examined. J. of Functional Programming 19, 173–190 (2009)
Suwimonteerabuth, D., Schwoon, S., Esparza, J.: Efficient Algorithms for Alternating Pushdown Systems with an Application to the Computation of Certificate Chains. In: Graf, S., Zhang, W. (eds.) ATVA 2006. LNCS, vol. 4218, pp. 141–153. Springer, Heidelberg (2006)
Suwimonteerabuth, D.: Reachability in Pushdown Systems: Algorithms and Applications. PhD thesis, Technischen Universität München (2009)
Validator.nu. The validator.nu html parser, http://about.validator.nu/htmlparser/
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Minamide, Y., Mori, S. (2012). Reachability Analysis of the HTML5 Parser Specification and Its Application to Compatibility Testing. In: Giannakopoulou, D., Méry, D. (eds) FM 2012: Formal Methods. FM 2012. Lecture Notes in Computer Science, vol 7436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32759-9_26
Download citation
DOI: https://doi.org/10.1007/978-3-642-32759-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32758-2
Online ISBN: 978-3-642-32759-9
eBook Packages: Computer ScienceComputer Science (R0)