Comparing accessibility evaluation tools: a method for tool effectiveness

  • Long paper
  • Published: Universal Access in the Information Society (2004)

Abstract

This paper argues that the effectiveness of automatic tools for evaluating web site accessibility must itself be evaluated, given the increasingly important role that these tools play. The paper presents a method for comparing a pair of tools that takes into account correctness, completeness and specificity in supporting the task of assessing the conformance of a web site to established guidelines. It then presents data acquired during a case study comparing LIFT Machine with Bobby, and uses these data to assess the strengths and weaknesses of the comparison method. The conclusion is that, although there is room for improvement, the method is already capable of providing accurate and reliable conclusions.
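
To make the comparison criteria concrete, the sketch below shows how correctness and completeness could be computed from tool output. This is an illustration only: it assumes precision- and recall-like definitions and hypothetical issue identifiers, not necessarily the formulation used in the paper.

    # A minimal sketch, assuming correctness is the fraction of reported
    # problems that are genuine and completeness is the fraction of genuine
    # problems that are reported. Issue identifiers are hypothetical.

    def correctness(reported: set, true_problems: set) -> float:
        """Fraction of the tool's reported problems that are true problems."""
        return len(reported & true_problems) / len(reported) if reported else 1.0

    def completeness(reported: set, true_problems: set) -> float:
        """Fraction of the true problems that the tool reports."""
        return len(reported & true_problems) / len(true_problems) if true_problems else 1.0

    # Hypothetical outputs of two tools over the same set of pages.
    tool_a = {"img-no-alt", "table-no-headers", "blink-tag"}
    tool_b = {"img-no-alt", "low-contrast"}
    truth = {"img-no-alt", "table-no-headers", "low-contrast"}

    for name, reported in [("A", tool_a), ("B", tool_b)]:
        print(name, correctness(reported, truth), completeness(reported, truth))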


Notes

  1. http://www.cast.org/

  2. Comparing these tools is somewhat unfair, given their different scope, flexibility, power and price: LIFT Machine is targeted at enterprise-level quality assurance teams and its price starts at $6,000, whereas Bobby 4.0 was initially available for free (it now costs about $300) and is targeted at a single individual wanting to test a relatively limited number of pages. Nevertheless, the comparison is useful as a case study for demonstrating the evaluation method itself.

  3. More specific types of results could be considered. For example, a distinction among definite errors, probable errors, manual warnings triggered by content, and untriggered manual warnings would yield a finer grid, serving as a basis for a richer evaluation of tools; some testing tools do provide these finer distinctions. However, it may be difficult to classify tool output according to those categories (this information might not always be available from the tool output), and if two tools provide different types of results, it becomes difficult to compare them. For this reason, the method proposed in this paper is based on two types of results: those that the tool assumes to be true problems and those that are warnings.

    The consequence is that the evaluation method is blind with respect to finer distinctions, and tools that provide intermediate warnings are treated in the same way as tools that provide manual warnings.
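
    As an illustration of this two-type scheme, the sketch below collapses finer-grained verdict categories into the two result types the method uses. The category names are hypothetical, not any particular tool's actual output format; following the remark above, intermediate (probable) errors are folded into warnings.

        # Collapse finer-grained tool verdicts into the two result types used
        # by the method: problems (asserted true) and warnings (requiring a
        # human judgement). Category names are illustrative only.
        CATEGORY_TO_TYPE = {
            "definite_error": "problem",
            "probable_error": "warning",               # intermediate warning
            "triggered_manual_warning": "warning",
            "untriggered_manual_warning": "warning",
        }

        def classify(category: str) -> str:
            # Unknown categories default to the conservative type.
            return CATEGORY_TO_TYPE.get(category, "warning")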

  4. It is advisable to use the same limits for both systems; otherwise some pages might be tested by only one tool, reducing the effectiveness of the comparison method, since the issues associated with these pages are excluded from any further analysis. This happened in the case study, owing to differences in the crawling methods adopted by the two tools.

  5. This is Bobby’s terminology, corresponding to what was earlier referred to as a manual warning triggered by content (for Partial or PartialOnce) and an untriggered manual warning (for AskOnce).

  6. This is a case where the values reported in Table 1 affect the FN percentages. In particular, since FN for Bobby is defined with reference to the behavior of LIFT, the larger the number of issues generated by LIFT, the greater the chance of finding an FN for Bobby. Therefore, FN for Bobby is correct, while FN for LIFT is underestimated.
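
    In symbols, a plausible formalization of this dependence (assuming each tool's false negatives are counted against the issue set reported by the other tool; the paper's exact definition may differ) is

        $FN_B = |I_A \setminus I_B|, \qquad FN_A = |I_B \setminus I_A|$

    where $I_A$ and $I_B$ are the issue sets reported by LIFT and Bobby, respectively; enlarging $I_A$ therefore directly inflates $FN_B$.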

  7. A confidence interval of a parameter around a value, at a given significance level α, describes the possible variability of the parameter when a different sample of data is analysed: the parameter stays within the interval with probability 1−α, while α gives the probability that it falls outside.

  8. For example, stating that the claim HFPα is valid at significance level α=0.01 means that, in 99 cases out of 100, the data gathered in this experiment support the claim that A produces fewer FPs than B.
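
    As a worked illustration of such a claim, the sketch below computes a 99% confidence interval (α=0.01) for a false-positive proportion using the normal approximation; the counts are hypothetical, not the case-study data.

        import math

        # Hypothetical: a tool flagged 200 issues, 30 of which were FPs.
        fp, n = 30, 200
        p = fp / n                  # observed FP proportion
        z = 2.576                   # normal quantile for alpha = 0.01 (99% CI)
        half = z * math.sqrt(p * (1 - p) / n)
        print(f"FP rate = {p:.3f}, 99% CI = [{p - half:.3f}, {p + half:.3f}]")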


Acknowledgements

Many thanks to Jim Thatcher and Daniela Ortner for their detailed reading of a draft of this paper. I would also like to thank the participants of the first face-to-face meeting of EuroAccessibility Task Force 2, held in London in November 2003, for their feedback on the method. Of course, the author alone is responsible for the content of this paper. Thanks also to the editorial staff of the journal for their help in improving the English style of this paper.

Author information

Correspondence to Giorgio Brajnik.

Additional information

Giorgio Brajnik is a scientific advisor for UsableNet Inc., manufacturer of LIFT Machine, one of the tools used in the case study reported in this paper.


Cite this article

Brajnik, G. Comparing accessibility evaluation tools: a method for tool effectiveness. Univ Access Inf Soc 3, 252–263 (2004). https://doi.org/10.1007/s10209-004-0105-y

