Abstract
This paper argues that the effectiveness of automatic tools for evaluating web site accessibility has itself to be evaluated, given the increasingly important role that these tools play. The paper presents a method for comparing a pair of tools that takes into account correctness, completeness and specificity in supporting the task of assessing the conformance of a web site to established guidelines. The paper presents data acquired during a case study comparing LIFT Machine with Bobby; these data are used to assess the strengths and weaknesses of the comparison method. The conclusion is that even though there is room for improving the method, it is already capable of providing accurate and reliable conclusions.
Notes
Comparing these tools is somewhat unfair given their different scope, flexibility, power and price. LIFT Machine is targeted at an enterprise-level quality-assurance team, and its price starts at $6,000; Bobby 4.0 was available for free (it now costs about $300) and is targeted at a single individual wanting to test a relatively limited number of pages. Nevertheless, the comparison is useful as a case study for demonstrating the evaluation method itself.
More specific types of results could be considered. For example, distinguishing among definite errors, probable errors, manual warnings triggered by content, and untriggered manual warnings would yield a finer grid serving as the basis for a richer evaluation of tools, and some testing tools do provide these finer distinctions. However, it may be difficult to classify tool output according to such categories (this information is not always available in the output), and if two tools provide different types of results it becomes difficult to compare them. For this reason, the method proposed in this paper is based on two types of results: those that the tool assumes to be true problems and those that are warnings.
The consequence is that the evaluation method is blind with respect to finer distinctions, and tools that provide intermediate warnings are treated in the same way as tools that provide manual warnings.
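This two-way classification can be sketched as a simple mapping that collapses finer-grained output categories into the coarse scheme the method uses. The fine-grained category names below are illustrative only and are not taken from any particular tool's output:

```python
# Collapse finer-grained tool-output categories into the two result
# types the method uses: reported problems vs. warnings.
# The fine-grained category names here are hypothetical examples.
FINE_TO_COARSE = {
    "definite_error": "problem",
    "probable_error": "problem",
    "triggered_manual_warning": "warning",
    "untriggered_manual_warning": "warning",
}

def coarsen(results):
    """Map each (page, fine_category) result to the two-type scheme."""
    return [(page, FINE_TO_COARSE[cat]) for page, cat in results]

results = [("index.html", "definite_error"),
           ("about.html", "untriggered_manual_warning")]
print(coarsen(results))
# → [('index.html', 'problem'), ('about.html', 'warning')]
```

Because the mapping is many-to-one, any distinction a tool draws within the "warning" side is deliberately lost, which is exactly the blindness described above.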
It is advisable to use the same limits for both systems; otherwise several pages might be tested by one tool only, reducing the effectiveness of the comparison method, since the issues associated with those pages are excluded from any further analysis. This happened in the case study, owing to differences in the crawling methods adopted by the two tools.
This is Bobby’s terminology corresponding to what we earlier referred to as manual warning triggered by content (for Partial or PartialOnce) and untriggered manual warning (for AskOnce).
This is a case where the values reported in Table 1 affect the FN percentages. In particular, since FN for Bobby is defined with reference to the behavior of LIFT, the larger the number of issues generated by LIFT, the greater the chance of finding an FN for Bobby. Therefore, FN for Bobby is correct, while FN for LIFT is underestimated.
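Under this relative definition, a false negative for one tool is an issue that the reference tool reports as a true problem but the first tool misses. A minimal sketch (the function, issue names and counts are hypothetical, for illustration only):

```python
def fn_rate(tool_problems, reference_problems):
    """FN percentage for a tool, defined relative to the issues the
    reference tool reports as true problems: the fraction of the
    reference's problems that the tool fails to report."""
    missed = reference_problems - tool_problems
    return 100 * len(missed) / len(reference_problems)

# Hypothetical issue sets for the two tools.
lift_problems = {"img-no-alt", "table-no-headers", "low-contrast"}
bobby_problems = {"img-no-alt"}

# Two of LIFT's three problems are missed by Bobby, so Bobby's FN
# rate relative to LIFT is high; the more issues LIFT generates,
# the more chances there are to find an FN for Bobby.
print(fn_rate(bobby_problems, lift_problems))
```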
A confidence interval of a parameter around a value, with a given significance level α, describes the possible variability of the parameter when a different sample of data is analysed: 1−α gives the probability that the parameter stays within the interval.
For example, a claim about HFP that is valid at significance level α=0.01 means that, in 99 cases out of 100, the data gathered in this experiment support the claim that A produces fewer FPs than B.
Acknowledgements
Many thanks to Jim Thatcher and Daniela Ortner for their detailed reading of a draft of this paper. I would also like to thank the participants of the first face-to-face meeting of EuroAccessibility Task Force 2, held in London in November 2003, for their feedback on the method. Of course, the author alone is responsible for the content of this paper. My thanks also go to the editorial staff of the journal for their help in improving the English style of this paper.
Additional information
Giorgio Brajnik is a scientific advisor for UsableNet Inc., manufacturer of LIFT Machine, one of the tools used in the case study reported in this paper.
Cite this article
Brajnik, G. Comparing accessibility evaluation tools: a method for tool effectiveness. Univ Access Inf Soc 3, 252–263 (2004). https://doi.org/10.1007/s10209-004-0105-y