ABSTRACT
Web Content Accessibility Guidelines 2.0 (WCAG 2.0) require that success criteria be tested by human inspection. Further, a WCAG 2.0 success criterion is considered testable if at least 80% of knowledgeable inspectors agree on whether it has been met. In this paper we investigate the very core of WCAG 2.0: its ability to determine web content accessibility conformance. We conducted an empirical study to ascertain the testability of WCAG 2.0 success criteria when experts and non-experts evaluated four relatively complex web pages, and to measure the differences between the two groups. Further, we discuss the validity of the evaluations generated by these inspectors and look at the differences in validity due to expertise.
In summary, our study, comprising 22 experts and 27 non-experts, shows that approximately 50% of success criteria fail to meet the 80% agreement threshold; experts produce 20% false positives and miss 32% of the true problems. We also compared the performance of experts against that of non-experts and found that agreement among the non-experts dropped by 6%, while false positives reached 42% and false negatives 49%. This suggests that in many cases WCAG 2.0 conformance cannot be tested by human inspection to a level where it is believed that at least 80% of knowledgeable human evaluators would agree on the conclusion. Why experts fail to meet the 80% threshold, and what can be done to help achieve this level, are the subjects of further investigation.
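The measures quoted above can be illustrated with a minimal Python sketch. It is not the authors' analysis procedure; the definitions are assumptions made for illustration: agreement is taken as the share of evaluators giving the most common verdict on a success criterion, the false positive rate as the share of reported violations that are not true problems, and the false negative rate as the share of true problems that were missed. The function names, verdict labels, and sample data are hypothetical.

```python
from collections import Counter

AGREEMENT_THRESHOLD = 0.80  # the 80% level quoted in the abstract


def agreement(verdicts):
    """Share of evaluators giving the most common verdict (assumed definition)."""
    counts = Counter(verdicts)
    return counts.most_common(1)[0][1] / len(verdicts)


def is_testable(verdicts, threshold=AGREEMENT_THRESHOLD):
    """True when agreement reaches the threshold for one success criterion."""
    return agreement(verdicts) >= threshold


def error_rates(reported, true_problems):
    """Return (false positive rate, false negative rate) for one evaluator.

    Both arguments are sets of success-criterion ids. The rates follow the
    assumed definitions given in the lead-in, not necessarily the paper's.
    """
    false_pos = len(reported - true_problems) / len(reported) if reported else 0.0
    false_neg = (
        len(true_problems - reported) / len(true_problems) if true_problems else 0.0
    )
    return false_pos, false_neg


# Hypothetical verdicts from ten evaluators on a single success criterion.
verdicts = ["fail"] * 7 + ["pass"] * 3
print(agreement(verdicts))    # 0.7
print(is_testable(verdicts))  # False

# Hypothetical evaluator report compared against the true problems on a page.
reported = {"1.1.1", "1.3.1", "2.4.4", "3.1.1", "4.1.1"}
true_problems = {"1.1.1", "1.3.1", "2.4.4", "2.4.6", "1.4.3"}
print(error_rates(reported, true_problems))  # (0.4, 0.4)
```

In this made-up case seven of ten evaluators agree, so the criterion falls short of the 80% threshold even though a clear majority exists, which is the kind of outcome the study reports for roughly half of the success criteria.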