1 Introduction

As in every category of computing system, accessibility has been established as a condition for both ergonomics and quality in game design [22, 23]. Game accessibility matters for people with disabilities across platforms, from traditional consoles to augmented reality games. Improvements in game accessibility can help different groups of players benefit from the player experience, for example by practicing enjoyable exercises [7, 27, 34, 36]. Enhancing accessibility in games is a great challenge because it may be related to other important concepts in game design, such as playability, player engagement, fun, enjoyment and appropriate cognitive load [3, 17, 20, 34, 36].

The literature on Human-Computer Interaction (HCI) has shown that proper design methodologies are fundamental for developing accessible technologies [33, 44]. In particular, methods for accessibility evaluation are the basis for improving accessibility in digital games [5, 17, 32, 41, 44]. Similarly, evaluating usability for people with disabilities is important in the game context in order to include players with impairments and their range of characteristics [7, 24, 39].

The literature contains extensive reviews on game usability evaluation [19, 25, 35, 43]. However, these studies do not indicate which methods can be applied to evaluate game accessibility. Game accessibility is different from software accessibility, because the main purpose of a game is entertainment [44]. Hence, game accessibility evaluation also differs from software accessibility evaluation. On this subject, Bors [2] presented a synthesized review of game accessibility guidelines. Paavilainen [30] reviewed the variety of heuristics that could be applied in game evaluation, including accessibility aspects. More broadly, Yuan et al. [44] conducted a literature survey on game accessibility, reviewing the main topics related to the field, but without a focus on game accessibility evaluation methods.

Until game accessibility evaluation methods become widely adopted, identifying the existing methods that can be applied specifically to game accessibility evaluation is an important task that enables further research in the field. Toward this objective, we performed a literature survey to understand and identify the variety of methods used to evaluate game accessibility. Our survey was based on the snowballing technique proposed by Wohlin [42]. We defined a set of five (5) research questions and performed searches based on a start set of works and on their references and citations. We also defined inclusion/exclusion criteria and, finally, performed a qualitative analysis of the extracted data [4].

In summary, we conclude that traditional user-based evaluations have been applied to the context of game accessibility, while new inspection-based evaluations (such as the strategies presented by Yuan et al. [44] and the guidelines proposed by IGDA and Medialt) have been proposed in order to provide accessibility evaluation methods focused on games. However, the applicability of such guidelines and strategies in the mobile game context is not yet clear.

The remainder of this paper is organized as follows: Sect. 2 presents the research questions and review methodology; Sect. 3 presents the evaluation of the research questions; Sect. 4 discusses the implications of our findings for the design process; and Sect. 5 presents concluding remarks, limitations of this research and indications for future work.

2 Review Methodology

This study may serve as a prelude to investigating a research topic and identifying further research activities. We performed a literature survey aimed at collecting evidence on which methods are applied for accessibility evaluation in the game domain. Our survey is based on the snowballing technique proposed by Wohlin [42]. Thus, we defined a start set of works from a group of candidates suggested by a researcher in the field. Next, we defined inclusion/exclusion criteria and procedures for qualitative data analysis.

Research Questions

The idea was to compile the main concepts presented in previous studies and produce a synthesis that helps researchers and developers gain a better understanding of the field, as well as to increase discussion about the subject. To achieve this objective, we surveyed the literature to answer the following five questions:

 

  • RQ1: What are the game accessibility evaluation methods and how are they classified according to the ISO categories of accessibility evaluation methods?

  • RQ2: Does mobile game accessibility evaluation differ from game accessibility evaluation on other devices?

  • RQ3: What are the aspects considered during game accessibility evaluation?

  • RQ4: What genres of games have been approached by game accessibility evaluation methods?

  • RQ5: What user profiles are considered during game accessibility evaluation?

Start Set and Selection Criteria

We defined the start set by considering the most closely related works from previous research in the field. The start set comprised six (6) works [13, 26, 31, 36, 38, 44], selected for the quality of the evidence they provide to answer our RQs. The minimum requirement to include a paper at this stage was that it should answer at least one of the RQs. Next, we performed searches using the snowballing technique, starting from this set. We used Google Scholar for the searches, as suggested by Wohlin [42].
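The snowballing procedure can be sketched as follows. This is a minimal illustration and not the tooling used in the survey: the `references`, `citations` and `is_relevant` inputs are hypothetical stand-ins for the lookups performed through Google Scholar and for the manual screening against the selection criteria.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set


def snowball(
    start_set: Iterable[str],
    references: Dict[str, List[str]],
    citations: Dict[str, List[str]],
    is_relevant: Callable[[str], bool],
) -> Set[str]:
    """Repeat backward and forward snowballing until no new work is found.

    All inputs are hypothetical: `references` maps a work to the works it
    cites (backward direction), `citations` maps a work to the works that
    cite it (forward direction), and `is_relevant` stands in for screening.
    """
    included: Set[str] = set(start_set)
    examined: Set[str] = set(included)
    queue = deque(included)

    while queue:
        work = queue.popleft()
        # Backward snowballing follows the reference list; forward
        # snowballing follows the works that cite the current one.
        candidates = references.get(work, []) + citations.get(work, [])
        for candidate in candidates:
            if candidate in examined:
                continue  # each candidate is screened only once
            examined.add(candidate)
            if is_relevant(candidate):
                included.add(candidate)
                queue.append(candidate)  # included works are snowballed too
    return included
```

In this sketch an excluded work is not expanded further, so works reachable only through it are never examined. That is a simplifying assumption of the example.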

We considered papers published between 2011 and 2016. This time frame was set in order to collect works published after Yuan et al. [44]. To be included in our review, a work had to be written in English and provide enough evidence to answer any of our questions. We performed a first selection after reading the title, abstract and keywords. All works selected at this phase were read in full, and we applied the following inclusion/exclusion criteria:

  • Inclusion criteria: (i) studies that contain the keywords “gam*” and “access*” in the title, abstract or keywords; (ii) papers that provide enough evidence to answer at least RQ1.

  • Exclusion criteria: (i) works with no full-text available; (ii) works not related to game accessibility evaluation; (iii) works that do not answer any of our questions.
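The mechanical part of these criteria can be expressed as a simple filter. The sketch below is illustrative only: the `Candidate` record and its field names are hypothetical, the wildcards are assumed to mean word prefixes, and the judgment of whether a work answers RQ1 is a manual decision that the code merely receives as a flag.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical record for a candidate work (illustrative field names)."""
    title: str
    abstract: str
    keywords: List[str]
    year: int
    language: str
    full_text_available: bool
    answers_rq1: bool  # result of manual reading, not computed here


# "gam*" and "access*" are assumed to be word-prefix wildcards.
GAME = re.compile(r"\bgam\w*", re.IGNORECASE)
ACCESS = re.compile(r"\baccess\w*", re.IGNORECASE)


def passes_screening(work: Candidate) -> bool:
    """Apply the keyword, time frame, language and availability criteria."""
    text = " ".join([work.title, work.abstract, *work.keywords])
    has_keywords = bool(GAME.search(text)) and bool(ACCESS.search(text))
    in_time_frame = 2011 <= work.year <= 2016
    return (
        has_keywords
        and in_time_frame
        and work.language == "English"
        and work.full_text_available
        and work.answers_rq1
    )
```

A prefix match such as `gam*` also accepts unrelated words (for example "gamma"), which is one reason the selected works still had to be read in full.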

After applying all of the criteria presented, we gathered 32 works. We defined the following ten fields to be filled out with data extracted from the included works: (a) summary of contributions and limitations, (b) game accessibility evaluation method described, (c) main characteristics of the game evaluated, (d) which aspects were taken into account as subjects of evaluation, (e) how the method is classified (inspection-based or user-based evaluation), (f) particular characteristics of the method, (g) characteristics of each method’s outcomes, (h) characteristics of the user profile, (i) kind of game platform (device) and (j) additional relevant information. After extracting all data from the selected works, we performed a qualitative analysis of the data according to Cruzes and Dyba [4].

3 Evaluation of Research Questions

This section discusses the main findings of our study. We present the answers to our research questions, each followed by a discussion of the answer.

  • RQ1: What are the game accessibility evaluation methods and how are they classified according to the ISO categories of accessibility evaluation methods?

In the context of user-based evaluation, the works included in our survey refer to tests with users to detect accessibility barriers, metrics of player performance and level of conformance to the ISO ergonomics standard [3, 6, 8, 9, 10, 11, 12, 13, 14, 17, 20, 27, 31, 34, 36, 38]. Song et al. [37] used a focus group to discuss the accessibility characteristics of the game under evaluation with potential users. When applying user-based methods, authors such as Seaborn et al. [36], Rector et al. [34], Torrente et al. [40] and Song et al. [37] usually do not change the traditional structure of these methods, because the methods depend heavily on users’ perceptions of the interface (used by players during such evaluations). Authors such as Gotfrid [13], Seaborn et al. [36], Gerling et al. [10] and de Oliveira et al. [29] performed user-based evaluations and, in addition, introduced games to potential users and asked them to express their opinions and highlight accessibility issues.

Regarding inspection-based methods, the works refer to expert reviews such as guideline reviews and heuristic evaluation [6, 7, 12, 17, 21, 26, 32, 34, 44]. Yuan et al. [44] and Heron [17] referred to the use of traditional guidelines that are not focused on the game domain: Yuan et al. [44] referred to the WCAG guidelines, and Heron [17] referred to the BBC Future Media Standards and Guidelines. However, most works referred to inspection methods focused on the game domain, such as expert reviews (including heuristic evaluation) using the game accessibility strategies of Yuan et al. [44] and guideline reviews using popular game accessibility guidelines such as IGDA, Medialt and Game Accessibility [6, 7, 12, 17, 21, 26, 32, 34, 44]. In addition to these works, Garcia and de Almeida Neris [6] proposed a set of guidelines for audio-based games, Rector et al. [34] composed an enjoyment checklist and Garber [26] referred to a set of good practices in game accessibility. Besides guidelines for game accessibility, some works presented guidelines for bringing traditional games into the accessible context [32, 36]. We understand that such guidelines are important for the field, but they were outside the scope of our question.

In summary, the works surveyed refer to the following user-based methods: tests with users, questionnaires and focus groups; and to the following inspection-based methods: guideline reviews and heuristic evaluation.

  • RQ2: Does mobile game accessibility evaluation differ from game accessibility evaluation on other devices?

The data extracted from the works surveyed were not enough to properly answer this question. However, we found a few approaches that authors adopted to evaluate mobile game accessibility. Gotfrid [13], de Oliveira et al. [29] and Seaborn et al. [36] developed their own questionnaires to evaluate mobile game accessibility. Although easy to apply, these questionnaires still lack the validation needed to understand the extent of their results.

Seaborn et al. [36] and Gerling et al. [10] adopted the NASA-TLX [28] questionnaire in order to evaluate players’ cognitive load, recognized as an important aspect of mobile usability for players with impairments [15]. The NASA-TLX [28] is well accepted in the literature for the evaluation of cognitive load, but it is not focused on the game domain.
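For readers unfamiliar with the instrument, the sketch below shows how a NASA-TLX workload score is commonly computed from its six subscales. It is an illustration under stated assumptions, not the procedure reported by the surveyed works: ratings are assumed to lie on a 0 to 100 scale, the dictionary keys are names chosen for this example, and the optional weights are assumed to come from the 15 pairwise comparisons of the weighted variant.

```python
from typing import Dict, Optional

# The six NASA-TLX subscales (key names are chosen for this example).
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def nasa_tlx(
    ratings: Dict[str, float],
    weights: Optional[Dict[str, int]] = None,
) -> float:
    """Return the raw (unweighted) or weighted NASA-TLX workload score."""
    missing = set(SUBSCALES) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")

    if weights is None:
        # Raw TLX: plain mean of the six subscale ratings.
        return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

    # Weighted TLX: each weight counts how often a subscale was chosen
    # in the 15 pairwise comparisons, so the weights must sum to 15.
    total_weight = sum(weights.get(s, 0) for s in SUBSCALES)
    if total_weight != 15:
        raise ValueError("weights must sum to 15")
    return sum(ratings[s] * weights.get(s, 0) for s in SUBSCALES) / 15
```

Because the score summarizes workload in general, it complements rather than replaces an accessibility evaluation method.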

An important observation made while evaluating this question was that Seaborn et al. [36] used mobile games as a way to bring traditional games into the accessible scenario, adapting the Catch the Flag game for adult powered chair users. In this sense, they pointed to the capability of mobile devices to bring traditional games into the accessible context and proposed a set of guidelines for this process.

  • RQ3: What are the aspects considered during game accessibility evaluation?

Most works surveyed refer to traditional aspects such as accessibility, accessibility barriers, player performance (based on log recordings) and usability for people with disabilities. Some works referred to specific aspects focused on the game domain. The following list presents the works and the respective aspects referred to during game accessibility evaluation:

  • Chen [3], Torrente et al. [39, 40] and Gerling and Mandryk [9] referred to player experience.

  • Seaborn et al. [36], Gerling et al. [11], Chen [3] and Rector et al. [34] referred to engagement.

  • Heron [17] and Chen [3] referred to fun.

  • Gerling et al. [10] and Seaborn et al. [36] referred to evaluating levels of cognitive load.

  • Immonen [20], Rector et al. [34] and Chen [3] referred to enjoyment.

  • Gerling et al. [8] and Chen [3] referred to players’ humor.

  • Chen [3] and Heron [17] referred to playability.

  • Garber [26] referred to best practices in game accessibility development.

In summary, player experience, engagement and enjoyment were the aspects most frequently referred to among the works surveyed in our study.

  • RQ4: What genres of games have been approached by game accessibility evaluation methods?

Works commonly referred to evaluations of mobility games, especially “exergames” [7, 8, 9, 10, 11, 12, 13, 17, 29, 32, 34, 36]. According to Rector et al. [34], exergames are games that promote physical exercise for players. However, racing games [6, 10, 20, 27], action/adventure games [31, 38, 40], games based on voice recognition [14] and cognitive games [8, 13] were also evaluated among the works surveyed.

  • RQ5: What user profiles are considered during game accessibility evaluation?

To answer this question, we adopted the categories of user profile in the game accessibility context proposed by Yuan et al. [44]: users with motor impairments, users with visual impairments, users with hearing impairments and users with cognitive impairments. All of these categories were referred to by the works surveyed, and most works referred to at least two of them. In addition to these categories, we found growing attention to developing accessible games for elderly players [3, 8, 9, 10, 11].

4 Implications for Design and Directions for Future Research

Our findings indicate that designers and practitioners can apply traditional user-based evaluation methods when these are appropriate (e.g., when interactive prototypes are available). On the other hand, most works that referred to inspection methods reported methods focused on the game domain. For this reason, when inspection methods are required (as in design stages when only low-fidelity prototypes are available), we suggest that designers consider inspections using the strategies of Yuan et al. [44] or the IGDA or Medialt guidelines, because of their popularity among the works surveyed.

For cases when mobile game accessibility needs to be evaluated, we recommend applying the NASA-TLX [28] questionnaire combined with another evaluation method. NASA-TLX [28] can be used as a complement to enrich the results with indications of players’ cognitive load. Finally, we suggest applying includifying [36] or includification, combined with other evaluation methods, when the goal is to bring a traditional game into the accessible context.

We also suggest the following research topics as a roadmap for future studies:

  • To explore the applicability and efficacy of adopting popular game accessibility guidelines. IGDA and Medialt are the most popular game accessibility guidelines, but they are not focused on the mobile context. We also suggest that future studies explore whether cognitive load, an important aspect of mobile usability for players with impairments, could be evaluated through such guidelines.

  • To explore the impact of the expertise and evaluator effects [1, 18], which are common biases in accessibility inspection methods, on the outcomes of game accessibility inspection methods. Such exploration is necessary because different inspectors can report different problems in a game accessibility inspection (evaluator effect), and inspectors with different levels of expertise can produce reports of different quality (expertise effect).

  • To perform validation studies to understand the differences in outcomes among the variety of methods identified in our survey (e.g., comparing outcomes from inspections using game accessibility guidelines with outcomes from tests with users). Such studies should consider using the assessment criteria presented by Hartson et al. [16].

  • To explore whether traditional user-based methods are sufficient for evaluating all characteristics of game accessibility, or whether new methods need to be proposed to cover them. The user-based evaluation methods referred to in the works we included are not focused on game accessibility, so exploring this topic is very relevant for the community. We suggest exploring the use of the interaction model of Yuan et al. [44] as a basis for accessibility problem detection.
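As an illustration of how the evaluator effect in the roadmap above might be quantified, the sketch below computes the any-two agreement, one measure commonly used in evaluator-effect studies: the overlap between the problem sets of two inspectors, averaged over all pairs of inspectors. The problem identifiers and the handling of two empty reports are assumptions of this example.

```python
from itertools import combinations
from typing import List, Set


def any_two_agreement(reports: List[Set[str]]) -> float:
    """Average |Pi & Pj| / |Pi | Pj| over all pairs of inspectors.

    Each element of `reports` is the set of problem identifiers found by
    one inspector. Values near 1 mean the inspectors largely agree;
    values near 0 indicate a strong evaluator effect.
    """
    pairs = list(combinations(reports, 2))
    if not pairs:
        raise ValueError("at least two inspectors are required")

    total = 0.0
    for first, second in pairs:
        union = first | second
        # Assumption: two inspectors who both report nothing fully agree.
        total += len(first & second) / len(union) if union else 1.0
    return total / len(pairs)
```

Comparing this value between groups of novice and expert inspectors would be one hypothetical way to study the expertise effect with the same data.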

5 Concluding Remarks and Related Work

Our study focused on collecting information from the literature about methods for evaluating accessibility in the game domain. In this sense, we listed the most common methods referred to by the works surveyed. Most of the user-based evaluation methods referred to by these works were not focused on the game domain, while most of the inspection methods retrieved were.

The game design field has some popular methods for the evaluation of game accessibility. For user-based evaluations, traditional tests with users are widely applied. Regarding inspection methods, expert reviews using the strategies of Yuan et al. [44] and guideline reviews using the IGDA and Medialt guidelines are popular in the area. Evaluation methods focused on the mobile game context should be on the agenda of researchers.

The results showed that player experience is commonly evaluated together with game accessibility, and that the design of mobility games (such as exergames) receives considerable attention from the community. Multiple user profiles are considered during game accessibility evaluation, and elderly players have received special attention in recent works.

Bors [2] presented a synthesized review of guidelines for game accessibility evaluation. Paavilainen [30] reviewed heuristics and accessibility aspects for game accessibility inspection. More broadly, Yuan et al. [44] conducted a literature survey on game accessibility, reviewing the main topics related to the field. Our study provides a literature survey on methods used to evaluate accessibility, with a focus on the game domain.

Our main contributions are the answers to our research questions. We did not find a single study that answered all of our questions. Thus, after completing the review work, we decided to share our findings with the community. An additional contribution of our work is a list of the methods referred to in the literature on game accessibility evaluation, with details about such evaluations. We also discuss the implications of our findings for the design process, in particular suggesting the use of the popular evaluation methods described above. Finally, we provide a research roadmap to guide future studies in the field, with insights based on our findings.