1 Introduction

Multimodal serious games are a type of multimodal interface application that has been widely used as a tool to improve various cognitive skills in people who are blind, such as orientation and mobility, logical reasoning and collaboration [1,2,3]. Video games with this purpose should meticulously combine and integrate different sources of perceptual inputs [4], considering the limitations of the users as well as the cognitive goals that the game pursues [5]. Moreover, there is an increasing concern regarding the usability of the user interfaces and interaction in this type of game [6, 7]. Evaluating such multimodal interactions is a significant challenge in practice, because it combines the complexities of game evaluation, multimodal interface evaluation, and the cognition of people who are blind. Therefore, much better guidance and accumulated best practices are still needed [8]. Even though usability evaluation is the most frequent type of evaluation in the context of multimodal video games [5], recurrent situations jeopardize its effectiveness in the context of people with visual disabilities. Examples include employing inappropriate usability methods and neglecting critical issues, such as the nature of the audience’s limitations and whether the modalities offered by the game can support and help to enhance the desired cognitive skills [9].

The usability evaluation in this context should verify whether a multimodal gaming interface adapts to the needs and abilities of different users, as well as diverse contexts of use, considering their individual differences [8]. The consideration of such aspects leads researchers to identify and fix the relevant usability issues that affect people who are blind while playing multimodal video games. Considering the target users’ specificities is necessary for the HCI evaluation of any system. However, it is a more sensitive matter when evaluating applications for people who are blind, because we cannot assume that all individuals with visual disabilities are identical, especially in different contexts and with specific cognitive development purposes [1, 2, 10]. For that reason, the evaluators must identify when it is necessary to change or extend the interactive modalities and whether the customization options fit the actual user needs [10]. The nature of the interactions and modalities when carrying out interactive tasks should also be taken into consideration, because whether a piece of abstract information is represented in one modality or another can make a considerable difference to usability, especially for the perception and understanding of people who are blind [3, 10].

The importance of thinking about usability evaluation in this regard resides mainly in that the multimodal gaming interface should not add unnecessary complexity to the interaction of people with visual disabilities. On the contrary, such interfaces should be friendly and pleasant to use, supporting in different levels the diversity of the target users’ abilities and disabilities [11, 12], and successfully leading them to acquire cognitive skills while interacting.

As part of a continuing effort to improve usability evaluation of multimodal video games for cognitive enhancement of people who are blind, we conducted an expert opinion survey [13]. It aimed to update the published literature review results [5, 9] by deepening and enriching the developing understanding of challenges and principles in this topic. Hence, this paper contributes to the literature by discussing how usability evaluation has been carried out by researchers and practitioners in this field and by proposing PrincipLes for Evaluating Usability of Multimodal Video Games for People who are Blind (PLUMB). PLUMB is a practical aid to help researchers and practitioners properly plan and conduct usability evaluation of multimodal video games based on audio and haptics designed for people who are blind.

2 Background

2.1 Multimodal Interaction Characterization

Multimodal applications have evolved in a complex way regarding technological resources and interaction possibilities that might be offered, through processing multiple combined user input modes in a coordinated manner with multimedia output [14]. It is generally accepted that the multimodal interaction definition is built upon Norman’s action cycle [15], using established findings on human-machine multimodal communication to orchestrate the fusion of multimodal inputs and the fission of multimodal outputs, resulting in an adequate outcome to the users, according to their context of use, and personal preferences and characteristics [16].

The applicability of multimodal interfaces in diverse areas is due to the range of possibilities brought by the combination of interaction modalities. An interaction modality can be seen as a communication channel related to the human senses or form of expression [17], describing an interaction technique that utilizes a particular combination of user ability and device capabilities [8]. The combination of such input and output channels and modes, in addition to the output modality selection based on context and user needs, turns the modeling of a multimodal application into a complex task [16]. As a result, researchers have put much effort into describing and modeling multimodal applications from both physical and conceptual points of view.

Considering the physical dimension of multimodal systems, the World Wide Web Consortium proposed a Multimodal Interaction Framework, emphasizing the interpretation and inner system layers [19]. Under a different perspective, the CASE model [20] describes four techniques to combine modalities at the integration engine level. However, for the purpose of considering the usability of such interfaces from a user-centered perspective, we are mostly interested in conceptual descriptions of the multimodal interaction, which can provide us with valuable insights on the human aspects that should be considered during the usability evaluation of multimodal interfaces.

Coutaz et al. [18] proposed the CARE properties (Complementarity, Assignment, Redundancy, and Equivalence) as a simple way of characterizing the features of the multimodal interaction that may occur between the interaction techniques available in a multimodal user interface, focusing on the modality combination possibilities at the user level. CARE properties can also be used to assess multimodal interfaces, considering the fusion and fission of information during the interaction. Each CARE property is represented in a formal expression based on state, goal, modality, and temporal relationship, and has a counterpart in CARE-like user properties.

Bongers & Van der Veer [17] introduced the Multimodal Interaction Space (MIS), a theoretical framework to place and describe multimodal interactions, focusing on the space where the interaction occurs, in a human-centered approach. MIS is a descriptive framework for interaction styles and comprises levels, modes, and sensory modalities, but does not include physical interface input and output modalities. According to this approach, a multimodal interaction can be described in multiple layers, considering the user goal and intention, the formulation of tasks and subtasks, and finally the execution of such actions in the interface while receiving physical feedback and evaluating the outcome.

Based on an empirical study of multimodal usability [21], Bernsen (2010) proposed AMITUDE [10], a conceptual development-for-usability framework to express a model of use of the system under development, based on seven interaction aspects: Application type, Interaction, Task or domain, User, Environment of use, Modalities, and Device. According to the authors, each aspect addressed by AMITUDE must be taken into account when developing for usability. Therefore, while designing a multimodal application, practitioners and researchers should provide a detailed description of the intended application users, the tasks the application will support, the modalities and respective devices involved, as well as the environment in which the tasks will be carried out, and how each one of these aspects can affect the usability during user interaction.
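As a minimal sketch of how the seven AMITUDE aspects could be recorded during design, the following Python fragment stores one free-text description per aspect and reports which aspects are still undescribed. The class name `ModelOfUse`, the helper `missing_aspects`, and the example descriptions are assumptions made for illustration; they are not part of AMITUDE itself.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelOfUse:
    """One free-text description per AMITUDE aspect (hypothetical structure)."""
    application_type: str = ""
    interaction: str = ""
    task_or_domain: str = ""
    user: str = ""
    environment_of_use: str = ""
    modalities: str = ""
    device: str = ""


def missing_aspects(model: ModelOfUse) -> list[str]:
    """Return the names of the aspects that have no description yet."""
    return [f.name for f in fields(model) if not getattr(model, f.name).strip()]


# Invented example: a partially specified model of use.
draft = ModelOfUse(
    application_type="audio-haptic serious game",
    user="players who are blind",
    modalities="spatialized audio output, haptic output, keyboard input",
    device="keyboard and force feedback device",
)

print(missing_aspects(draft))
```

A check of this kind only verifies that each aspect has been described; judging how each aspect affects usability remains the evaluators' task.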

2.2 Multimodal Interaction for People Who Are Blind

Receiving information in a multimodal way enables people who are blind to interact with real and virtual environments mainly through audio-based and haptic interfaces, which are capable of fomenting learning and cognition in this audience [8, 9]. The crucial features of the multimodal interaction in multimodal gaming interfaces in this context are Audio, Adaptation, Interaction Mode and Feedback, along with the cognitive aspects meant to be stimulated [5], in consonance with a motivating story [22]. The game input device - which may include a keyboard, natural language, force feedback devices, touchscreen, directional pads, and specific devices - determines the style of interaction available and influences the type of feedback the user receives after an interaction, which is usually a combination of haptic, sonorous and visual feedback [5]. Figure 1 summarizes the multiple aspects of the interface and interaction dimensions of multimodal games for people who are blind.

Fig. 1. Dimensions for the description of the key characteristics of the design and evaluation of multimodal video games for cognitive development of people who are blind [5]

The different modality combinations and choices affect the users’ behavior towards the game and determine how cognitive processes are stimulated. For instance, audio and visual cues coordinated with haptic elements distributed in a virtual navigational environment can serve as references for orientation and mobility, as well as help people who are blind adopt and restructure a mental model of spatial dimensions. The different types of audio cues can represent spatial and surrounding properties, including location, size, distance, direction, separation and connection, shape, pattern, and movement; or be associated with each available object and action in the environment.

For that reason, the multimodal interaction provided by the game interface must adapt the use of modalities to the cognitive game goals, along with the game story, while providing the player with the proper interaction mode to develop the desired skills in a certain usage context.

In a previous work [23] we took the first step towards identifying the issues that may jeopardize the multimodal interaction of people who are blind, by defining a Standard List of Usability Problems (SLUP) for multimodal video games, which was submitted to the judgment of experts and end-users. In addition to the Overall Usability issues related to learnability, efficiency, satisfaction, and difficulties in handling the different modalities, SLUP details usability problems related to Audio, Adaptation, Interaction Mode and Feedback.

Being familiar with the types of issues that affect the interaction of people who are blind and with the main features of multimodal games can help researchers avoid these problems, as well as discover them more efficiently during usability evaluations. The final version of SLUP specifies:

  • 15 audio issues that can affect spatialized and stereo sounds, iconic sounds and abstract earcons, or speech synthesis and spoken audio;

  • Six customization issues that can be caused by the size, color scheme, or contrast of graphic elements, or even by the speed and intensity of sounds and voices;

  • 13 issues related to the overall usability of the game, addressing learnability, satisfaction, errors, and efficiency;

  • Eight feedback issues that can occur either in haptic (kinesthetic or tactile), aural or visual responses to the user interaction; and

  • 19 issues that can be related to the game interaction techniques and devices, and that affect the user interaction with the game inputs and outputs.
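For readers who want to work with these figures, the category sizes listed above can be tallied in a few lines of Python. The dictionary keys are shorthand labels chosen here for illustration; the counts are the ones given in the list.

```python
# Issue counts per SLUP category, as given in the list above.
slup_issue_counts = {
    "audio": 15,
    "customization": 6,
    "overall usability": 13,
    "feedback": 8,
    "interaction techniques and devices": 19,
}

total_issues = sum(slup_issue_counts.values())
largest_category = max(slup_issue_counts, key=slup_issue_counts.get)

# Share of each category, in percent of all listed issues.
shares = {
    category: round(100 * count / total_issues, 1)
    for category, count in slup_issue_counts.items()
}

print(total_issues, largest_category)
```

The tally gives 61 issues in total, with interaction techniques and devices as the largest category.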

The establishment of SLUP for multimodal games for people who are blind serves multiple purposes, such as comparing which problems different usability evaluation methods (UEMs) can disclose, helping designers to avoid predictable issues at design time, and supporting the evaluation of these applications. The most significant problems in SLUP are grouped in subcategories listed in Fig. 2. These results were used as one of the bases for the principles assembled in this research.

Fig. 2. Summary of the most frequent types of usability issues in multimodal interfaces for people who are blind [23]

3 Expert Opinion Survey

In previous studies [5, 9], we analyzed the usability evaluation performed in 17 multimodal games and 4 multimodal navigation virtual environments, in which we identified that practitioners and researchers often tend to administer informal usability evaluation methods in this field. To deepen our understanding of this issue, we created a 15-question online questionnaire and emailed it to 16 international researchers who reported usability evaluation in the 21 applications previously identified. The survey aimed to update the published literature review results, as well as to deepen and enrich the developing understanding of challenges and principles in this topic.

3.1 Methodology

Expert Opinion Surveys can be used to serve a variety of purposes and they result in predictions of how others will behave in a particular situation, according to persons with knowledge of the situation [13]. This technique can be used to assist in problem identification and in clarifying the issues relevant to a particular topic, by consulting individual experts [24]. It can be seen as a relatively informal technique, as individual expert opinion is not infallible. However, if a number of different experts provide the same feedback, it is likely that real issues exist [24].

We sent personalized emails to the 16 identified authors. These emails included the title of the paper in which each researcher described a usability evaluation, a brief description of the ongoing study with a hyperlink to the survey, and an estimate of the time needed to fill out the questionnaire. We chose this approach because using personalized email in soliciting participation appears to be the most effective method to increase participation in surveys [25].

For two months, biweekly reminders were sent, until the response rate stopped improving. Researchers who had already responded to the survey were contacted and kindly asked to remind their coauthors to answer our survey. About 56% of the invited researchers never answered, despite reminders and/or personalized emails to their coauthors. Hence, a response rate of 43.75% (N = 7) was obtained, corresponding to seven authors responsible for nine of the previously analyzed papers.
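The response figures follow directly from the 16 invitations and 7 completed questionnaires reported in this section, as the short Python check below illustrates. The variable names are chosen here for illustration.

```python
invited = 16    # researchers who received the questionnaire (from the text)
responded = 7   # researchers who completed it (from the text)

# Rates expressed as percentages of the invited researchers.
response_rate = 100 * responded / invited
non_response_rate = 100 * (invited - responded) / invited

print(f"{response_rate:.2f}% responded, {non_response_rate:.2f}% did not")
```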

Despite the small number of respondents, the strength of this exploratory approach resides in the fact that the respondent experts have diverse backgrounds and work in different countries and research groups, which helps to avoid specific trends in their opinions. Among the respondents, there are researchers from North and South America, Europe and Asia, who answered the questionnaire independently. The literature also attests to the fairness of the response rate obtained in this work. For instance, Rowe and Wright (2001) discuss principles for the use of expert opinions and state that the most relevant forecasts rely on unaided expert opinions. They state that, when conducting expert opinion surveys, researchers should obtain independent answers from between 5 and 20 experts [26]. In addition to that, according to Lazar, Feng & Hochheiser (2017), if the research goal is to gather requirements from domain experts, in-depth discussions with two or three motivated individuals can provide a wealth of data, which corroborates the adequacy of the approach this research proposes [27].

3.2 Overview of Survey Results

The participants’ profiles are summarized in Table 1, based on the characterization of their research experience and practice in evaluating multimodal applications for people who are blind. The experts are from five different countries and work in different research groups, except for A3 and A6, who belong to the same university. Most of the experts are university professors and researchers (A2, A3, A4, A6, and A7).

Table 1. Researchers’ profiles and behavior towards multimodal usability evaluation

A1 works with virtual environments, usability and accessibility at IBM Research and A5 investigates educational video games at the Organización Nacional de Ciegos Españoles (ONCE), a Spanish national nonprofit social corporation of public law. Some of the researchers had more than one paper identified in our previous literature review, such as A2 (3 papers) and A5, A6 and A7, with two papers each. In the survey, they answered specific questions regarding the types of usability instruments and methods administered in the evaluations described in each of these papers.

Regarding their research experience in the field, 43% have been researching applications designed for people who are blind for about four to six years (A4, A5, and A6), and 29% have been researching in this area for seven to nine years (A2 and A7). While one expert has been researching this field for up to ten years (A1), another one occasionally collaborates with researchers on evaluations involving people who are blind (A3). It is interesting to point out that neither the most experienced nor the occasional researcher is familiar with any particular models for the design and evaluation of multimodal applications, and the same holds for most of the experts (57%).

The experts who reported being familiar with models for the design or usability evaluation of multimodal interaction mentioned the CARE properties, the CASE model, and the AMITUDE model. However, none of them based their evaluation instruments on any of these models. Instead, all the researchers use mostly ad-hoc instruments during the usability evaluation of multimodal applications for people who are blind. In this work, we classify as ad-hoc any instruments generated by the authors according to the specific goals of an ongoing evaluation, but not formally validated and often not reusable.

Although A7 is familiar with the CASE model and reportedly uses validated instruments, his evaluations usually do not employ any instrument based on that model. Instead, his research group developed and validated a number of specific quantitative instruments to measure usability in the specific context of their research. This information suggests that familiarity with formal models describing multimodal interaction does not affect how usability evaluations are conducted.

All the researchers agreed that the use of both quantitative and qualitative methods is necessary to assess the interaction of people who are blind with multimodal interfaces. Although also agreeing with the use of quantitative and qualitative methods, A7 focuses on the use of quantitative evaluation methods and instruments, as his research usually measures the cognitive impact of multimodal games on the intellect of people who are blind.

4 Challenges on Usability Evaluation of Multimodal Interaction with People Who Are Blind

When answering the survey, the experts provided a detailed discussion of the topics approached, revealing challenges and needs in the field of usability evaluation of multimodal video games for cognitive development of people who are blind. Some insights emerged from the experts’ answers on their recent research work and evaluation practices, as well as challenges related to usability evaluation of multimodal video games for people who are blind. These results are further discussed in the remainder of this section.

4.1 Challenge 1: Lack of Guidance on the Conduction of Usability Evaluation

Overall, experts indicated that it is common to perform informal usability evaluation in this field, which usually consists of applying ad-hoc questionnaires or interviews after a gameplay session. According to them, this happens due to time or team issues, as well as the need to perform multiple types of tests (e.g. performance and cognitive impact). In practice, little time is left for planning and conducting a usability evaluation because most of the project schedule accommodates the development and other types of tests.

In addition to that, usually, the team is unfamiliar with usability evaluation instruments or methods that offer useful specific support to this context. Even experienced researchers who are familiar with models for design and evaluation of multimodal applications (e.g. CARE, CASE, and AMITUDE) claim that the use of these models for multimodal usability evaluation is complicated and too laborious, mainly when performed by practitioners. They argue that, in this scenario, it seems more beneficial to create their own evaluation instruments that are easier to administer and context-specific.

Indeed, some usability aspects, such as efficiency and effectiveness, can be evaluated independently of the domain [28]. However, all the researchers agreed that a drawback of doing this is the lack of guarantee of meeting the user’s needs, particularly considering the game’s cognitive requirements. Reinforcing this point, researcher A3 highlighted that visual disabilities are very particular, and each user is a world, so it is hard to apply very general solutions for usability evaluation in this context.

The applicability of general UEMs to diverse areas is questionable when evaluating specific characteristics, because they may not be adequate for the new contexts of use, generating gaps in the evaluation [29]. In fact, all the expert researchers agreed that, according to their experience, the multimodal inputs, the specificities of users’ visual disabilities and the type of cognitive skills to be supported are specific characteristics that profoundly affect the usability evaluation of an application designed for people who are blind.

Hence, these characteristics have to be considered in any usability evaluation in this context. Researcher A6 further remarked that, while it is true that all of these aspects can have an impact on the usability evaluation, the most significant challenge in measuring usability is that the solutions that grant accessibility for some levels of visual impairment actually hinder their usability for other levels.

The discussion raised by the researchers pointed out that there is a need for practical guidance on usability evaluation in this field. It is necessary to ensure that multimodal game interaction and interface elements suit the game’s cognitive purposes and the players’ characteristics, leading them to interact pleasantly and correctly while playing and learning. All the experts affirmed that they would welcome evaluation principles to assist researchers in choosing the methods that better fit their usability requirements to assess diverse aspects of multimodal interfaces in this context.

4.2 Challenge 2: Evaluate Multimodal Interaction While Considering the Cognitive Dimension

The usability evaluations described in the experts’ papers showed that the crucial features to be evaluated during the interaction of people who are blind with multimodal gaming interfaces are audio, customization capability, interaction mode, and feedback (including audio and haptic stimulus). However, the experts highlighted that these aspects could not be evaluated apart from the game target cognitive aspects.

Despite that, they reportedly conduct usability evaluation more frequently than cognitive impact evaluation because the latter requires specialized people and procedures. All the experts agreed that it is not possible to assure that any multimodal application or game is capable of developing or enhancing any cognitive skills in people with visual disabilities if an adequate cognitive impact evaluation is not conducted.

Having performed cognitive impact evaluation in this context several times, experts A2 and A7 suggested that the key to the success of this type of evaluation is understanding what data to collect from users, in order to compare, analyze and measure the skills and the development of the subjects. They recommend the use of tests based on experimental and control groups or two-sample test analysis based on a pretest-posttest of the same group. Besides, researchers were unanimous in their opinion that it is crucial to investigate how multimodal game elements can be meaningfully used to develop cognition in people who are blind.
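To illustrate the pretest-posttest design mentioned by the experts, the sketch below computes a paired t statistic for one group measured before and after playing. The experts did not name a specific statistical test, so the paired t-test is an assumption made here as one common choice, and the scores are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pretest, posttest):
    """Return (t, degrees of freedom) for paired pretest/posttest scores."""
    if len(pretest) != len(posttest) or len(pretest) < 2:
        raise ValueError("need two equally long samples with at least 2 subjects")
    # Per-subject gain: posttest score minus pretest score.
    gains = [after - before for before, after in zip(pretest, posttest)]
    spread = stdev(gains)  # sample standard deviation of the gains
    if spread == 0:
        raise ValueError("all gains are identical; t is undefined")
    t = mean(gains) / (spread / sqrt(len(gains)))
    return t, len(gains) - 1


# Invented scores for four hypothetical participants.
pretest_scores = [10, 12, 9, 11]
posttest_scores = [12, 15, 10, 14]

t_value, dof = paired_t_statistic(pretest_scores, posttest_scores)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

The resulting statistic would then be compared against a t distribution with the reported degrees of freedom. With samples as small as those typical in this field, a non-parametric alternative may be more appropriate.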

4.3 Challenge 3: To Go Beyond Both Usability and Cognitive Impact Evaluation

Although researchers spoke in one voice regarding the need for sound usability evaluation of multimodal games for people who are blind, they also demonstrated concern about evaluating other aspects. According to the researcher A7, when a usability evaluation is applied to the users’ context, the mental model and the cultural environment are also critical because the interface was created specifically for them. In other words, all the interaction is part of the design for people who are blind, including all their culture. Besides, A7 suggested that an equally important challenge could be these users’ experience with the multimodal interface because the experience is more than usability.

For the experts, usability is clearly an important aspect for the game quality, but there are other aspects to consider related to pleasure-based human factors, such as the satisfaction of people who are blind, the multisensory aesthetic experience and their emotional response. In addition, they indicated that the behavior of people who are blind towards multimodal games should also be evaluated considering gameplay experience, social interaction, fun, and playability.

There is an opportunity for the academic community to take an active role in creating diverse types of evaluation instruments, as well as evaluating the effectiveness of the existing ones, in the context of multimodal video games for the cognitive enhancement of people who are blind.

5 Principles for Evaluating Usability of Multimodal Video Games for People Who Are Blind (PLUMB)

Seeking to meet the identified challenges regarding usability evaluation, a set of PrincipLes for Evaluating Usability of Multimodal Video Games for people who are Blind (PLUMB) was proposed. It is based on the analysis of the Expert Opinion Survey; on usability evaluation reported in the literature; and on the Standard List of Usability Problems (SLUP) in multimodal video games for people who are blind.

PLUMB is a practical aid to help researchers and practitioners properly plan and conduct usability evaluation of multimodal video games based on audio and haptics, designed for enhancing and improving cognition in people who are blind. When asked about their opinion regarding the establishment of principles to assist researchers and practitioners in choosing the methods that better fit their usability requirements, to assess diverse aspects of multimodal interfaces in this context, all the authors agreed that it would be a useful aid to their own and others’ research. A6 commented that this outcome would fill significant gaps in their original research, while A7 stated that it would be useful to have a simple guideline on how to conduct usability testing with people who are blind.

Inspired by the experts’ comments on this topic and based on reviews of previous related studies and observations of current trends previously reported in this paper, PLUMB was created as a list of five principles for evaluating the usability of multimodal video games for people who are blind. The usability issues detailed in SLUP [23] include problems reported by users and issues pointed out by researchers. Considering that, and in addition to the discussion and results presented regarding the experience of researchers in this field, we propose that the usability evaluation of multimodal video games for developing cognition in people who are blind should observe the following principles.

  1. Be connected to the design process, in a formative way. The identification of usability issues in this context should follow a “find-and-fix” approach, to ensure an interaction free of possibilities to distract the person who is blind from the game’s cognitive purposes. Hence, the usability evaluation should be planned with a focus on identifying usability problems before the game is completed. A formative evaluation during the game design process can maximize the chances of effecting change and implementing the usability recommendations.

  2. Combine quantitative and qualitative methods to provide a holistic view of the data. This approach can help develop rich insights into phenomena of interest while evaluating multimodal games for people who are blind that cannot be fully understood using only a quantitative or a qualitative method. This principle aims to guarantee that usability evaluation uses multiple ways to understand the interaction and possible issues between the user who is blind and the multimodal gaming interface. Hence, data collection should involve any techniques available to researchers that allow at least two types of data (e.g., numerical and text), two types of data analysis (e.g., statistical and textual) and two types of conclusions (e.g., objective and subjective).

  3. Combine empirical and analytical methods to comprehend both the users’ and the researchers’ points of view. This principle aims to improve the accuracy of the identification of the sources of usability issues in the gaming interface and interaction. The use of both empirical (test-based) and analytical (inspection-based) usability evaluation methods provides direct information about how people who are blind use the multimodal game and their exact issues with its interface, while also having usability specialists judge whether each interactive element follows the necessary usability principles.

  4. Include both users who are blind and users with visual impairments, preferably in the real context of use. This principle aims to guarantee that the usability evaluation considers the different issues that arise from the diversity of perception and behavior between people who still rely on visual residues and those who rely on hearing and touch only. Besides, the multimodal video game has to adapt the presentation of abstract information, feedback, and game stimulus to the real conditions where people who are blind interact with the game: in schools or at home, assisted by a tutor. It is important to notice that testing applications with blindfolded users is not equivalent, due to the very different mental models.

  5. Guarantee a combination of methods capable of analyzing:

     a. the user’s perception of each interaction modality to execute specific tasks in the game;

     b. the user’s understanding of the relationship between the modalities offered and the game tasks;

     c. the user’s comprehension of the game goals and context, including the cultural and social context of the game narrative;

     d. the user’s ability to perform the expected tasks in the game correctly, in a way that the planned cognitive skills can be exercised;

     e. whether the user can distinguish the diverse sonorous, visual and haptic feedback, associating it with the correct actions and objects in the game;

     f. whether the user can combine modalities to achieve a goal in the game successfully;

     g. whether the combination of modalities offered by the gaming interface and devices is adequate for executing the game tasks;

     h. whether the modalities are appropriate to convey the information related to the game tasks;

     i. whether the game devices offer the desirable support for the game modalities;

     j. whether the modalities offered can ease the execution of a task;

     k. whether the user can recognize visual, aural and haptic feedback in a game task;

     l. whether the user can associate visual, aural and haptic feedback to a game task;

     m. whether the user has a positive acceptance of the visual, aural and haptic feedback associated with the game tasks, objects, and instructions.
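One possible way to put principles 2, 3 and 5 into practice is to check an evaluation plan against them before running it. The Python sketch below is only an illustration under stated assumptions: the `Method` class, the `plan_gaps` function, the example methods, and the mapping of each method to sub-items a to m of principle 5 are all hypothetical and would need to be defined by the evaluation team.

```python
from dataclasses import dataclass, field

# Sub-items a to m of principle 5, identified by their letters.
ASPECTS = set("abcdefghijklm")


@dataclass
class Method:
    """A planned usability evaluation method (hypothetical structure)."""
    name: str
    data_type: str   # "quantitative" or "qualitative"
    approach: str    # "empirical" or "analytical"
    covers: set = field(default_factory=set)  # principle 5 letters addressed


def plan_gaps(methods):
    """List what a plan is missing with respect to principles 2, 3 and 5."""
    gaps = []
    data_types = {m.data_type for m in methods}
    approaches = {m.approach for m in methods}
    for needed in ("quantitative", "qualitative"):
        if needed not in data_types:
            gaps.append(f"no {needed} method (principle 2)")
    for needed in ("empirical", "analytical"):
        if needed not in approaches:
            gaps.append(f"no {needed} method (principle 3)")
    covered = set().union(*(m.covers for m in methods))
    uncovered = sorted(ASPECTS - covered)
    if uncovered:
        gaps.append("principle 5 aspects not covered: " + ", ".join(uncovered))
    return gaps


# Invented plan; the aspect mapping is an assumption for illustration only.
plan = [
    Method("task-based user test with logging", "quantitative", "empirical", set("defgjk")),
    Method("post-session interview", "qualitative", "empirical", set("abclm")),
]

for gap in plan_gaps(plan):
    print(gap)
```

For this invented plan, the check reports that no analytical method is included and that aspects h and i are not addressed, which could prompt the team to add, for example, an expert inspection of the modalities and devices.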

The principles for setting a usability evaluation environment and for choosing UEMs in this context should be used as a guide to help practitioners and researchers to employ the most appropriate UEMs to evaluate the required aspects of these games in a particular context. However, we highlight that PLUMB is not a closed list. It can be expanded and improved, particularly as more knowledge is produced on the suitability of usability evaluation methods in this field.

6 Conclusion

Identifying usability problems in serious multimodal video games designed with cognitive purposes for people who are blind matters because these issues make players focus on the problems, distracting them from learning cognitive skills when interacting with the video game. However, planning and conducting usability field tests involving these users is not an easy task. For this reason, these tests are often conducted using instruments, UEMs and procedures that are inappropriate to their contexts, or are even left aside. In this paper, we discussed the characterization of multimodal interaction from multiple points of view, considering how different modality combinations and choices may affect the users and should be considered in an adequate usability evaluation.

In addition, we discussed the current usability evaluation practices for multimodal games for people who are blind and pointed out some challenges regarding this field, based on research with experts in development and evaluation of such applications.

Finally, we argued that there are specific aspects of user interface and interaction of multimodal video games that should be considered for the evaluation of a multimodal game for people who are blind to identify and correct the relevant usability issues that affect these users. In a nutshell, an adequate usability evaluation of such applications should consider (i) what are the specific characteristics of the target users; and (ii) how the multimodal features affect the interaction of users who are blind with the interface. PLUMB was proposed to assist evaluators in considering these aspects in their practice.

Our expectation is that the future designs and evaluation processes of multimodal interactions to improve cognition of people who are blind take into consideration their broadly different abilities and disabilities and provide them with usable and pleasurable gaming interfaces, by considering our findings.