1 Introduction

Usability is an important aspect of software: it relates to users’ satisfaction and to how effectively and efficiently they can perform a task when interacting with an interface [16, 17]. Usability is also important when designing technologies for the elderly population, especially when the benefits of mobile technologies are taken into account [6, 11, 22, 27, 34, 36, 37, 38]. Given this importance, diverse methods for evaluating usability have been proposed in the literature [7, 13, 26, 32].

Usability evaluation methods are indispensable for designing usable interfaces [8]. In the context of mobile usability, one of the main challenges for evaluation methods is to address the diversity of contexts of use and the impact these devices have on users’ mobility [13]. Studies (including previous studies by some of the authors) have shown that a popular inspection-based usability evaluation method, Heuristic Evaluation (HE), has evolved to address different contexts and user profiles through the proposal of domain-specific heuristics [14, 19, 23, 24].

Regarding the elderly, both Hermawati and Lawson [14] and de Lima Salgado et al. [24] identified the heuristics of Al-Razgan et al. [2] as the only set specific to both mobile usability and the elderly profile. However, as reinforced by Hermawati and Lawson [14], validation of the proposed domain-specific heuristics is still limited. Hermawati and Lawson [14] suggested that future studies should continue the development of such domain-specific heuristics. At the time this study was written, the heuristics of Al-Razgan et al. [2] could be studied further along the following avenues:

  • (i) extending their validation against traditional usability heuristics, such as the heuristics of Nielsen [18];

  • (ii) extending their validation against outcomes from tests with real users;

  • (iii) suggesting a text description (at least as an alternative) for each proposed heuristic.

The goal of our study was to expand the validation of the heuristics of Al-Razgan et al. [2] by comparing them against the traditional heuristics of Nielsen [18] and against outcomes from tests with real users. Additionally, we discuss possible implications for design based on the evidence from this study.

The remainder of this paper is structured as follows: Sect. 2 provides a brief literature review on heuristics for elderly users and mobile usability; Sect. 3 details the methods of this study; Sect. 4 presents results from the two validation processes for the heuristics of Al-Razgan et al. [2]; Sect. 5 discusses implications for design; and Sect. 6 summarizes the conclusions of the present study.

2 Heuristics for Mobile Usability and Elderly Users

Heuristic Evaluation (HE) is recognized as one of the most popular methods for usability inspection [1, 5]. The HE method is based on the application of broad usability principles, called heuristics, by expert evaluators in order to compile a list of existing usability problems [15, 28, 30, 31].

Among the main distinctive characteristics of mobile usability are its dependency on the context of use, the user profile and cognitive load [13, 24]. Since the arrival of mobile devices, many domain-specific heuristic sets have been proposed, aiming to provide better inspection for different contexts and user profiles [14, 23, 24].

Among the heuristic sets surveyed by Hermawati and Lawson [14] and de Lima Salgado et al. [24], only two addressed the elderly domain: (i) the weighted heuristics of Lynch [25]; and (ii) the “Touch-based Mobile Heuristics Evaluation for elderly people” of Al-Razgan et al. [2]. Although both sets address the elderly profile, only the heuristics of Al-Razgan et al. [2] consider the mobile context. The heuristics of Al-Razgan et al. [2] are listed below:

 

  1. “Make Elements on the page easy to read.
  2. Easy Recognition and accessibility.
  3. Make clickable items easy to target and hit.
  4. Use the elderly language and culture; minimize technical terms.
  5. Provide clear feedback on actions.
  6. Provide preferable gesture for elderly.
  7. Provide elderly with information on launcher/elderly status.
  8. Use conventional interaction items.
  9. Ergonomics design.
  10. Provide functions that reduce the elderly memory load.
  11. Elderly does not feel lost or stuck (Elderly control and freedom).
  12. Prevent error from occurrence.
  13. Provide necessary information and settings.”

 

The “Touch-based Mobile Heuristics Evaluation for elderly people” of Al-Razgan et al. [2] was proposed for evaluating the usability of mobile launcher applications for the elderly. Examples of this kind of launcher application are Wiser, Koala and Big Launcher. Although they were proposed for launcher applications, the heuristics of Al-Razgan et al. [2] were the closest we found to a set for evaluating mobile usability for the elderly. For this reason, we considered that these heuristics could be explored further in order to better understand their validity for the wider context of mobile usability and the elderly.

3 Methods

This section describes the methods applied in our study to extend the validation of the heuristics of Al-Razgan et al. [2]. The following subsections detail how we organized and conducted the validations.

3.1 Study Design

This study was organized in two stages: (i) validation regarding traditional heuristics; and (ii) validation regarding outcomes from tests with real users.

In the first stage (validation regarding traditional heuristics), we compared the heuristics of Al-Razgan et al. [2] against the traditional heuristics of Nielsen [18]. The aim of this stage was to identify how far Nielsen’s heuristics cover the heuristics of Al-Razgan et al. [2], and which of the heuristics of Al-Razgan et al. [2] are not covered by Nielsen’s heuristics. To this end, we used the heuristics and factors presented by Nielsen [29] in a matching process with the checklist that Al-Razgan et al. [2] use to describe their heuristics. Two usability researchers were responsible for comparing each checklist item with Nielsen’s heuristics.

In the second stage (validation regarding outcomes from tests with real users), our goal was to validate the coverage of the heuristics of Al-Razgan et al. [2] against outcomes from tests with real users, in the context of elderly people using mobile applications. For this purpose, we used results from the literature (works that provided evidence from tests with elderly users of mobile applications) and a case study in which six (6) seniors used a mobile application during a Think Aloud test. From a literature review, we identified five (5) works that provided evidence from test sessions with elderly users of mobile applications [9, 12, 20, 21, 35].

For every matching process conducted in this study, we applied a relaxed criterion: problems were considered similar if they expressed the same underlying problem [4, 33].

3.2 Application: Aptor Digital CogniTest

The Aptor Digital CogniTest is a mobile app, designed by Aptor Software, that aims to provide a digital version of a paper-based cognitive test. We opted for Aptor Digital CogniTest because it is part of a larger project in which some of the authors participate. The application was designed to be used on an Android tablet by Brazilian elderly users (Portuguese speakers), and its development was based on the Able Gamers’ Includification guidelines.

Aptor Digital CogniTest has two basic tasks implemented. One task is to remember figure positions on a matrix. As an example, Fig. 1 shows one of the screens of the training sessions, informing the user about what is required to achieve the goal, while Fig. 2 shows a success screen in a task of remembering figure positions. The other task asks users to remember number sequences. Figure 3 shows an example screen of an incorrect trial during a number sequence task.

Fig. 1. Screen for instructions on identifying the difference in the matrix.

Fig. 2. Screen for success feedback (correto means correct in Portuguese) on identifying the difference in the matrix.

Fig. 3. Screen for failure feedback (incorreto means incorrect in Portuguese) on remembering a sequence of numbers.

With the tests conducted during this study, we also aimed to provide feedback to Aptor Software for the next steps in the development of Aptor Digital CogniTest. The following section presents the results of the validation processes.

4 Results and Discussion

This section presents the results of both validation processes conducted in our study: (i) validation regarding traditional heuristics; and (ii) validation regarding outcomes from tests with real users.

4.1 Validation I: Comparing Against Nielsen’s Heuristics

We conducted a matching comparison between the heuristics of Al-Razgan et al. [2] and the heuristics of Nielsen [18]. To do so, we took the checklist provided by Al-Razgan et al. [2] to describe their heuristics and compared it against the heuristics and factors shown by Nielsen [29] and against the description of each heuristic as shown by Nielsen [27].

Al-Razgan et al. [2] used 48 checklist items to describe their 13 heuristics. Among these 48 items, we identified 31 matching cases when comparing them with the heuristics and factors shown by Nielsen [29]. Table 1 summarizes the matches identified during this stage; the heuristic and factor codes used in Table 1 are the same as those reported by Nielsen [29]. Next, we compared the remaining 17 items with the heuristic descriptions provided by Nielsen [18]. In this step, we found three further matches between items of the Al-Razgan et al. [2] checklist and the heuristic “Aesthetic and minimalist design” (called Heuristic 8 in this paper), which are also shown in Table 1.

Table 1. Matching heuristics of Al-Razgan et al. [2] with traditional heuristics of Nielsen.

In the end, 14 items from the checklist of Al-Razgan et al. [2] were not matched with any of Nielsen’s heuristics.
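The item counts reported for this stage can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid built from the numbers stated in the text; the variable names are ours and are not part of either heuristic set.

```python
# Bookkeeping for Validation I, using only the counts reported in the text.
TOTAL_ITEMS = 48             # checklist items describing the 13 heuristics
matched_by_factors = 31      # items matched with Nielsen's heuristics and factors [29]
matched_by_description = 3   # further items matched with "Aesthetic and minimalist design"

remaining_after_factors = TOTAL_ITEMS - matched_by_factors
unmatched = remaining_after_factors - matched_by_description
share_matched = (TOTAL_ITEMS - unmatched) / TOTAL_ITEMS

print(remaining_after_factors)  # 17
print(unmatched)                # 14
print(f"{share_matched:.1%}")   # 70.8%
```

In other words, 34 of the 48 checklist items (about 71%) could be linked to Nielsen’s heuristics under the relaxed matching criterion.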

4.2 Validation II: Comparing both Heuristic Sets Against Outcomes from Test with Users

The first step of this validation process was to collect evidence from tests of mobile usability with the elderly from five (5) works in the literature [9, 12, 20, 21, 35]. We collected as many works as possible that provided evidence of usability problems from tests with elderly users of mobile applications. We understand that this sample of usability problems is limited by the different age ranges and cultures of the users, but it can still provide good insights into the theme. These works provided a total of 27 usability problems, as follows:

  • Nine (9) usability problems were retrieved from Scheibe et al. [35], who tested a mobile diabetes application with 29 users aged 50 years or older.

  • Two (2) usability problems were retrieved from Gao and Sun [9], who tested gestures on touch screen devices, using their own testing system, with 40 elderly users aged from 52 to 81 years.

  • Four (4) usability problems were retrieved from Kobayashi et al. [21], who tested traditional touch screen gestures with one iPod, one iPad (emulating an iPod) and 20 elderly users aged from 60 to 80 years.

  • Eight (8) usability problems were retrieved from Harada et al. [12], who conducted tests and a focus group with 21 elderly users aged from 63 to 79 years and three different applications: an Address Book, a Phone and a Map. Both smartphones and tablets were used for the tests.

  • Four (4) usability problems were retrieved from Kiat and Chen [20], who conducted a focus group and tests of a Mobile Instant Messaging application with six elderly people aged from 60 to 80 years.

In addition to the test outcomes retrieved from the literature, we conducted a test with six (6) elderly users aged from 61 to 73 years (\(\bar{x}\) = 67.83, s = 5.42). For the tests, we used the Think Aloud procedure [10]. Two moderators were responsible for taking notes of usability problems based on users’ interaction. As a result of the tests, the moderators collected 53 usability problems. We then removed duplicates, resulting in a list of 25 distinct usability problems (see Appendix A). Together with the 27 problems from the literature, this gave a total set of 52 mobile usability problems related to the elderly context.
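As a quick consistency check, the problem counts reported in this subsection can be tallied as follows. This is a minimal sketch that uses only the figures stated in the text; the dictionary and variable names are our own.

```python
# Usability problems per literature source, as reported in the text.
literature_problems = {
    "Scheibe et al. [35]": 9,
    "Gao and Sun [9]": 2,
    "Kobayashi et al. [21]": 4,
    "Harada et al. [12]": 8,
    "Kiat and Chen [20]": 4,
}
raw_test_problems = 53       # notes taken by the two moderators
distinct_test_problems = 25  # after the duplicate analysis

literature_total = sum(literature_problems.values())
duplicates_removed = raw_test_problems - distinct_test_problems
total_problems = literature_total + distinct_test_problems

print(literature_total, duplicates_removed, total_problems)  # 27 28 52
```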

At this stage, we matched all 52 usability problems, retrieved from the literature and from the tests we conducted, against Nielsen’s heuristics and the heuristics of Al-Razgan et al. [2]. We found that 28 of the problems were related to (matched with) at least one of Nielsen’s heuristics. Our goal was not to identify all relations between Nielsen’s heuristics and the usability problems, but to identify at least one relation showing that the respective heuristic is applicable to this context. Table 2 shows the number of usability problems matched with each of the ten heuristics of Nielsen (\(\bar{x}\) = 3.11, s = 2.47). As shown in Table 2, heuristics 1 and 2 were the most frequently matched with usability problems (had the largest coverage).

Table 2. Total coverage for each one of Nielsen’s heuristics on mobile usability problems identified in test with elderly users.

In the following stage, we matched the 52 usability problems with the heuristics of Al-Razgan et al. [2]. Our goal was by no means to match all possible pairs of usability problems and heuristics, but only those most closely related in our understanding. Table 3 shows the number of problems related to each heuristic of Al-Razgan et al. [2], represented by its checklist items. Among the 48 checklist items of the heuristic set, only 16 were matched with some usability problem, as shown in Table 3 (\(\bar{x}\) = 0.59, s = 1.28).

Table 3. Items from the heuristic set of Al-Razgan et al. [2] by number of matches (n). This table shows only those items for which a match was identified.

Finally, we compared the 52 usability problems with both heuristic sets. For this purpose, we performed two further analyses: (i) total matching and (ii) unique matching. Table 4 summarizes the distribution of total matches that each heuristic set had on the usability problems identified in each study. We then analyzed matches that were unique to each set (usability problems covered by only one of the sets). Table 5 shows the number of unique matches for each heuristic set. In both cases, the heuristics of Al-Razgan et al. [2] had higher coverage. This is important evidence towards establishing a heuristic set for evaluating mobile usability for elderly users. Nevertheless, as shown earlier in this section, most of the checklist items of Al-Razgan et al. [2] can be linked to Nielsen’s heuristics. We understand that future studies should investigate the synergy of merging both sets into a new one, because the heuristics of Nielsen have been widely validated in the community and our study had a limited sample of usability problems.

Table 4. Total matching - Number of usability problems by study (first two columns) and total number of matching by each heuristic set (last two columns).
Table 5. Unique matching - Number of usability problems by study (first two columns) and total number of matching by each heuristic set (last two columns).
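The distinction between total and unique matching can be expressed with plain set operations. The following is a minimal sketch under our own assumptions: the function name `coverage` and the problem identifiers P1 to P6 are hypothetical toy data, not the problems or counts from this study.

```python
def coverage(matches_a: set[str], matches_b: set[str], all_problems: set[str]) -> dict[str, int]:
    """Summarize total and unique coverage of two heuristic sets."""
    return {
        "total_a": len(matches_a),                # problems matched by set A
        "total_b": len(matches_b),                # problems matched by set B
        "unique_a": len(matches_a - matches_b),   # matched by A only
        "unique_b": len(matches_b - matches_a),   # matched by B only
        "uncovered": len(all_problems - (matches_a | matches_b)),
    }

# Hypothetical toy data: six problems, not taken from the study.
problems = {"P1", "P2", "P3", "P4", "P5", "P6"}
nielsen = {"P1", "P2", "P3"}
al_razgan = {"P2", "P3", "P4", "P5"}

summary = coverage(nielsen, al_razgan, problems)
print(summary)
# {'total_a': 3, 'total_b': 4, 'unique_a': 1, 'unique_b': 2, 'uncovered': 1}
```

In this toy case, P6 plays the role of a problem covered by neither set, which corresponds to the kind of problem discussed in the next paragraph.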

Finally, three usability problems were not covered by either heuristic set. The first of these usability problems (retrieved from our study and from Harada et al. [12]) indicated that users wanted a confirmation before going on with a task (after completing a sub-task), which was not provided by the interface. The other usability problem showed that users preferred dragging and pinching to tapping [21].

5 Implications for Design

We understand that designers and practitioners can apply the heuristics of Al-Razgan et al. [2] to evaluate mobile usability for the elderly. In addition, we suggest using these heuristics as a complement to the traditional heuristics of Nielsen. This can be done by applying Nielsen’s set together with the 14 checklist items of Al-Razgan et al. [2] that were not matched with any of Nielsen’s heuristics (Sect. 4.1).

Designers can also use our list of usability problems (see Appendix A) as a complement to the initial requirements when designing mobile applications for the elderly.

6 Conclusions

This study aimed to continue the development of heuristics for mobile usability and elderly users. To this end, we conducted additional validations of the heuristics of Al-Razgan et al. [2] against the traditional heuristics of Nielsen and against 52 usability problems retrieved from tests with users (from a literature survey and from tests with six users). Most of the checklist items of the heuristics of Al-Razgan et al. [2] are related to nine of the ten heuristics of Nielsen. In addition, the 48 items of the heuristics of Al-Razgan et al. [2] covered 41 of the mobile usability problems related to elderly users collected in our study, while the traditional heuristics of Nielsen covered 28 of them.

Future studies should compare both heuristic sets in case studies in which groups of evaluators conduct a heuristic evaluation with each set, and then compare the outcomes against outcomes from tests with potential users. We suggest this because our method focused on comparing matches of problems, without considering that, during a heuristic evaluation, evaluators may differ in the problems they discover with each heuristic set owing to the evaluator and expertise effects [3, 15].

The main limitation of our study relates to the evaluator effect [15]: other researchers may match the 52 mobile usability problems with the two heuristic sets differently. Future studies can explore the development of a single heuristic set from a synergistic merging of Nielsen’s heuristics and the heuristics of Al-Razgan et al. [2]. Future studies can also investigate short and long text descriptions for the heuristics of Al-Razgan et al. [2] through a factor analysis of a larger sample of mobile usability problems, because our data were not sufficient for this.