1 Introduction

Industry analysts estimate that there are over 250,000 mobile applications available in the various application stores, some of which are available for several types of mobile devices, such as smartphones and tablets [13]. As a result of this growth, software companies began to investigate the interactions between the user and the product in order to develop higher-quality applications [13].

The characteristics of the interaction and the interface that make an application appropriate are emphasized by the use of quality criteria [10]. One important quality-of-use criterion is usability [10]. Usability assesses how easy the interface is to use, as well as the user satisfaction that results from such use [9]. In the definition of usability, the use of an application is affected by the user’s characteristics (his/her cognitive perception, his/her ability to act upon the application and how (s)he perceives the response from the application) [10].

Just evaluating usability is not enough to improve the quality of an application. Practitioners must also be concerned about the emotions and feelings of users with respect to the applications they use. The quality related to the evaluation of the users’ feeling while interacting with a software application is defined as User Experience (UX) [4]. User experience is associated with aspects that go from traditional usability to beauty, hedonic, affective or experiential aspects of the system [3]. In order to achieve a positive UX, it is necessary that the application promotes satisfaction of the human needs of the users [4].

Therefore, usability evaluation focuses on the realization of the task, i.e. it considers user performance while (s)he performs a particular activity [5], while UX evaluation focuses on his/her experiences, emotions, perceptions and judgment in the evaluation of applications [3]. Consequently, both are important for the evaluation of the quality of applications, especially for mobile applications, since these applications have features that make the evaluation difficult [2]. As examples of these features, we have the mobile context, connectivity, small screen size, different display resolution, limited processing capability and power, and data entry methods [2]. In these applications, the dynamism of the mobile scenarios makes the task of evaluating the user experience, context and usability altogether more difficult [6].

Given the importance of evaluating both usability and user experience in mobile applications, we have developed a technique called Userbility. This paper presents an empirical study of Userbility, which was carried out to verify the feasibility of the technique when employed by practitioners without expertise in UX and usability evaluation. The study also indicates which parts of the technique need improvement. During the evaluation, subjects assessed the user experience and usability of five developed applications. Based on the results, we made enhancements to the technique.

The paper is organized as follows: Sect. 2 focuses on the works related to this research. Section 3 presents the initial version of the Userbility technique. Section 4 describes the empirical study, the obtained results and a second version of Userbility. Finally, Sect. 5 presents our final remarks.

2 Background

Usability and user experience are important quality attributes for applications. User experience (UX) focuses on hedonic aspects such as fun and enjoyment [3]. Two main types of quality attributes are perceived by the user when evaluating UX: pragmatic quality (usability perceived by the user) and hedonic quality (pleasure-producing product quality) [3]. Even though hedonic aspects can meet universal human needs, they do not necessarily have utilitarian value. This aspect is explored to make the user experience more pleasant [3].

The Expressing Emotions and Experiences (3E) method aims at capturing the experience and feelings of users [11]. It uses an approach in which users draw and write their experiences and emotions about the evaluated application. The model of this method includes: (a) a blank face, where the user can draw his/her emotional state; (b) a speech bubble where the user can verbally express him/herself; and (c) a bubble cloud, in which the user can report what (s)he is thinking [11]. Through this method, users can express themselves more freely, either by writing (through the bubbles), or drawing (through the face expression).

Besides evaluating aspects related to the emotions, experience and feelings of the users, it is also necessary to evaluate the usability. According to the ISO 9241-11 norm (1998), usability is defined as the extent to which a product can be used by users to achieve a set of goals with effectiveness, efficiency and satisfaction in a specified context of use [5].

There are different types of usability inspection methods to evaluate the usability of an application. One of the most accepted evaluation methods for diagnosing usability problems is the Heuristic Evaluation [1], due to its simplicity and low application costs. The Heuristic Evaluation is summarized in ten “golden rules”, developed for the design and evaluation of interactive applications [9].

Furthermore, there are also usability inspection techniques to assess mobile applications. An example is the checklist for measuring the usability of mobile phone applications, which takes the form of a questionnaire [12]. It is used to identify usability issues in mobile applications and comes in two versions: a 67-item questionnaire (which identifies a higher number of usability issues) and a 48-item questionnaire (which is less demanding and therefore requires less time to complete) [12].

As noted above, the presented studies allow assessing UX and usability separately. Even though it is possible to capture aspects focused on usability, this is not the only focus of the presented UX studies. Therefore, it is important to investigate and propose a technique for mobile applications that allows evaluating UX and usability at the same time.

3 Userbility v 1.0

The Userbility technique (Integration of User eXperience and Usability) aims at helping non-specialist practitioners in HCI to evaluate user experience, considering the usability of mobile applications. The Userbility technique v 1.0 was proposed based on two methods: the Heuristic Evaluation [9], as this is the most employed method to perform usability evaluations [1]; and the Expressing Emotions and Experiences (3E) method, since this is a method that collects rich data on the emotional response of users [11]. Therefore, the Userbility technique aims to integrate the usability evaluation with UX, which is more focused on emotions and user experiences. This integration is important in order to improve the evaluation process, especially for less experienced evaluators.

Based on heuristics from the Heuristic Evaluation, we defined ten aspects to evaluate applications using questions related to UX. These aspects were included in the Userbility technique, as shown in Table 1. We simplified the ten heuristics of the Heuristic Evaluation so that non-specialists in HCI can apply a UX and usability technique. Our technique consists of a questionnaire that assesses the usability of the application and the user experience on mobile devices after the mobile application has been used.

Table 1. Usability aspects based on the Heuristic Evaluation

From the 3E model, where users draw and write their experiences and emotions [11], we selected two questions (Q1 and Q2) in each aspect that evaluates the usability of the application: (Q1) “What did you feel regarding this aspect in the application?”; and (Q2) “What do you think or would improve regarding this aspect in the application?”.

The selected questions aim to assist in capturing the experience and emotions of non-specialist evaluators about the application. Furthermore, these questions are simple and easy for non-expert practitioners to understand. In order to answer Q1, the evaluator describes how (s)he feels when observing a certain aspect of the mobile application. For Q2, the evaluator answers by describing what (s)he “thinks” of the application, which problems occurred when using this application, what is missing or what could be improved in the application. As an example, consider a scenario where an inspector needs to evaluate an e-commerce app. The task (s)he might accomplish is to create a shopping list. In question Q1 regarding aspect A1, a possible answer could be: “I felt discouraged to find that I did not receive feedback when making a purchase”. In this case, the user experience was negative, because the user did not receive feedback from the application. In addition, there is a possible usability problem, as suggested in aspect A1. Considering the same application for question Q2, a possible answer could be: “I think that the application has failed in the A1 aspect, because I did not identify that I was in the products evaluation page”. In this case, the evaluator can indicate what (s)he thinks made A1 fail and why.

In addition, an item related to the satisfaction of the evaluator for the usability aspects was also included. This item was included so the evaluators could provide their degrees of satisfaction about the aspects of the technique. This item is composed of a five-point scale (unsatisfied, little satisfied, moderately satisfied, very satisfied and extremely satisfied) and represented by a face. These five points were chosen in order to provide richer information about the user satisfaction on every usability aspect.

Figure 1 shows the Userbility v 1.0 technique and also shows its organization: usability aspects (1), the UX questions (2 and 3) and the satisfaction items on the usability aspects (4).

Fig. 1. Userbility technique v 1.0
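The organization of the v 1.0 form (a usability aspect, the two UX questions and the satisfaction item) can be sketched as a small data structure. This is only an illustrative sketch and not part of the technique itself: the class names, the field names, the 1 to 5 numeric encoding of the scale and the helper `missing_aspects` are assumptions introduced here, and the satisfaction value and Q2 answer in the example are placeholders.

```python
from dataclasses import dataclass
from enum import IntEnum


class Satisfaction(IntEnum):
    """Five-point satisfaction scale; the numeric values are an assumption."""
    UNSATISFIED = 1
    LITTLE_SATISFIED = 2
    MODERATELY_SATISFIED = 3
    VERY_SATISFIED = 4
    EXTREMELY_SATISFIED = 5


# The two UX questions asked for every usability aspect.
Q1 = "What did you feel regarding this aspect in the application?"
Q2 = "What do you think or would improve regarding this aspect in the application?"


@dataclass
class AspectResponse:
    """One evaluator's answers for a single usability aspect (hypothetical layout)."""
    aspect_id: str              # "A1" .. "A10"
    feeling: str                # free-text answer to Q1
    thoughts: str               # free-text answer to Q2
    satisfaction: Satisfaction  # satisfaction item for this aspect


def missing_aspects(responses):
    """Return the ids of the ten aspects that have no response yet."""
    expected = [f"A{i}" for i in range(1, 11)]
    answered = {r.aspect_id for r in responses}
    return [a for a in expected if a not in answered]


# Example built from the Q1 answer quoted in the text; the Q2 answer and
# the satisfaction value are placeholders chosen for illustration.
example = AspectResponse(
    aspect_id="A1",
    feeling="I felt discouraged to find that I did not receive feedback "
            "when making a purchase",
    thoughts="(placeholder)",
    satisfaction=Satisfaction.LITTLE_SATISFIED,
)
print(missing_aspects([example]))
```

Keeping the satisfaction item as an ordered enumeration makes it straightforward to compare or tally answers per aspect later on.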

4 Empirical Study

We conducted an empirical study with the Userbility v 1.0 technique using five mobile applications, in order to verify the feasibility of the technique and to indicate which parts of it need improvement. The evaluated applications were: (1) Simbora, which provides means for university students to collaboratively get a ride; (2) GRUM, which provides the location of events inside the university; (3) PartyNote, which informs what festivals and events are taking place on a date or month; (4) Personal Diet, which helps or motivates people to keep to their diets when they are away from home; and (5) Bookzone, which helps users find and sell their books.

The subjects of this study were student volunteers from a class on Collaborative Mobile Systems: nine undergraduate students from the Engineering and Computer Science courses at the Federal University of Amazonas (UFAM). The subjects were also the developers of the evaluated mobile applications. However, no subject evaluated his/her own application; each one evaluated an application developed by students from another group. All students signed a consent form. Figure 2 shows the process of applying the Userbility technique.

Fig. 2. Study’s execution phases

The process depicted in Fig. 2 was executed following the phases in Table 2.

Table 2. Description of the study phases

Table 3 shows the usability problems found in the applications and the suggested improvements indicated by the subjects to mitigate these problems. We classified the possible usability problems and the related improvements suggested. Moreover, it was possible to see suggestions related to the hedonic quality attributes of user experience. Some of these improvements were related to expectations and user experience of the subjects, such as: “One could add favorite rides” (Simbora app); “There could be an option to trace a route on the map.” (Grum app); “I expected more, since there should be an incentive from the system as it is a diet” (Personal Diet app); and “The use of words such as healthy is an attraction for the younger audience. But I do not know if there would be an acceptance from the general public” (Personal Diet app).

Table 3. Main problems and suggestions/improvements that the subjects made regarding the applications using the Userbility v 1.0 technique.

After applying the Userbility questionnaire, we conducted the collection activity, whose goal was a list of all the problems and suggestions without duplicates (a duplicate being a problem or suggestion identified by more than one inspector). A researcher with extensive experience in usability and user experience evaluations grouped the problems, and we removed the duplicates. One researcher then reviewed these problems and, finally, we conducted the discrimination activity with the participation of two researchers.
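The bookkeeping behind the collection activity can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: the grouping itself was a manual judgement made by a researcher, so the code assumes each report already carries a `group_key` assigned by hand. The function `consolidate`, the tuple layout and the sample reports (inspectors "X" and "Y") are hypothetical.

```python
from collections import defaultdict


def consolidate(reports):
    """Merge reports that were manually grouped under the same key.

    `reports` is a list of (inspector, group_key, text) tuples, where
    `group_key` is a hypothetical label assigned by the researcher who
    grouped equivalent problems or suggestions.
    Returns (unique_items, duplicate_count).
    """
    grouped = defaultdict(list)
    for inspector, group_key, text in reports:
        grouped[group_key].append((inspector, text))

    unique_items = []
    duplicate_count = 0
    for group_key, items in grouped.items():
        unique_items.append({
            "key": group_key,
            "text": items[0][1],  # keep the first wording as representative
            "inspectors": sorted({inspector for inspector, _ in items}),
        })
        # Every report beyond the first one in a group counts as a duplicate.
        duplicate_count += len(items) - 1
    return unique_items, duplicate_count


# Made-up reports for illustration only; they are not data from the study.
reports = [
    ("X", "no-feedback", "No feedback after the action"),
    ("Y", "no-feedback", "The app does not confirm the action"),
    ("X", "small-font", "Text is hard to read"),
]
unique_items, duplicate_count = consolidate(reports)
print(len(unique_items), duplicate_count)
```

Recording which inspectors reported each item keeps the duplicate count traceable back to individual forms.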

Table 4 shows the evaluation time and the number of duplicated usability problems, suggestions and improvements indicated for each application. Two subjects (P3 and P9) did not record the time spent on the inspection. Furthermore, only one subject evaluated the PartyNote application, so there are no duplicated problems (DP) or duplicated suggestions (DS) for this application.

Table 4. Inspection results: number of defects/suggestions and duplicates per application

Figure 3 shows the level of satisfaction of the subjects, on the scale set, for the applications. We determined the level of user satisfaction, item 4 in Fig. 1, for every aspect of the application (A1, A2 …) identified in Table 1. In the evaluation of the Simbora application, two subjects (P1 and P2) were satisfied with 9 out of 10 evaluated aspects. Considering the aspect A3, subject P1 was little satisfied. The most and least satisfactory aspects are shown for each application. This analysis of the level of satisfaction is interesting for researchers because it makes it possible to identify which aspects satisfy or dissatisfy the evaluators.

Fig. 3. The level of satisfaction of subjects per application
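One possible way to identify the most and least satisfactory aspects of an application is sketched below. The text does not state how these aspects were determined, so ranking by the mean of the five-point scores is an assumption made here for illustration, as are the 1 to 5 encoding, the function name `rank_aspects` and the sample ratings.

```python
from statistics import mean


def rank_aspects(ratings):
    """Rank usability aspects by mean satisfaction score.

    `ratings` maps an aspect id (e.g. "A1") to the list of scores
    (1 = unsatisfied .. 5 = extremely satisfied) given by the subjects.
    Returns (most_satisfactory, least_satisfactory, means).
    """
    means = {aspect: mean(scores) for aspect, scores in ratings.items() if scores}
    most = max(means, key=means.get)
    least = min(means, key=means.get)
    return most, least, means


# Made-up scores for illustration only; they are not data from the study.
ratings = {"A1": [4, 4], "A2": [5, 4], "A3": [2, 4]}
most, least, means = rank_aspects(ratings)
print(most, least)
```

Because the scale is ordinal, a median or a simple count per scale point could be used instead of the mean.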

In order to evaluate the ease of use and adoption of the Userbility technique, a post-inspection questionnaire was administered to the inspectors. This questionnaire had questions related to the ease of use, usefulness, and positive and negative points of the Userbility technique. For the analysis of the post-inspection questionnaire, we adopted procedures of the Grounded Theory method [8] to perform data coding. A researcher performed the coding procedure. Then, two experts reviewed the coding.

During the coding process, three categories and 25 codes emerged. The categories were: Benefits of the use of the technique (14 codes), Difficulties in understanding (4 codes) and Suggestions for improvements of the technique (7 codes). These categories are described as follows:

Benefits of the use of the technique: this category highlights the benefits perceived by subjects in the use of the technique. Some of the codes were: The Technique characterizes the usability of the application (“Easy characterization of usability features” - P6); It guides what aspects need to be inspected (“I would not notice things like menu navigation feedback” - P2); The technique is intuitive (“The technique is very intuitive” - P6); The questions helped in the evaluation (“The questions raised helped in this evaluation” - P4); The justification fields allow the evaluators to express themselves (“And the justification fields let us expressed ourselves in more detail” - P6). In this category, we perceived that the technique helps and guides inspectors during the evaluation. Also, inspectors pointed out that the questions and the fields were positive factors in the use of the technique.

Difficulties in understanding: this category highlights the difficulties in understanding that were perceived by the subjects in the use of the technique. Some of the codes were: The aspects of shortcuts are subjective (“Not the shortcuts, which is something very subjective” - P7); Some aspects of the technique are similar and confusing (“Some sections are similar: the system messages, communication with the user… It just confused me” - P2); It generates repetition of information (“I felt that once or twice I said the same thing. I gave the same answer” - P1); The item related to satisfaction is hard to understand (“Most of them are easy to understand, but the part of the technique that was difficult to understand was the part of symbols” - P8). In this category, we perceived that the technique caused confusion in some aspects that are similar and generated repetition of information. Also, the inspectors pointed out that the shortcuts and items related to satisfaction were difficult to understand. These points should be improved in the technique.

Suggestions for improvements of the technique: this category highlights improvement suggestions cited by the subjects in the use of the technique. Some of the codes were as follows: Making the technique less tiring, reducing the analysis (“Being less tiring” - P8); Allow users to comment freely (“One should be able to comment freely on other themes, letting the user make suggestions” - P7); The technique should provide multiple choice questions (“More options (multiple choice questions)” - P5); Prioritize the app context in the technique (“Directly approaching what is a priority in the context of an application” - P9). In this category, we perceived that the analysis required by the technique is tiring and that the subjects prefer more multiple-choice questions. Consequently, we included sub-items with yes/no questions, so that problems can be reported with less cognitive effort. The subjects also pointed out that the technique could be tailored to the app context, but we prefer to keep it generic so that it can be used to evaluate different types of mobile applications.

Finally, the qualitative analysis revealed some issues that needed to be reviewed in the technique. Taking into account the comments of the subjects, we also made improvements to the Userbility technique based on a questionnaire that supports the usability evaluation of mobile applications [12]. This questionnaire was chosen because, although it does not evaluate UX, it evaluates usability in mobile applications. The new version includes usability items, examples for the usability items, UX questions and an item related to the satisfaction of the evaluator.

After the empirical study, we identified that some of the usability problems had not been clearly indicated. In some circumstances, subjects did not know how to describe the problem, but gave suggestions and improvements (identified in Table 3). In these cases, the subjects did not indicate exactly where the problem was. Therefore, we decided to detail each aspect with sub-questions. These sub-questions were created to make the aspects easier to understand and to be more helpful in the process of identifying usability and UX problems in mobile applications.

We also propose to improve the usability and UX questions by adding new sub-questions related to specific aspects of mobile applications, based on the Checklist Heuristic Evaluation for Smartphones Applications [7]. This new version of Userbility was evaluated and reviewed by another researcher with extensive knowledge of usability and UX, who was not directly involved in this research.

Figure 4 shows part of the new version of the Userbility v 2.0 technique. The usability aspects are composed of items including an issue and an example to facilitate the understanding of the evaluator.

Fig. 4. Example of the Userbility technique v 2.0

5 Conclusion

This paper presents an empirical study that was conducted in order to evaluate the feasibility of using the Userbility technique with five applications under development. Userbility is a technique that helps designers to evaluate the usability and user experience of mobile applications. Applying the technique makes it possible to verify two different quality criteria: usability and user experience. The usability aspects of Userbility assist inspectors in describing usability problems and guide them to think about the experience that each of these aspects evoked. Although the technique is not designed specifically to identify usability problems, it leads the user to describe various problems, given that the usability aspects are based on the Heuristic Evaluation. The UX aspects of Userbility are expectations and experiences, i.e. what the subjects said they wished the application offered in order to improve the experience of the end users. However, we found that subjects focused more on pointing out the usability problems they encountered than on describing what they felt regarding each aspect.

Based on the results of this empirical study, it was possible to identify various usability problems in the applications and several improvements suggested by the study subjects. Furthermore, we observed that the Userbility v 1.0 technique found usability problems without specifying their location. The qualitative analysis also showed that various issues still needed to be reviewed in Userbility v 1.0. Based on this analysis, we proposed a new version of the technique, in which verification items have been added to every usability aspect in order to obtain more detailed results from the inspection. As future work, we intend to carry out a study with Userbility v 2.0 and to analyze its effectiveness and efficiency. We expect that Userbility v 2.0 will allow more specific usability problems to be identified, helping designers when they correct these problems. We also hope that the inspectors will report more of their experiences and feelings when using the application being evaluated. In this way, we intend to encourage the use of Userbility in industry for rapid assessments.