1 Introduction

Given the present saturation of the mobile software market, consumers frequently find themselves unable to decide which application to acquire, since many applications offer the very same functional features. They are likely to prefer the application that presents its functionality in the most usable and efficient manner [9].

Harty [8] discusses how many organizations do not perform any usability or accessibility testing. It is seen as too expensive, too specialized, or something to address only after all the "functionality" has been tested (functional testing is usually prioritized because of time and other resource constraints). For these organizations, good test automation can be of great benefit.

Usability testing tends to be time-consuming and hard to scale when it requires human observation of the people using the software being measured [8].

Although most usability methodologies (e.g. usability inspection, heuristic evaluation) are applicable to desktop as well as to mobile software, it is more difficult to achieve relevant results with conventional assessment methods in a mobile context. The reason is that emulating real-world use in a laboratory-based evaluation is only feasible for a precisely defined user context. Because of these physical restrictions, it is difficult to generalize from a quickly changing and possibly strongly varying user context [5].

The evaluation of usability in mobile software provides valuable measures of the quality of these applications, which assists designers and developers in identifying opportunities for improvement. But analyzing the usability of mobile user interfaces can be a laborious task. It can be extensive and require expert evaluation techniques such as cognitive walkthroughs or heuristic evaluations, not to mention often expensive usability lab equipment.

Relevant and recent work has been published concerning tools for low-cost, automated usability tests for mobile devices. Such tools have been reported to help small development teams make fairly accurate suggestions on user interface improvements [1]. However, these tools do not consider users' emotional feedback towards mobile software.

1.1 Emotions and Usability

Emotional feedback is a significant aspect of user experience that chronically goes unmeasured in many user-centered design projects [2]. Human emotions are indispensable to understanding users, as they can increase persistence and strengthen interest in a subject or task. Examining this affective aspect through well-known empirical user-centered design methods helps software creators engage and motivate users of their systems [3]. Gathering emotional cues provides an additional layer of analysis for collecting user data, augmenting common evaluation methods and resulting in a more accurate understanding of the user's experience.

1.2 Automated Tests and Unsupervised Field Evaluations

Furthermore, it is essential to highlight the importance of automated tests executed on and for mobile devices. In contrast to desktop applications or web sites, mobile applications have to compete with external stimuli, as users might not sit in front of a screen for considerable amounts of time [4]. By the very nature of mobility, users in a real-world context might well be walking down the street or sitting on a bus while interacting with mobile software. It is therefore imperative not to ignore the differences between such circumstances and desktop systems evaluated in isolated usability laboratories without distractions [5].

This view contrasts with Kaikkonen et al. [11] and is supported by Lettner et al. [1], who report that conducting unsupervised, automatic field studies yields more in-depth results than supervised field studies. The statement is also supported by Hertzum [10], who confirms that results differ between field tests and laboratory-based tests. Hertzum showed that conducting unsupervised field studies is inexpensive and does not require much preparation. Moreover, supervisors could influence a study by excluding interaction possibilities [1].

2 Related Work

Significant work has been published concerning automated software usability tests, specifically for mobile devices. Lettner et al. [1] approach the matter by implementing a framework for user interaction logging as a basis for usability evaluation on mobile devices. Their paper compares commercial frameworks for logging user statistics on mobile devices, such as Flurry, Google Analytics, Localytics and User-Metrix. These frameworks, however, focus on descriptive user statistics such as user growth, demographics and commercial metrics like in-app purchases. Such solutions approach the automation of usability tests, but ignore emotional feedback.

Some techniques and methodologies have been reported in significant publications for gathering affective data without asking users what and how they feel. Physiological and behavioral signals can be measured in a controlled environment with sensors such as body-worn accelerometers and rubber or fabric electrodes [6, 7]. It is also feasible to evaluate users' eye gaze and to collect electrophysiological signals: galvanic skin response, electrocardiography, electroencephalography and electromyography data, blood volume pulse, heart rate and respiration; facial expressions can even be detected by software [2]. Most of these methods, however, are intrusive, expensive, and require specific expertise and additional evaluation time.

2.1 UX Mate

UX Mate [12] is a non-invasive system for the automatic assessment of User eXperience (UX). Its authors also contribute a database of annotated and synchronized videos of interactive behavior and facial expressions. UX Mate is a modular system that tracks the facial expressions of users, interprets them based on pre-set rules, and generates predictions about the occurrence of a target emotional state, which can be linked to interaction events.

Although UX Mate provides automatic, non-invasive emotional assessment for interface usability evaluations, it does not consider mobile software contexts, which have been widely differentiated from desktop scenarios [5, 9, 11].

3 Contribution

Our proposal supplements traditional methods of mobile software usability evaluation by automatically monitoring users' spontaneous facial expressions in order to identify the moments at which adverse and positive emotional events occur. Identifying those events and systematically linking them to the context of interaction is a clear advance towards overcoming design flaws and enhancing an interface's strengths.

The automated test generates a graphical log report that records, over time, (a) the current application page, (b) user events, e.g. taps, (c) emotion levels, e.g. the level of happiness, and (d) emotional events, e.g. smiling or looking away from the screen. Gazing away from the screen [2] may be perceived as a sign of deception: looking down tends to convey a defeated attitude but can also reflect guilt, shame or submissiveness, while looking to the sides may denote that the user was easily distracted from the task.

This research has also produced a toolkit for automated emotion logging for mobile software that, in contrast to existing frameworks, is able to trace the emotional reactions of users during usability tests and relate them to the specific interaction being performed. The framework can be added to mobile applications with minor adjustments.

3.1 System Structure

The basic system structure is displayed in Fig. 1.

Fig. 1. System infrastructure

The running application uses the front camera to take a photo of the user every second. The image is converted to base64 format and sent via HTTP to the server. The server decodes the base64 data back into an image and runs the emotion recognition software, which returns the numerical levels of happiness, anger and surprise, plus smile (true/false) and gaze away (true/false) flags. This information is sent back to the phone via HTTP and written to a text file together with a set of other interaction information. When the user exits the application, the log file is sent to the server, which stores and classifies the test results in a database that can be browsed via a web front-end.
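
As an illustration of this round trip only, the server side of the exchange can be sketched in Python as follows; Flask, the /frame route and the recognize() stub are assumptions of the sketch, not the actual implementation:

    import base64
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def recognize(image_bytes):
        # Stand-in for the emotion recognition software (Sect. 3.3);
        # a real implementation would analyze the decoded image.
        return {"happiness": 0.0, "anger": 0.0, "surprise": 0.0,
                "smile": False, "gaze_away": False}

    @app.route("/frame", methods=["POST"])
    def frame():
        # The phone posts one base64-encoded front-camera frame per second.
        image_bytes = base64.b64decode(request.form["image"])
        # The emotion levels and flags travel back to the phone as JSON.
        return jsonify(recognize(image_bytes))

    if __name__ == "__main__":
        app.run()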

3.2 Interaction Information Logging

The applications to be tested are written using the library (.dll) we implemented. When the application is started by the user, a log file is created, recording the time, current page, level of happiness, level of anger, level of surprise, smile (0–1), gazing away (true or false) and tap/click (true or false). When tap is true, the position of the tap and the name of the tapped control object (e.g. button, list item, radio button, checkbox) are also logged.
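
The logging library itself is a .NET assembly; purely to illustrate the record layout described above, the following Python sketch writes one such row (the field names and sample values are hypothetical):

    import csv
    import datetime

    # Hypothetical field names mirroring the log record described above.
    FIELDS = ["time", "page", "happiness", "anger", "surprise",
              "smile", "gaze_away", "tap", "tap_x", "tap_y", "control"]

    def log_row(writer, page, emotions, tap=None):
        # One row per sample: timestamp, current page and emotion data;
        # when a tap occurred, its position and the tapped control are added.
        row = {"time": datetime.datetime.now().isoformat(),
               "page": page, "tap": tap is not None, **emotions}
        if tap is not None:
            row.update({"tap_x": tap["x"], "tap_y": tap["y"],
                        "control": tap["control"]})
        writer.writerow(row)

    with open("session.log", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        log_row(writer, "LoginPage",
                {"happiness": 0.8, "anger": 0.0, "surprise": 0.1,
                 "smile": 1, "gaze_away": False},
                tap={"x": 120, "y": 480, "control": "loginButton"})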

The generated log file is in comma-separated value format, enabling visualization in tables, as displayed in Tables 1 and 2.

Table 1. Generated log (part 1)
Table 2. Generated log (part 2)

3.3 Emotion Recognition Software

The emotion recognition software was developed using the well-documented Intel RealSense SDK [13]. Among many features, this software development kit allows face location and expression detection in images. Analyzing any particular image processing algorithm for detecting emotions is outside the scope of this paper.

3.4 Usability Information Visualization

The front-end web software displays one test session as in Fig. 2. It is meant to supplement traditional usability tools and methods.

Fig. 2. Emotions log automatically generated chart
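
Purely as an illustration, a chart like the one in Fig. 2 could be produced from the CSV log sketched in Sect. 3.2 with a few lines of Python; matplotlib and the file and column names are assumptions of the sketch, not the actual front-end:

    import csv
    import matplotlib.pyplot as plt

    # Read per-second emotion levels from the session log; file and
    # column names follow the sketch in Sect. 3.2, not the actual system.
    seconds, happiness, anger, surprise = [], [], [], []
    with open("session.log", newline="") as f:
        for i, row in enumerate(csv.DictReader(f)):
            seconds.append(i)  # one sample per second
            happiness.append(float(row["happiness"]))
            anger.append(float(row["anger"]))
            surprise.append(float(row["surprise"]))

    # Plot the three emotion levels on a shared timeline.
    plt.plot(seconds, happiness, label="happiness")
    plt.plot(seconds, anger, label="anger")
    plt.plot(seconds, surprise, label="surprise")
    plt.xlabel("time (s)")
    plt.ylabel("level")
    plt.legend()
    plt.savefig("session_chart.png")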

4 Experiments

In order to perform an early check of the system's functioning, we planned test sessions that would induce negative and positive emotions, not necessarily related to the interface design.

To gather negative feedback, we asked one male adult (32 years old) to log in to one of his social network accounts and post one line of text to his timeline. During this task, we turned the WLAN connection on and off at intervals of 30 s. After 5 min of being unable to execute a considerably simple task, the test subject was clearly upset. The emotional feedback logged by our system was in accordance with the test session.

To gather positive feedback, we asked one male adult (27 years old) to complete an online quiz with charades and funny answers. The emotional feedback logged by our system was again in accordance with the test session, as the user smiled and even laughed at the funny text and imagery.

Table 1 and Fig. 2 show an example of one test session we ran, in which the user was asked to log in to a communications application under development at a research institute.

5 Future Work and Discussions

This work presents an early approach to emotional feedback logging for mobile software usability evaluation. The problem space was outlined with reference to other usability automation research. Relevant related work was described and distinguished from the present proposal. A system was developed as a proof-of-concept of our hypothesis, and experiments were performed to raise discussion topics that may provoke advances on the matter.

Our system logs emotional feedback from users, using the front camera on mobile devices. It stands as a solution for automated mobile software usability evaluation.

The system's functional features were tested for negative and positive emotional feedback with test sessions that were planned, respectively, to fail and to succeed and provoke smiles.

Future work will investigate more in-depth uses of the logged interaction information. For example, our system does not yet detect and identify usability problems; it strictly logs emotional feedback and UI interactions, merging this information on a timeline to aid usability evaluation.