1 Introduction

While the early years of research in the field of human-technology interaction (HTI) revolved around terms like ‘efficiency’ and ‘usability of products’, a more holistic perspective on the interaction between humans and technical products has been established in the past two decades. Summarized under the term “User Experience” (UX), this perspective includes traditional factors, e.g. usability, functionality or task efficiency, as well as additional qualities emphasizing emotional components of the experience with technology [1]. The ISO 9241-210 on human-centered design defines UX as “all the users’ emotions, beliefs, preferences, perceptions, physical and psychological responses, behaviors and accomplishments that occur before, during and after use.” [2]. Hassenzahl narrows this general definition down, focusing on particular aspects of emotions and their incorporation into products and devices to promote positive experiences. He defines UX as a “momentary, primarily evaluative feeling (good-bad) while interacting with a product or service” [3]. He adds that a good UX is constituted by the fulfillment of psychological needs [4, 5]. Furthermore, he proposes a dichotomous model of UX which “assumes that people perceive interactive products along two different dimensions” [6]: hedonic and pragmatic qualities. Hedonic qualities emphasize the emotional and affective components of a product, such as the enhancement of a person’s psychological wellbeing [1, 3, 7]. Pragmatic qualities, on the other hand, describe instrumental, task-related qualities of a product, meaning that the product serves as a means to an end for the user to manipulate the environment [3, 6, 7].

To assess UX, researchers mainly make use of self-reported measures that rely on introspection, i.e. the ability of users to explicitly access and verbalize their recently gained experiences with a product. Although these methods can provide valuable insight into the user’s perception of a technical product, self-reported data can be distorted by biases. It is questionable whether users can indeed access their experiences through introspection as required by questionnaires, surveys and interviews. According to Nosek et al. [8], humans often lack the motivation, opportunities, abilities or even awareness to report their experiences. This appears to be especially true for hedonic product qualities, as the emotional state constantly changes over time. Emotions are of a more subjective nature and therefore more difficult to grasp than pragmatic components of a product. Even if users are capable of introspection, their answers can still be incorrect in the sense that they do not reflect the true experience [9]. Answers can be distorted either unintentionally or intentionally (faking), due to cognitive biases such as attribute substitution [10] and social desirability [11]. Self-reported measures could hence benefit from an extension with complementary methods to enable a deeper and more detailed understanding of the user’s experience.

Therefore, the present paper investigates the feasibility of using behavioral and neurophysiological implicit measurement tools, as they are commonly used in psychological and neuroscientific research, to obtain additional, valuable information about the UX while interacting with a software tool. On the one hand, this study serves as a feasibility study to examine the value and usefulness of the two approaches (and the combination of both) for UX assessment. On the other hand, the study also investigates whether implicit methods can be used to distinguish hedonic from pragmatic components of UX. The latter is especially relevant in order to draw conclusions about which aspects of a product need to be adjusted to arrive at an overall positive UX. While pragmatic qualities can be assessed with traditional usability methods, measuring hedonic qualities remains a challenge that UX researchers are still struggling with. The present study examines whether implicit methods can close this gap.

2 Implicit Methods for UX Assessment

Implicit methods were first suggested in different fields of psychology as a tool to infer participants’ implicit cognitive processes and attitudes from behavioral measures, without requiring any (self-assessment) effort. In implicit measures the construct is “inferred through a within-subject experimental design: [by] comparing behavioral performance between conditions (e.g. different primes)” [8].

The first prominent tool for implicit assessment was the Implicit Association Test (IAT) by Greenwald et al. [12]. The IAT is based on the assumption that memory content is organized as associative networks in the human brain [13], and that the activation of a certain piece of information automatically results in the activation of associated information within the network. The test requires participants to assign two different types of target stimuli (e.g. adjectives and pictures of animals) to categories (e.g. cute/scary, cat/spider) by pressing two different buttons. It is assumed that participants respond faster if two concepts that are strongly associated with each other (e.g. scary and spider) are assigned to the same button. The IAT has widely been used to uncover racist tendencies that manifest themselves in fast reaction times (RTs), i.e. fast button presses, for the pairings positive adjectives/white people and negative adjectives/black people (as compared to positive/black and negative/white) [12]. A first attempt to apply the IAT in HTI was made by Devezas and Giesteira [14], who compared implicit and explicit measures. They assessed the aesthetic judgment (valence and self-identification) of eight participants for pictures of different interfaces by using the Picture Implicit Association Test (P-IAT). Two bipolar scales were used as explicit measures for valence and self-identification. The authors reported a “medium” correlation (r = .42, p > 0.05) between implicit and explicit measures and suggest that the IAT can be used as “a complementary or substitutive method for self-report measures” [14] (p. 15).

Another implicit method is the Affect Misattribution Procedure (AMP). In this test participants are briefly (≈200 ms) presented with emotionally loaded prime pictures followed by pictures of Chinese characters [15]. Their task is to rate the Chinese character – and not the picture – as positive or negative via a button press. Chinese characters are assumed to carry no special (emotional) meaning for participants who are not native speakers of Chinese. Studies have shown that a participant’s attitude towards the preceding prime can be inferred from their response to the Chinese character [16]. The AMP has been applied in an HTI context once to assess participants’ implicit attitudes towards robots [9]. In this study, participants were presented with videos of moving or static robots. Afterwards, the participants completed a questionnaire as explicit measure and the AMP as implicit measure. The study revealed a negative tendency towards a certain type of robot, which did not show in the self-reports.

Schmettow et al. [17] investigated participants’ implicit associations and attitudes towards technical devices, such as computers or tablets, with the Stroop priming task: Participants had to react to colored words of three categories (hedonic, utilitarian and geekism) after seeing a picture of a technical device. Afterwards, their need-for-cognition level was compared with the latencies retrieved from the task. The study’s main aim was to examine the suitability of the Stroop priming task to extend current methods. The authors concluded from the results that “implicit […] methods […] may serve to better understand the nature of rating scales in HCI, and give more direct access to users’ spontaneous associations and affects” [17].

Lastly, implicit attitudes can be assessed by the Approach-Avoidance Task (AAT) by Rinck and Becker [18]. The task requires participants to respond to a certain picture format (landscape or portrait) by pulling or pushing a joystick. The pictures used for this task are emotionally loaded, i.e. positively or negatively valenced [19, 20]. The test is based on the assumption that pulling (arm flexion) is linked to a positive interpretation of the stimulus (approach tendency), while pushing (arm extension) is related to a negative interpretation of a stimulus (avoidance tendency). Reaction times should hence be shorter for compatible trials (pull positive picture/push negative picture) than for incompatible trials (pull negative/push positive). While it has been shown that the AAT is suitable to assess social anxiety, phobias or alcohol disorders [18, 19], to our knowledge it has never been applied in HTI.

Neurophysiological methods provide another interesting approach to assess implicit information about the user’s evaluation of a technical product through monitoring of cognitive and affective processes in the brain. In medical contexts, electroencephalography (EEG) has long been used for this purpose. The method also shows high application potential in HTI research, as EEG devices are portable and allow the user to adopt a rather comfortable position. EEG assesses the temporal and spatial characteristics of brain activity by monitoring electromagnetic processes, i.e. the synchronization processes between populations of thousands of neurons at the cortical surface of the brain [21]. The time course of brain processes can be determined on a millisecond scale, which allows us to study the precise timing of emotional or affective user reactions in the form of event-related potentials (ERPs) [21]. The ERP reflects a stereotypical electrophysiological response to a given stimulus, characterized by the latency and amplitude of the recorded EEG signal. Thus, with the help of ERPs, the different stages (or components) of emotional stimulus processing and perception can be objectively studied. Generally, the early latency components of the ERPs (N100, P100, N200, i.e. peaks around 100 and 200 ms after stimulus onset) indicate processes that are involved in the initial perception and automatic evaluation of emotional stimuli. Later ERP components, i.e. later than 300 ms after stimulus presentation, are supposed to reflect higher-level cognitive processes and conscious evaluation of the stimulus. Potential changes around the P300 component are modulated by stimulus-response compatibility [22]. Hence they might play an important role in the evaluation processes taking place in the brain during the behavioral implicit tasks described above.

3 Methods and Materials

Until now, neither of the two implicit measurement approaches, behavioral and neurophysiological, has specifically been used to assess UX in HTI contexts. Our study combines the AAT with EEG recordings to evaluate pragmatic and hedonic qualities of a technical product. This requires dichotomous stimulus material that stimulates the participant’s experience of pragmatic and hedonic qualities. We decided to use a simple software tool as the basis of the stimulus material, as the evaluation of software is the most common use case for UX evaluations. Two versions of the software tool were developed – one that only provides pragmatic qualities and one that, on top, provides hedonic UX. As proposed by Hassenzahl [3], hedonic qualities presuppose the existence of pragmatic qualities to evolve. It is hence important that the two versions of the tool are rated equally on pragmatic qualities, but differ in their ratings of hedonic qualities.

As all behavioral implicit tests are based on a prompt response of the participant to a stimulus, it is impossible to carry out these tests with film recordings or even free interaction tasks. Although UX is certainly created through the interaction process and clearly requires some active engagement of the user over a period of time, none of the studies described above feature any interaction with a product. We therefore decided to take an approach similar to that of Strasser and colleagues [9], but included a prior interaction period instead of a video presentation. Participants interact with the two software tools first. Then snapshots from the interaction are used for the implicit test, which is carried out immediately after the interaction. Participants’ implicit affective reactions to the snapshots of the pragmatic and hedonic software tools, respectively, are recorded through the behavioral response (RTs) and the accompanying brain activity (ERPs). It is expected that participants show stronger approach tendencies to the emotionally loaded hedonic software tool than to the solely pragmatic version. Avoidance tendencies are presumed to be of similar magnitude for both versions, as they provide the same solid level of pragmatic qualities. We presume that the stronger approach tendencies for snapshots taken from the hedonic prototype are also reflected in the event-related electrical potentials accompanying the behavioral response and that, on this basis, implicit reactions to hedonic and pragmatic product qualities can be distinguished on a neuronal level.

3.1 Stimulus Material for Pragmatic and Hedonic UX

The stimulus material was based on the work by Sonnleitner and colleagues [23], who designed different versions of an ideation software to address different human needs, thus increasing the perceived positive UX. While the brainstorming of new ideas is a common task in work environments, the proposed ideation software has some shortcomings. It can be assumed that every user has an individual profile of unfulfilled needs in a certain context and that this needs profile consists of several needs rather than one. A technical product which addresses a single need can thus not fully meet the needs of the user. As the focus of the present study lies on the dissociation of pragmatic and hedonic qualities, the original ideation software was redesigned and two prototypes were developed: one that highlights only the pragmatic qualities of the software, and a second one that emphasizes the hedonic qualities.

Initial Concept for Pragmatic and Hedonic Prototypes of an Ideation Software.

The two prototypes are note applications that enable and support people in the generation of ideas. The tools provide them with a predetermined set of topics, one after another on individual screens, and guide them through the ideation process like a wizard. On each screen the user is provided with the means to collect up to nine ideas. The prototypes were realized with Axure RP 8 [24], a prototyping tool for web applications.

The pragmatic prototype was created with a clean and clutter-free outer appearance using greyscale color, following the design of an unobtrusive graph paper. The user can write down ideas in designated text boxes and continue by pressing the “next topic”-button. It thus supports an efficient goal achievement, thereby creating good usability estimates, but no other (positive) emotions and experiences.

The hedonic prototype can be described as a variant of the pragmatic prototype that includes additional design elements to promote positive emotions and experiences. To arrive at these design elements, the experience categories by Zeiner et al. [25] were used. In their study, they examined the emergence of positive experience in work contexts through experience interviews. The resulting experience reports were grouped into experience categories that describe underlying themes of positive work experiences, e.g. receiving feedback, exchanging ideas or finishing a task. Table 1 describes the design elements included in the hedonic prototype and the related experience categories.

Table 1. Design elements included in the hedonic prototype and related experience categories.

Iterative Development of Pragmatic and Hedonic Prototypes.

In order to ensure that the manipulation of hedonic and pragmatic qualities was successful, users and UX experts were involved in the iterative development of the prototypes. Five user tests (3 females, \( M_{Age} = 23.8 \), \( SD = 1.64 \)) were performed to evaluate the pragmatic and hedonic qualities of the prototypes, which were iteratively changed and improved. Afterwards, four UX experts (1 female; \( M_{Age} = 32.8 \), \( SD = 2.99 \)) assessed the final versions of both prototypes and their feedback was used to apply the last modifications.

Both users and experts completed the process of idea generation with the two prototypes and were asked to assess the pragmatic and hedonic qualities of each prototype individually on a quantitative and qualitative level.

Quantitative feedback was provided by filling in two questionnaires after the interaction with each prototype: the User Experience Questionnaire (UEQ) by Laugwitz et al. [26] and the meCUE questionnaire by Minge and Riedel [27]. The UEQ assesses pragmatic and hedonic components of UX through 26 items on a 7-point Likert scale, grouping them into the six factors attractiveness, perspicuity, efficiency, dependability, stimulation, and novelty [26]. In addition to the UEQ, two modules of the meCUE were added to assess positive and negative emotional experience during the interaction in more detail. To collect qualitative feedback, the users were instructed to describe their actions and their experience during the interaction with the prototypes using the think-aloud method. Experts completed their review on their own and provided their qualitative feedback through a review report. Moreover, both users and UX experts were asked to map the design elements of each prototype to the experience categories. Based on the results of the user tests and expert reviews, the initial versions of the two prototypes were iteratively improved (e.g. reduced topics and interaction duration, adjusted design elements) to arrive at the final stimulus material design.

Selected screenshots of the final versions of the two prototypes are presented in Figs. 1 and 2. In total, 64 snapshots were taken from the interaction with each prototype to serve as stimuli for the AAT.

Fig. 1.

Screenshots of the pragmatic prototype (from top left to bottom right): Welcome screen and field for entering your name, topic introduction, note pad for ideation, final “thank you”-message.

Fig. 2.

Screenshots of the hedonic prototype (top left to bottom right): Welcome screen including entering your name and selecting favorite color, personal “welcome”-message, note pad for ideation, feedback pop-up with personalized praise. (Color figure online)

3.2 Measurement Tools

Behavioral Implicit Measures.

To implicitly assess the user’s reaction to the two prototypes on a behavioral level, the AAT was used [18]. The task requires the participants to respond to a certain picture format, either landscape or portrait, by pulling or pushing a joystick. According to Laugwitz and colleagues [19] as well as Wiers et al. [20], every stimulus of this test carries a valence ranging from positive to negative. This notion makes the AAT most suitable for the goal of our study to differentiate between pragmatic and hedonic product qualities. As mentioned earlier, the pragmatic prototype could neither be regarded as negatively valenced stimulus material, nor as positively valenced as the hedonic prototype. The AAT is the only implicit psychological test which allows for a differentiation on a valence scale rather than a dichotomous distinction between positive and negative. Rinck and Becker [18] have shown good Spearman-Brown reliability estimates (r = .71) for the AAT. The defined time windows of the trial structure are very suitable for simultaneous event-related EEG recordings.

Two sets of 64 snapshots taken from the interaction with the pragmatic and hedonic prototype, respectively, were used as picture stimuli for the AAT. Each set contained 32 different pictures in both landscape and portrait format. A pull movement with the joystick increased the size of the snapshot and a push movement decreased the picture size. Each picture remained visible until the joystick was moved to its maximum position, then it vanished.

Each participant completed two runs of the AAT with this set of 128 pictures in total. In one run participants were instructed to pull portrait and push landscape pictures. In the second run they had to push portrait and pull landscape pictures. Thus, as shown in Table 2, congruent and incongruent trials were created for the pragmatic and hedonic prototype. Congruent trials are all trials where participants had to approach/pull the strongly positively valenced snapshots taken from the hedonic prototype or avoid/push the less positively valenced pictures from the pragmatic prototype. Incongruent trials are those where participants had to avoid/push hedonic prototype pictures or approach/pull pragmatic prototype pictures. The two runs were counterbalanced across participants.
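The assignment of trials to the congruent and incongruent conditions can be written down as a small lookup. The following is a minimal sketch only; the function names and the string labels (`"pull_portrait"`, `"pull_landscape"`, `"hedonic"`, `"pragmatic"`) are illustrative assumptions and are not taken from the experiment software.

```python
def required_movement(run_instruction: str, picture_format: str) -> str:
    """Return the joystick movement a picture format calls for in a given run.

    run_instruction (hypothetical labels):
      "pull_portrait"  -> pull portrait pictures, push landscape pictures
      "pull_landscape" -> push portrait pictures, pull landscape pictures
    """
    if run_instruction not in ("pull_portrait", "pull_landscape"):
        raise ValueError(f"unknown run instruction: {run_instruction}")
    if picture_format not in ("portrait", "landscape"):
        raise ValueError(f"unknown picture format: {picture_format}")
    pulled_format = "portrait" if run_instruction == "pull_portrait" else "landscape"
    return "pull" if picture_format == pulled_format else "push"


def classify_trial(prototype: str, movement: str) -> str:
    """Congruent: pull hedonic or push pragmatic pictures; incongruent otherwise."""
    if prototype not in ("hedonic", "pragmatic"):
        raise ValueError(f"unknown prototype: {prototype}")
    if movement not in ("pull", "push"):
        raise ValueError(f"unknown movement: {movement}")
    congruent = (prototype == "hedonic") == (movement == "pull")
    return "congruent" if congruent else "incongruent"


# Example: a hedonic snapshot shown in landscape format during the
# "pull portrait / push landscape" run has to be pushed away.
movement = required_movement("pull_portrait", "landscape")
print(movement, classify_trial("hedonic", movement))
```

Because picture format, not picture content, determines the required movement, every snapshot contributes to both conditions across the two runs.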

Table 2. Overview of AAT trials.

Neurophysiological Implicit Measures.

Scalp EEG potentials were recorded (BrainAmp, [28]) from 32 positions, with Ag/AgCl electrodes (actiCAP, Brainproducts GmbH, Germany) from: Fp1, Fp2, F7, F3, Fz, F4, F8, FC5, FC1, FCz, FC2, FC6, T7, C3, Cz, C4, T8, CP5, CP1, CP2, CP6, TP10, P7, P3, Pz, P4, P8, PO9, O1, Oz, O2, PO10. The left mastoid was used as common reference and EEG was grounded to Cz. All impedances were kept below 20 kΩ at the onset of each session. EEG data was digitized at 1 kHz, high-pass filtered with a time constant of 10 s and stored for off-line data analysis by using the Brain Vision Recorder Software [28]. All EEG data analysis was performed with custom written or adapted scripts in MATLAB®.
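For orientation, a high-pass time constant can be translated into a cutoff frequency. The sketch below assumes a first-order (RC-type) high-pass, for which the cutoff is 1/(2πτ); whether the amplifier's filter is exactly of this type is an assumption, and the function name is illustrative.

```python
import math


def cutoff_from_time_constant(tau_seconds: float) -> float:
    """Cutoff frequency (Hz) of a first-order high-pass with time constant tau."""
    if tau_seconds <= 0:
        raise ValueError("time constant must be positive")
    return 1.0 / (2.0 * math.pi * tau_seconds)


# A time constant of 10 s corresponds to a cutoff of roughly 0.016 Hz.
cutoff_hz = cutoff_from_time_constant(10.0)
print(f"{cutoff_hz:.4f} Hz")
```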

Explicit Measures.

Besides the implicit methods, two explicit measures were used, mainly to conduct a manipulation check for the two prototypes. We made use of a combination of the meCUE [27] and UEQ [26], as the two questionnaires had already been shown to yield good results in the user tests during the development of the prototypes.

3.3 Experimental Design

The experiment was conducted as a within-subject design featuring prototype as independent variable with two levels (hedonic and pragmatic) and behavioral implicit evaluation, neuroscientific implicit evaluation, and explicit evaluation as dependent variables. The order of the two prototypes and the order of the two AAT runs were counterbalanced across participants to control for order effects.

3.4 Participants

In total, 43 participants were recruited via the participant database of the Fraunhofer Institute for Industrial Engineering IAO in Stuttgart, Germany, as well as via Facebook and flyers. As EEG data was only recorded for 12 participants, we only include the data of those participants in the present paper, in order to guarantee the comparability between behavioral and neurophysiological implicit measures. Participants were between 20 and 29 years old (\( M_{Age} = 24.42\,{\text{years}} \), \( SD_{Age} = 2.93\,{\text{years}} \)). Participants received a compensation of 15€ for their participation.

3.5 Set-Up and Procedure

Participants had to complete three parts of the experiment: the interaction, the explicit UX evaluation and the AAT (see Fig. 3). At the beginning of the session, participants were briefed about the setup of the EEG and the tasks. After giving their informed consent, participants received the instructions for the 15-min interactions with the prototypes. After each interaction, the participants filled in the UEQ and meCUE questionnaires. Subsequently, they completed the two runs of the AAT. At the end of the study the participants were debriefed about the construct that was actually assessed, i.e. their implicit attitudes towards the prototypes.

Fig. 3.

Experimental procedure.

4 Results

4.1 Subjective Evaluations

For the manipulation check, data from both questionnaires were combined. For the analysis of the complete data set an exploratory factor analysis was conducted in order to obtain clusters of variables among the two merged questionnaires. A principal component analysis was performed with a Direct Oblimin rotation and Kaiser normalization. Factors with eigenvalues equal to or greater than 1 were extracted, which yielded six factors accounting for the 32 variables of both questionnaires. The majority of the questionnaire items loaded on the first two factors, which reflect hedonic qualities and pragmatic qualities, as can be deduced from the content of the items they incorporate. Thus, the items were grouped according to these two factors for all further analyses. For the manipulation check the questionnaire ratings on the two defined factors were compared between the pragmatic and hedonic prototype using a Wilcoxon signed-rank test for dependent samples. For all statistical analyses the confidence level was set to 95%.

The statistical analysis revealed that for both prototypes pragmatic qualities were rated significantly higher than hedonic qualities (hedonic prototype: p = .28, pragmatic prototype: p = .003; Fig. 4). It can be noted that pragmatic quality ratings were generally rather high, which shows that pragmatic aspects of the prototypes were perceived as intended. While for both prototypes pragmatic qualities were rated on a similar level (p = .609), the hedonic prototype received significantly higher ratings for hedonic qualities (p = .008).

Fig. 4.

Comparison between the medians of participants’ rating for hedonic and pragmatic qualities of the hedonic and pragmatic prototypes. Statistically significant differences are marked with a *.

4.2 Behavioral Data

Reaction times (RTs) were recorded for congruent and incongruent trials for each picture shown in the AAT for the pragmatic and hedonic prototype. RTs shorter than 200 ms (i.e. the minimum time humans need to perceive a stimulus and execute the appropriate action) and longer than 1500 ms (very slow and not intuitive reactions) were removed from the data sets as outliers. We also excluded those trials from the analysis in which participants responded incorrectly.
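The trial exclusion rules can be sketched as a simple filter. This is an illustrative sketch; the `Trial` structure and its field names are assumptions, and only the two RT bounds and the correctness criterion come from the text above.

```python
from dataclasses import dataclass
from typing import List

RT_MIN_MS = 200.0   # shorter RTs are treated as anticipatory responses
RT_MAX_MS = 1500.0  # longer RTs are treated as non-intuitive responses


@dataclass
class Trial:
    rt_ms: float   # reaction time in milliseconds
    correct: bool  # True if the required joystick movement was made


def keep_trial(trial: Trial) -> bool:
    """Keep correct trials whose RT lies within [200 ms, 1500 ms]."""
    return trial.correct and RT_MIN_MS <= trial.rt_ms <= RT_MAX_MS


def clean_trials(trials: List[Trial]) -> List[Trial]:
    """Remove RT outliers and incorrect responses."""
    return [trial for trial in trials if keep_trial(trial)]


trials = [
    Trial(rt_ms=150.0, correct=True),   # too fast, removed
    Trial(rt_ms=640.0, correct=True),   # kept
    Trial(rt_ms=700.0, correct=False),  # incorrect response, removed
    Trial(rt_ms=1800.0, correct=True),  # too slow, removed
]
print(len(clean_trials(trials)))
```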

For the data analysis, we grouped all trials of each of the four conditions (Table 2) together and calculated the mean RTs. Based on the means, the AAT difference scores were calculated for the pragmatic and hedonic prototype separately. The AAT difference score is calculated by subtracting the mean RT of push-trials from the mean RT of pull-trials. If participants did indeed experience the hedonic prototype as positive, the difference score should be negative, as pull-trials (congruent/approach) are carried out more naturally and faster than push-trials (incongruent/avoidance). As no specific approach or avoidance behavior is expected for pragmatic prototype pictures, it can be assumed that participants should be equally fast for congruent (push) and incongruent (pull) trials.
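As a worked illustration of the difference score, the sketch below computes mean pull RT minus mean push RT for one prototype. The function name and the example RTs are made up for illustration and are not data from the study.

```python
from statistics import mean
from typing import Sequence


def aat_difference_score(pull_rts_ms: Sequence[float],
                         push_rts_ms: Sequence[float]) -> float:
    """AAT difference score: mean pull RT minus mean push RT (in ms).

    A negative score means pulling was faster than pushing, which is
    read as an approach tendency towards the pictures.
    """
    if not pull_rts_ms or not push_rts_ms:
        raise ValueError("both conditions need at least one trial")
    return mean(pull_rts_ms) - mean(push_rts_ms)


# Hypothetical RTs of one participant for one prototype.
pull = [600.0, 620.0, 640.0]  # mean 620 ms
push = [650.0, 670.0, 690.0]  # mean 670 ms
print(aat_difference_score(pull, push))  # -50.0, i.e. an approach tendency
```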

The difference scores calculated for the hedonic and pragmatic prototype pictures were compared using a Wilcoxon signed-rank test for dependent samples. The statistical analysis did not reveal any significant difference (\( Mdn_{hedonic} = 45.5 \), \( Mdn_{pragmatic} = 0.5 \); p = .41). A slightly smaller variance of the difference scores could be observed for pragmatic prototype pictures (compare Fig. 5).
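To make the paired comparison concrete, the following sketch implements a Wilcoxon signed-rank test with an exact two-sided p-value obtained by enumerating all sign assignments, which is feasible for small samples such as 12 participants. It is an illustration only: the text does not state which variant of the test (exact or normal approximation) was used, dropping zero differences is an assumption, and the example values are made up.

```python
from itertools import product
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def wilcoxon_signed_rank(x: Sequence[float],
                         y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (W+, W-, exact two-sided p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # assumption: drop zeros
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    observed = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        positive_sum = sum(r for r, s in zip(ranks, signs) if s)
        if min(positive_sum, total - positive_sum) <= observed:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** len(ranks)


# Hypothetical difference scores of four participants.
hedonic_scores = [10.0, 20.0, 30.0, 40.0]
pragmatic_scores = [0.0, 0.0, 0.0, 0.0]
print(wilcoxon_signed_rank(hedonic_scores, pragmatic_scores))
```

With only four pairs, even a perfectly consistent direction of differences cannot reach p < .05 in the exact test, which illustrates how strongly small samples limit the power of this comparison.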

Fig. 5.

Boxplot of the AAT difference scores for the hedonic and pragmatic prototypes.

4.3 EEG Data

The ERP analysis investigates the brain processes underlying approach and avoidance tendencies.

EEG Pre-processing.

Similar to the AAT analysis, in a first step all congruent and incongruent trials were grouped together for both prototypes. In total, we arrived at four groups of EEG signals. The EEG signals were detrended, zero-padded and re-referenced to mathematically linked mastoids [21]. For the ERP analysis only the time period around the presentation of the stimulus (in our case the picture) is of relevance. Therefore, all trials were cut into several time windows and only the time epochs ranging from −200 ms to 0 ms and from 0 ms to 1000 ms were further analyzed. The 200 ms before the onset of the stimulus provide a baseline which shows the participant’s brain activity in a state of rest. The 1000 ms after the stimulus onset is the time window that contains the emotional reaction towards the picture presentation.
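The epoching step can be sketched for a single channel as follows. The sketch assumes the 1 kHz sampling rate reported for the recording, so that one sample corresponds to one millisecond; the function names and the handling of epochs that run past the recording boundaries are illustrative assumptions.

```python
from typing import List, Optional, Sequence, Tuple

SAMPLING_RATE_HZ = 1000  # as reported for the EEG recording
BASELINE_MS = 200        # window before stimulus onset
RESPONSE_MS = 1000       # window after stimulus onset


def ms_to_samples(duration_ms: float, rate_hz: int = SAMPLING_RATE_HZ) -> int:
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(duration_ms * rate_hz / 1000))


def cut_epoch(signal: Sequence[float],
              onset_sample: int) -> Optional[Tuple[List[float], List[float]]]:
    """Split one trial into baseline (-200..0 ms) and response (0..1000 ms).

    Returns None if the epoch would exceed the recording (assumed handling).
    """
    start = onset_sample - ms_to_samples(BASELINE_MS)
    stop = onset_sample + ms_to_samples(RESPONSE_MS)
    if start < 0 or stop > len(signal):
        return None
    return list(signal[start:onset_sample]), list(signal[onset_sample:stop])


# Toy signal whose value equals its sample index, stimulus onset at sample 500.
toy_signal = [float(i) for i in range(3000)]
baseline, response = cut_epoch(toy_signal, onset_sample=500)
print(len(baseline), len(response))
```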

Before investigating statistical differences in the data, several pre-processing steps were taken to remove artifacts caused by environmental noise or other physiological processes. To do so, we first band-pass filtered the EEG signals between 0.5 and 22 Hz, using a first order zero-phase lag FIR filter. We removed epochs from further analysis when they contained a maximum deviation above 200 µV in any of the frontal EEG channels (Fp1 and Fp2). For the remaining epochs we further performed an independent component analysis (ICA) using the logistic infomax algorithm as implemented in the EEGlab toolbox [29]. We removed cardiac, ocular movement and muscular artifacts based on visual inspection of the topography, time course and power spectral intensity of the ICA components [30].
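The amplitude-based epoch rejection can be sketched as below. The text does not specify whether the "maximum deviation" refers to the absolute amplitude or to a peak-to-peak measure; the sketch assumes the maximum absolute amplitude, and the data layout (one dictionary per epoch, keyed by channel name) is an illustrative assumption.

```python
from typing import List, Mapping, Sequence

REJECTION_THRESHOLD_UV = 200.0
FRONTAL_CHANNELS = ("Fp1", "Fp2")

Epoch = Mapping[str, Sequence[float]]  # channel name -> samples in microvolts


def is_artifact_free(epoch: Epoch) -> bool:
    """True if no frontal channel exceeds the threshold in absolute amplitude."""
    for channel in FRONTAL_CHANNELS:
        peak = max(abs(sample) for sample in epoch[channel])
        if peak > REJECTION_THRESHOLD_UV:
            return False
    return True


def reject_artifacts(epochs: Sequence[Epoch]) -> List[Epoch]:
    """Keep only epochs that pass the frontal amplitude criterion."""
    return [epoch for epoch in epochs if is_artifact_free(epoch)]


clean_epoch = {"Fp1": [10.0, -20.0, 15.0], "Fp2": [5.0, 12.0, -8.0]}
blink_epoch = {"Fp1": [10.0, -250.0, 40.0], "Fp2": [5.0, 12.0, -8.0]}
print(len(reject_artifacts([clean_epoch, blink_epoch])))
```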

Estimation of ERP Components.

To study the dynamic changes of the ERP, artifact-free trials were baseline-corrected by subtracting the mean amplitude during the baseline interval (−200 ms to 0 ms). The result is a signal that only represents the emotional reaction of interest and not the general brain activity as present in the baseline. For each participant we then took all the trials in each of the four groups and calculated the average over all trials. The average provides us with a signal representing the time course of the electrical activity in the 0 ms to 1000 ms time window at every electrode position on the head (Fig. 6). ERP analysis takes a closer look at the characteristic peaks of these averaged signals, also referred to as components. It is of special interest to examine the time points and locations at which the peaks occur in order to draw conclusions about the underlying cognitive and emotional processes.
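Baseline correction and trial averaging can be sketched for a single channel as follows. This is a minimal illustration with made-up numbers; the function names are assumptions.

```python
from statistics import mean
from typing import List, Sequence


def baseline_correct(baseline: Sequence[float],
                     response: Sequence[float]) -> List[float]:
    """Subtract the mean baseline amplitude from every post-stimulus sample."""
    offset = mean(baseline)
    return [sample - offset for sample in response]


def average_trials(trials: Sequence[Sequence[float]]) -> List[float]:
    """Average equally long trials sample by sample to obtain the ERP."""
    if not trials:
        raise ValueError("at least one trial is required")
    if len({len(trial) for trial in trials}) != 1:
        raise ValueError("all trials must have the same length")
    return [sum(samples) / len(trials) for samples in zip(*trials)]


# Two toy trials with different baseline offsets but the same response shape.
trial_a = baseline_correct(baseline=[1.0, 1.0, 1.0], response=[2.0, 3.0, 4.0])
trial_b = baseline_correct(baseline=[5.0, 5.0, 5.0], response=[6.0, 9.0, 8.0])
print(average_trials([trial_a, trial_b]))
```

Averaging attenuates activity that is not time-locked to the stimulus, which is why the components only become visible after many trials have been combined.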

Fig. 6.

Temporal dynamics of event-related potentials (ERPs) following the presentation of hedonic (A) and pragmatic prototype pictures (B). The plots show the grand-averaged waveforms (averages across all participants) of ERPs visualized as butterfly plots. Every line represents a single EEG channel and the dashed black line indicates the beginning of the picture presentation.

To investigate which time windows of interest (TOIs) are most likely to reveal differences between approach and avoidance tendencies for the hedonic and pragmatic prototype, we computed the signed r²-value (difference score) as implemented in the Berlin Brain-Computer Interface (BBCI) toolbox [31]. The difference score was calculated as an average over all participants for each EEG channel and time point for both prototype picture categories (Figs. 7A and 8A).
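One common way to define the signed r²-value is as the squared point-biserial correlation between the condition label and the amplitude, with the sign of the correlation retained. The sketch below follows this definition for a single channel and time point; it is not the BBCI toolbox code, and the sign convention (positive when the first condition has the larger mean) is an assumption.

```python
from math import copysign, sqrt
from statistics import mean, pstdev
from typing import Sequence


def signed_r_squared(condition_a: Sequence[float],
                     condition_b: Sequence[float]) -> float:
    """Signed squared point-biserial correlation between two sets of amplitudes.

    Positive if condition_a has the larger mean (assumed sign convention).
    """
    n_a, n_b = len(condition_a), len(condition_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both conditions need at least one trial")
    pooled_sd = pstdev(list(condition_a) + list(condition_b))
    if pooled_sd == 0:
        return 0.0  # no variance at all, hence no separability
    r = (sqrt(n_a * n_b) / (n_a + n_b)
         * (mean(condition_a) - mean(condition_b)) / pooled_sd)
    return copysign(r * r, r)


# Amplitudes (µV) at one channel and time point, made up for illustration.
congruent = [2.0, 3.0, 4.0, 3.0]
incongruent = [1.0, 2.0, 1.0, 2.0]
print(signed_r_squared(congruent, incongruent))
```

Values close to +1 or −1 indicate that the two conditions are almost perfectly separable at that channel and time point, while values near 0 indicate overlapping amplitude distributions.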

Fig. 7.

Spatio-temporal dynamics of event-related potentials (ERPs) during the hedonic picture presentation: (A) The plot shows the spatial distribution of strongest separability between congruent and incongruent trials during the hedonic picture presentation. The 2-D graph represents the grand-averaged signed r²-values analyzed for every time point (abscissa) for all EEG channels (ordinate). The vertical black line represents the beginning of the picture presentation. The horizontal color bar indicates the signed r²-difference between the congruent and incongruent trials. Note that the strongest differences between congruent and incongruent trials were present at two time intervals: 270–400 ms and 410–600 ms after picture presentation. (B) The plots represent the t-value topographies of the differences between congruent and incongruent trials for the first time interval [270–400 ms] and second time interval [410–600 ms]. Electrode clusters showing significant differences in the non-parametric randomization test are indicated by filled black circles. Red color represents an increase in negativity while blue represents a decrease in negativity. (Color figure online)

Fig. 8.

Spatio-temporal dynamics of event-related potentials (ERPs) during the pragmatic picture presentation: (A) The plot shows the spatial distribution of strongest separability between congruent and incongruent trials during the pragmatic picture presentation. The 2-D graph represents the grand-averaged signed r²-values analyzed for every time point (abscissa) for all EEG channels (ordinate). The vertical black line represents the beginning of the picture presentation. The horizontal color bar indicates the signed r²-difference between the congruent and incongruent trials. Note that the strongest differences between congruent and incongruent trials were present in two time intervals: 300–390 ms and 410–490 ms after picture presentation. (B) The plots show the t-value topographies for the first time interval [300–390 ms] and the second time interval [410–490 ms], comparing the congruent and incongruent trials. Electrode clusters showing significant differences in the non-parametric randomization test are indicated by filled black circles. Red represents an increase in negativity, blue a decrease in negativity. (Color figure online)

For hedonic prototype pictures the strongest differences between congruent and incongruent trials were present in two TOIs, 270 to 400 ms and 410 to 600 ms. Both TOIs include a prominent ERP component which, in EEG research, is known as the P300. For pragmatic prototype pictures the TOIs of strongest separability were slightly different, but still within the range of the P300 (300 to 390 ms and 410 to 490 ms).

Statistical Analysis of ERP Components.

Within the defined TOIs we statistically examined which EEG channels show significant modulations due to picture presentation. To do so, we compared the congruent and incongruent trials for both hedonic and pragmatic prototype pictures at every electrode position on the head. We used separate dependent-samples t-tests with a cluster-based, non-parametric randomization approach including correction for multiple comparisons [32, 33], as implemented in the FieldTrip toolbox [34]. The results of the statistical analysis are visualized as heat maps in Figs. 7B and 8B.
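The logic of a cluster-based randomization test can be sketched as follows. This is a simplified illustration of the general technique, not the FieldTrip implementation and not the exact analysis reported here: it assumes one mean amplitude difference (congruent minus incongruent) per participant and channel within a TOI, a hypothetical channel neighbourhood given as a dictionary, an illustrative t-threshold of 2.0, and random sign flips of whole participants as the randomization scheme. All function and parameter names are assumptions.

```python
import math
import random
from statistics import mean, stdev


def paired_t(diffs):
    """t-value of per-participant differences tested against zero."""
    sd = stdev(diffs)
    if sd == 0:
        return 0.0  # degenerate case, treated as "no effect" in this sketch
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


def find_clusters(tvals, neighbours, threshold):
    """Group neighbouring supra-threshold channels of equal sign.

    Returns a list of (channels, cluster_mass), where cluster_mass is
    the sum of the t-values inside the cluster.
    """
    found, seen = [], set()
    for start, t in enumerate(tvals):
        if start in seen or abs(t) < threshold:
            continue
        sign = math.copysign(1, t)
        stack, members = [start], []
        seen.add(start)
        while stack:
            ch = stack.pop()
            members.append(ch)
            for nb in neighbours[ch]:
                same_sign = math.copysign(1, tvals[nb]) == sign
                if nb not in seen and abs(tvals[nb]) >= threshold and same_sign:
                    seen.add(nb)
                    stack.append(nb)
        found.append((sorted(members), sum(tvals[c] for c in members)))
    return found


def cluster_permutation_test(diffs, neighbours, threshold=2.0, n_perm=1000, seed=0):
    """diffs[p][c]: condition difference of participant p at channel c.

    Returns (channels, cluster_mass, p_value) for every observed cluster.
    """
    rng = random.Random(seed)
    n_channels = len(diffs[0])

    def t_map(data):
        return [paired_t([row[c] for row in data]) for c in range(n_channels)]

    observed = find_clusters(t_map(diffs), neighbours, threshold)

    # Null distribution: largest absolute cluster mass per randomization.
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            s = rng.choice((-1, 1))  # swap condition labels for this participant
            flipped.append([s * v for v in row])
        masses = [abs(m) for _, m in find_clusters(t_map(flipped), neighbours, threshold)]
        null_max.append(max(masses, default=0.0))

    results = []
    for channels, mass in observed:
        exceed = sum(m >= abs(mass) for m in null_max)
        results.append((channels, mass, (1 + exceed) / (n_perm + 1)))
    return results
```

Because each observed cluster is compared against the distribution of the maximum cluster mass over all channels, a single test per cluster controls the family-wise error rate, which is the sense in which this type of procedure corrects for multiple comparisons.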

For hedonic prototype pictures statistically significant differences between congruent (approach) and incongruent (avoidance) trials were found in the prefrontal and motor cortices during the first TOI, where higher positive activation was observed for avoidance behavior. In the second TOI stronger negativity was observed for congruent trials.

For pragmatic prototype pictures different activation patterns were revealed: we found a positivity for congruent trials (avoidance) in the motor cortex, followed during the second TOI by a lower negativity in the prefrontal and motor cortex than for incongruent trials.

5 Discussion and Future Work

The results indicate that – on a subjective level – our attempt to create a more hedonic experience with the hedonic prototype was successful. The pragmatic prototype was rated as equally pragmatic, but significantly less hedonic. The expected approach and avoidance tendencies towards the hedonic and pragmatic prototype, respectively, could not be demonstrated by the AAT, but were reflected in the neurophysiological data. The EEG results provide first indications that different brain processes take place when participants are confronted with snapshots of the two prototypes. Hence it appears possible to differentiate users’ reactions to the hedonic and pragmatic qualities of software by statistically comparing the neurophysiological processes underlying approach and avoidance tendencies towards hedonic and pragmatic stimuli.

The ERP time course reflects different stages of cognitive information processing, starting as early as 200 ms after the onset of the picture presentation for both hedonic and pragmatic prototype pictures. Looking at the most discriminative TOIs, we found that hedonic and pragmatic prototype pictures showed the highest discrimination during the time window of the P300 component, i.e. around 300 ms after stimulus onset. The P300 is sensitive to arousal-related effects of emotional picture processing [35] and is also involved in processing the compatibility between a given stimulus and the performed response [22]. For the congruent hedonic pictures, we found an earlier involvement of the prefrontal cortex, which is believed to be involved in the control of emotional or affective behavior in humans [36, 37]. For pragmatic images, only the central brain regions, i.e. the motor cortex, were activated as a first response to the stimuli. Our findings may indicate that the processing of hedonic images involves more higher-order cognitive processes for matching the compatibility between a given stimulus and the given movement instruction. It should be noted that the cognitive processing described above takes place before the explicit conscious evaluation of the stimulus and the performance of the pull or push behavior.

While the neurophysiological implicit method yielded promising results, it remains to be discussed why the behavioral measures failed to detect the same differences. It could be argued that the AAT is simply not suitable for UX research. However, the present study does not seem extensive enough to support this claim. In previous studies making use of the AAT the dichotomous stimulus material was always strongly valenced – either positive or negative. The hedonic prototype of our study was clearly on the positive side of the valence spectrum. It is, however, difficult to characterize the pragmatic prototype as negative, as it scores high on pragmatic quality, which – even if not as positive as the hedonic prototype – should induce a more or less “neutral” emotional experience. This is in line with Tuch and colleagues, who argue that only bad pragmatic qualities cause negative affective reactions [38, 39]. Thus, it is possible that the two prototypes were not different enough from each other to produce distinguishable difference scores for the AAT. Still, the differences we detected in the EEG data suggest that hedonic and pragmatic product features are processed differently in the brain and that this difference just did not manifest itself in all participants’ push and pull behavior (note also the larger standard deviation of the AAT difference scores for the hedonic prototype as compared to the pragmatic prototype in Fig. 5). To clarify this matter, a third prototype should be developed that offers low pragmatic as well as low hedonic qualities. Comparing snapshots from the interaction with this new prototype and the hedonic prototype might shed light on whether the AAT is the wrong method for UX assessment or just not sensitive enough to differentiate between implicit attitudes towards hedonic and pragmatic product features.

Overall, the present study can be regarded as a good starting point for investigating the potential of implicit measures in UX research. The results would, however, benefit from a larger sample. Additional research could also try different stimulus material and possibly different behavioral implicit tests to widen our understanding of the psychological constructs underlying hedonic and pragmatic product qualities and to provide more evidence for the usefulness of implicit measures in assessing UX.