Work-domain knowledge in usability evaluation: Experiences with Cooperative Usability Testing

https://doi.org/10.1016/j.jss.2010.02.026

Abstract

Usability evaluation helps to determine whether interactive systems support users in their work tasks. However, knowledge about those tasks and, more generally, about the work-domain is difficult to bring to bear on the processes and outcome of usability evaluation. One way to include such work-domain knowledge might be Cooperative Usability Testing, an evaluation method that consists of (a) interaction phases, similar to classic usability testing, and (b) interpretation phases, where the test participant and the moderator discuss incidents and experiences from the interaction phases. We have studied whether such interpretation phases improve the relevance of usability evaluations in the development of work-domain specific systems. The study included two development cases. We conclude that the interpretation phases generate additional insight and redesign suggestions related to observed usability problems. The interpretation phases also generate a substantial proportion of new usability issues, thereby providing a richer evaluation output. Feedback from the developers of the evaluated systems indicates that the usability issues generated in the interpretation phases have substantial impact on the software development process. The benefits of the interpretation phases may be explained by the access they provide both to the test participants’ work-domain knowledge and to their experiences as users.

Introduction

When building interactive systems for specific work-domains, a constructive interplay between software development and usability evaluation depends on the availability of work-domain knowledge during the evaluation. The usability professional needs adequate work-domain knowledge to prepare relevant task scenarios, to understand the interaction between the test participant and the interactive system, and to generate appropriate solutions to observed usability problems. Conversely, if usability evaluations are conducted without such knowledge, the evaluation output may lack validity and relevance, diminishing its utility in software development. Usability problems identified by work-domain experts have been found to have greater impact on the software development process than problems identified by usability experts without work-domain knowledge (Følstad, 2007).

A work-domain is a context of use that requires specialized knowledge and experience to understand fully. Examples of work-domains are those of mobile sales personnel and emergency response personnel, the two cases of the present study. The field of Human-Computer Interaction (HCI) offers a range of methods for eliciting work-domain knowledge, including task analysis, field studies, and context of use analysis (Maguire, 2001a). Outside HCI, many other methods pursue similar goals (e.g., Vicente, 1999, Maiden and Rugg, 1996, Byrd et al., 1992). However, eliciting and describing work-domain knowledge can be costly and is fundamentally difficult (Suchman, 1995), making it necessary to draw on the users’ work-domain knowledge during usability evaluation as well (Kensing and Munk-Madsen, 1993).

It is less clear how to draw on the users’ work-domain knowledge in usability evaluations. One usability evaluation method that does so is Cooperative Usability Testing (Frøkjær and Hornbæk, 2005), in which the test participants are actively engaged in interpreting their own interaction with the evaluated system. When evaluating work-domain specific systems, such interpretation is expected to tap the test participants’ work-domain knowledge. While other HCI methods focus on obtaining work-domain knowledge independently of the usability evaluation, as does for example Contextual Inquiry (Beyer and Holtzblatt, 1998), Cooperative Usability Testing is one of the few methods that extend classic usability testing in this regard, thus retaining the benefits of such testing. To our knowledge, however, no study has examined the use of Cooperative Usability Testing in the development of work-domain specific systems.

The purpose of the present study is to explore empirically the effect of including interpretation phases in the usability evaluation of work-domain specific systems. We have studied an adaptation of Cooperative Usability Testing in two development cases. In particular, we look at how the test participants’ interpretations affected (a) the output of the evaluation and (b) the subsequent priorities of the software developers. Data are analyzed in three major ways:

  • Qualitative analysis of how test participants’ interpretations enable improved understanding of observed usability problems;

  • Qualitative analysis of the usability issues (usability problems or redesign suggestions) generated from the test participants’ interaction and interpretation, respectively;

  • Quantitative comparison of the impact of these usability issues in the subsequent software development process.

The findings provide new knowledge on how test participants’ work-domain knowledge can benefit our understanding of usability issues identified in usability testing, and on how their interpretations affect the interplay between usability evaluation and software development. Based on the findings, we make recommendations for having test participants interpret their own interaction during usability testing of work-domain specific interactive systems. We also put forward hypotheses for future studies.

Section snippets

Background

The background of the study is first related to general notions of work-domain knowledge and then to the specific issue of work-domain knowledge in usability evaluation. Finally, we discuss the difficulties of assessing whether usability evaluation methods work; in our case, whether the interpretation phases of Cooperative Usability Testing benefit the interplay between usability evaluation and software development.

Research question

The main research question of the study was formulated as:

How does the inclusion of interpretation phases affect the outcome of usability testing of work-domain specific systems?

It was assumed that users’ interpretations could affect the usability testing either by (a) providing new understanding of usability issues observed during the users’ interaction with the evaluated system or (b) generating new usability issues not observed during interaction. Alternatively, it might be the case that the

Method

To explore the effect of introducing interpretation phases in usability testing, an adaptation of the Cooperative Usability Testing method of Frøkjær and Hornbæk (2005) was used in two cases of work-domain specific software development.

Usability issues evolving across the different phases

Twenty-eight usability issues were identified in the interaction phases of the two cases: 16 in Case 1 and 12 in Case 2. Nine of these issues were elaborated in the interpretation phases: two in Case 1 and seven in Case 2. The elaborations were found to provide (a) additional insight, (b) design suggestions, and (c) deliberations on the usability issue. The inter-rater agreement of the coding, calculated as Cohen's kappa, was κ = 0.93, an almost perfect agreement (Landis and Koch, 1977). All three
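For reference, Cohen's kappa corrects the raw agreement between two coders for the agreement expected by chance, κ = (p_o − p_e)/(1 − p_e), where p_o is the observed proportion of agreement and p_e the chance-expected proportion. A minimal sketch of the computation in Python follows; the codings shown are hypothetical illustrations, not the actual data of the study:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: proportion of items given the same code.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: probability that both raters pick the same
    # category, estimated from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codings of nine elaborations into the three categories.
coder_1 = ["insight", "design", "design", "insight", "deliberation",
           "insight", "design", "insight", "deliberation"]
coder_2 = ["insight", "design", "design", "insight", "deliberation",
           "insight", "design", "insight", "insight"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # 0.82 for this toy data
```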

Discussion

We have investigated how the inclusion of interpretation phases affects the outcome of usability testing of work-domain specific systems. The first part of the discussion will be structured according to the three sub-questions explicated in Section 3. After this, we will discuss possible explanations for the observed effects of the interpretation phases, provide recommendations on interpretation phases, and finally discuss the validity and generality of the study.

Conclusions and future research

To summarize, the presented study has provided new knowledge on the inclusion of interpretation phases in usability testing. We have concluded that:

  • The interpretation phases serve to elaborate usability issues identified in the interaction phases, in particular through additional insight and design suggestions.

  • The interpretation phases generate a substantial proportion of new usability issues, providing richer insight, for instance by covering a broader range of usability issue categories.

  • The

Acknowledgements

The presented study has been conducted as part of the FLAMINKO and RECORD projects; both supported by the VERDIKT program of the Norwegian Research Council. The study forms part of a doctoral thesis to be submitted to the Department of Psychology, University of Oslo. We wish to thank Erik Frøkjær (University of Copenhagen) and Jan Heim (SINTEF) for their feedback on the study, Line C. Gjerde (University of Oslo) for her help in the analyses, and Consafe Logistics (www.consafelogistics.no) and


References (47)

  • S. Bødker et al., 1998. Context: an active choice in usability work. Interactions.
  • M.T. Boren et al., 2000. Thinking aloud: reconciling theory and practice. IEEE Trans. Professional Commun.
  • T.A. Byrd et al., 1992. Synthesis of research on requirements analysis and knowledge acquisition techniques. MIS Q.
  • J. Chattratichart et al. Applying user test data to UEM performance metrics.
  • G. Cockton et al. Inspection-based evaluations.
  • J. Cohen, 1988. Statistical Power Analysis for the Behavioral Sciences.
  • H.W. Desurvire et al. What is gained and lost when using evaluation methods other than empirical testing.
  • J.S. Dumas et al. Usability testing: current practice and future directions.
  • J.S. Dumas et al., 1999. A Practical Guide to Usability Testing.
  • D. Ezzy, 2002. Qualitative Analysis: Practice and Innovation.
  • A. Field, 2005. Discovering Statistics Using SPSS.
  • A. Følstad, 2007. Work-domain experts as evaluators: usability inspection of domain-specific work support systems. Int. J. Hum.-Comput. Interact.

Asbjørn Følstad holds a master's degree in psychology from the Norwegian University of Science and Technology (NTNU) and has, since 2000, been employed as a researcher at SINTEF, the largest independent research organization in Scandinavia. He is currently working on his PhD thesis, to be submitted to the University of Oslo. His research interests include human-computer interaction, in particular methods for design feedback and usability evaluation. The research is typically conducted in applied research projects involving academic and industrial partners. Application areas of interest include mobile ICT, e-Government, and social media.

Kasper Hornbæk received his M.Sc. and Ph.D. in Computer Science from the University of Copenhagen, in 1998 and 2002, respectively. Since 2009 he has been a professor with special duties in Human-centered Computing at the University of Copenhagen. His core research interests are human-computer interaction, usability research, search user interfaces, and information visualization; detours include eye tracking, cultural usability, and reality-based interfaces. Kasper serves on the editorial boards of Journal of Usability Studies, Interacting with Computers, and International Journal of Human-Computer Studies (IJHCS). He has published at CHI, UIST, ACM Transactions on Computer-Human Interaction, and Human-Computer Interaction, and won IJHCS's most cited paper award 2006–2008.
