A critical review of the literature on spreadsheet errors

doi:10.1016/j.dss.2008.06.001

Decision Support Systems

Volume 46, Issue 1, December 2008, Pages 128-138

Abstract

Among those who study spreadsheet use, it is widely accepted that errors are prevalent in operational spreadsheets and that errors can lead to poor decisions and cost millions of dollars. However, relatively little is known about what types of errors actually occur, how they are created, how they can be detected, and how they can be avoided or minimized. This paper summarizes and critiques the research literature on spreadsheet errors from the viewpoint of a manager who wishes to improve operational spreadsheet quality. We also offer suggestions for future research directions that can improve the state of knowledge about spreadsheet errors and mitigate spreadsheet risks.

Introduction

The problem of eliminating errors from software has been around since the beginning of the computer era. The discipline of software engineering [38] arose out of a need for error-free software code. With the advent of the personal computer in the 1980s and the rapid rise of end-user computing, control of software development passed out of the hands of professionals and into the hands of millions of spreadsheet users, few of whom had any formal training for the task.

As spreadsheets have diffused throughout business, evidence has accumulated that many spreadsheets contain errors [16], [24], [25], [29] and that these errors can be costly to the organizations that use the spreadsheets (European Spreadsheet Risks Interest Group: http://www.eusprig.org/stories.htm). Nevertheless, end users and organizations that rely on spreadsheets have generally not recognized the risks of spreadsheet errors [21]. In fact, spreadsheets are somewhat ignored, both as corporate assets and as sources of risk.

Although research has suggested that errors are prevalent in spreadsheets, there is much we don't know about the types of errors that occur, why they occur, and how to avoid them. We believe that a critical review of the relevant literature can inform future research on this topic. This paper provides such a review.

Rather than give a chronological account of the literature on spreadsheet errors, we organize the discussion around the following topics:

• Classification: what types of errors occur?
• Impact: what are the consequences of errors?
• Frequency: how common are errors?
• Creation and prevention: how can we build trustworthy spreadsheets?
• Detection: how can we audit spreadsheets to correct errors when they occur?

We devote a section of the paper to each of these topics in turn and conclude by suggesting guidelines for future research.

Section snippets

Types of errors

Any classification system allows us to understand the commonalities among individual instances. The Linnaean system, for example, classifies living things into species, genera, families, and so on. This hierarchy allows us to infer that individuals in the same species are more alike than those in different species, and species in the same genus are more alike than those in different genera. An effective classification of spreadsheet errors would allow us to compare errors across studies and

Impact of errors

Ironically, the impact of errors on spreadsheet results is the least studied of all the topics we address. Impact can be measured in several ways. An obvious measure is the percentage error in the outputs of the spreadsheet. But a 1% error in one spreadsheet could be devastating, while a 10% error in another could be inconsequential. A more telling measure of impact would be the actual dollar losses from erroneous or poor decisions resulting from spreadsheet errors. To estimate this impact
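As a rough illustration of why percentage error alone can mislead, the sketch below computes both the percentage error and the absolute dollar error for two spreadsheet outputs. The function names and the dollar figures are hypothetical, chosen only for illustration; they are not drawn from the studies reviewed here.

```python
def percentage_error(reported: float, correct: float) -> float:
    """Return the error in a reported output as a percentage of the correct value."""
    if correct == 0:
        raise ValueError("percentage error is undefined when the correct value is 0")
    return abs(reported - correct) / abs(correct) * 100.0


def dollar_error(reported: float, correct: float) -> float:
    """Return the absolute size of the error, in the same units as the output."""
    return abs(reported - correct)


# Hypothetical outputs (assumed figures, not data from any study):
# a large budget total that is off by 1%, and a small estimate that is off by 10%.
cases = {
    "large budget": (505_000_000.0, 500_000_000.0),
    "small estimate": (1_100.0, 1_000.0),
}

if __name__ == "__main__":
    for name, (reported, correct) in cases.items():
        pct = percentage_error(reported, correct)
        usd = dollar_error(reported, correct)
        print(f"{name}: {pct:.1f}% error, ${usd:,.0f} discrepancy")
```

With these assumed figures, the 1% error corresponds to a $5,000,000 discrepancy while the 10% error corresponds to only $100, so ranking errors by percentage would invert their practical importance. Neither number captures the losses from decisions actually made on the basis of the output.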

Frequency of errors

How common are errors in spreadsheets? Not surprisingly, the answer to this question depends on the definition of errors, the lifecycle stage, and the setting (operational or laboratory). About the only general conclusion we can draw from the literature is that no studies have suggested that errors are not a problem in spreadsheets, with the exception of Nardi and Miller [22] who concluded that “users devote considerable effort to debugging their spreadsheet models — they are very

Creation and prevention of errors

We know remarkably little about how errors are created by end users. Not surprisingly, perhaps, no one has attempted to study spreadsheet development in the field at a level of detail that would permit observation of developers making errors. What little we do know comes from laboratory experiments, yet as previously stated, the relevance of laboratory results to the field is questionable.

In a study mentioned earlier, Brown and Gould [2] performed experiments in which their subjects created

Detection of errors

Two types of studies have examined detection of errors in completed spreadsheets. In laboratory experiments, subjects are asked to find errors placed in spreadsheets by the researcher; in field audits, experts try to find errors in operational spreadsheets.

In the great majority of laboratory studies, the subjects have been given no specific training or instruction on how to identify errors. Thus, these studies may have limited implications for detecting errors. One exception is Teo and Tan [40]

Research directions

Research on spreadsheet errors can be conducted either in the laboratory or in the field. Each type of research offers its own insights and has its own limitations.

In some ways, laboratory research is easier to conduct than field research, but its limitations are significant. In particular, error rates in laboratory experiments should not be used uncritically to infer error rates in operational spreadsheets because the underlying conditions differ. We don't yet know the impact of those

References (41)

J. Kreie et al., Applications development by end-users: can quality be improved?, Decision Support Systems (2000)
S. Kruck, Testing spreadsheet accuracy theory, Information and Software Technology (2006)
T. McGill et al., The role of spreadsheet knowledge in user-developed application success, Decision Support Systems (2005)
B. Nardi et al., Twinkling lights and nested loops: distributed problem solving and spreadsheet development, International Journal of Man-Machine Studies (1991)
R. Panko et al., Hitting the wall: errors in developing and code inspecting a ‘simple’ spreadsheet model, Decision Support Systems (1998)
P. Saariluoma et al., Transforming verbal descriptions into mathematical formulas in spreadsheet calculation, International Journal of Human-Computer Studies (1994)
T. Teo et al., Effects of error factors and prior incremental practice on spreadsheet error detection: an experimental study, Omega: The International Journal of Management Science (2001)
Y. Ayalew et al., Detecting errors in spreadsheets
P. Brown et al., An experimental study of people creating spreadsheets, ACM Transactions on Office Information Systems (1987)
R. Butler, Is this spreadsheet a tax evader?
R. Butler, Risk assessment for spreadsheet developments: choosing which models to audit, H. M. Customs and Excise UK...
J. Caulkins, E. Morrison, T. Weidemann, Spreadsheet errors and decision making: evidence from field interviews, Journal...
M. Clermont, A spreadsheet auditing tool evaluated in an industrial context
P. Cragg et al., Spreadsheet modelling abuse: an opportunity for OR?, Journal of Operational Research Society (1993)
N. Davies et al., Auditing spreadsheets, Australian Accountant (1987)
S. Ditlea, Spreadsheets can be hazardous to your health, Personal Computing (1987)
D. Freeman, How to make spreadsheets error-proof, Journal of Accountancy (1996)
F. Galletta et al., An empirical study of spreadsheet error-finding performance, Accounting, Management & Information Technology (1993)
F. Galletta et al., Spreadsheet presentation and error detection: an experimental study, Journal of Management Information Systems (1997)
H. M. Customs and Excise Computer Audit Service


Steve Powell is a Professor at the Tuck School of Business at Dartmouth. His primary research interest lies in modeling production and services processes, but he has also been active in research in energy economics, marketing, and operations. At Tuck, he has developed a variety of courses in management science, including the core Decision Science course and electives in the Art of Modeling, Business Process Redesign, and Applications of Simulation. He originated the Teacher's Forum column in Interfaces, and has written a number of articles on teaching modeling to practitioners. He is the academic director of the INFORMS Annual Teaching of Management Science Workshop. In 2001 he was awarded the INFORMS Prize for the Teaching of Operations Research/Management Science Practice. He is the co-author with Kenneth Baker of The Art of Modeling with Spreadsheets (Wiley, 2004).

Ken Baker is a faculty member at Dartmouth College. He is currently Nathaniel Leverone Professor of Management at the Tuck School of Business and also adjunct professor at the Thayer School of Engineering. At Dartmouth, he has taught courses relating to decision science, manufacturing management, and environmental management. Over the years, much of his teaching and research has dealt with production planning and control, and he is widely known for his textbook Elements of Sequencing and Scheduling, in addition to a variety of technical articles. He has served as the Tuck School's associate dean and directed the Tuck School's management development programs in the manufacturing area. In 2001 he was named a Fellow of INFORMS' Manufacturing and Service Operations Management (MSOM) Society, and in 2004 a Fellow of INFORMS. He is the co-author with Stephen Powell of The Art of Modeling with Spreadsheets (Wiley, 2004).

Barry Lawson is a research associate at the Tuck School of Business at Dartmouth and is also a visiting scholar in the geography department of the college. He founded and has served as president of Barry Lawson Associates, a consulting firm, since 1978. As visiting scholar, he coordinates the development of an atlas of the upper Connecticut River Watershed in New Hampshire and Vermont. As research associate at Tuck, he serves as the program manager for the Tuck Spreadsheet Engineering Research Project. Lawson has taught in graduate programs at Boston University and Wayne State University as well as in short courses at Bentley College. He has moderated a host of public hearings for local, state, and federal governments on controversial environmental and energy- and waste-related projects, and has considerable experience in group facilitation, conflict resolution, and simulation design.

^☆: This work was performed under the sponsorship of the U.S. Department of Commerce, National Institute of Standards and Technology. Reproduction of this article, with the customary credit to the source, is permitted.
