Model-based diagnosis of spreadsheet programs: a constraint-based debugging approach

Jannach, Dietmar; Schmitz, Thomas

doi:10.1007/s10515-014-0141-7

Model-based diagnosis of spreadsheet programs: a constraint-based debugging approach

Published: 12 February 2014

Volume 23, pages 105–144, (2016)
Cite this article

Automated Software Engineering Aims and scope Submit manuscript

Dietmar Jannach¹ &
Thomas Schmitz¹

765 Accesses
Explore all metrics

Abstract

Spreadsheet programs are probably the most successful example of end-user software development tools and are used for a variety of purposes. Like any type of software, they are prone to error, in particular as they are usually developed by non-programmers. While various techniques exist to support the developer in finding errors in procedural programs, the tool support for spreadsheet debugging is still limited. In this paper, we show how techniques from model-based diagnosis can be applied and extended for spreadsheet debugging by translating the relevant parts of a spreadsheet to a constraint satisfaction problem. We additionally propose both problem-specific and generalizable extensions to the classical diagnosis algorithms which help to detect potential problems in a spreadsheet based on user-provided test cases more efficiently. The proposed techniques were integrated into a modular framework for spreadsheet debugging and evaluated with respect to scalability based on a number of real-world and artificially created spreadsheets. An additional error detection exercise involving 24 subjects was performed to assess the general applicability of such advanced spreadsheet debugging techniques for end users.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

AI for Localizing Faults in Spreadsheets

Fragment-Based Diagnosis of Spreadsheets

Fragment-based spreadsheet debugging

Article Open access 22 December 2018

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Notes

In the following, we will use the terms “spreadsheet”, “spreadsheet program” and “spreadsheet application” interchangeably as done in the literature.
http://www.eusprig.org/horror-stories.htm, last accessed November 2013.
The second option has its limitations because the non-existence of formal processes is probably one of the reasons for the success of spreadsheets in the first place.
When a component is considered to be faulty, we assume that it can produce an arbitrary output within the range of generally possible values.
In this paper we use the terms “diagnosis candidates” and “diagnoses” interchangeably.
An algorithm for the treatment of multiple test cases during diagnosis is given in Felfernig et al. (2004).
Alternatively, the user could have restricted the domain size of variable C1.
The correct formulas in the figures are only used to better illustrate the problem setting.
In the formalization of MBD and CSP concepts, we will rely on the notations and terms used, e.g., in Reiter (1987) or Felfernig et al. (2004), to illustrate the correspondence of our approach with previous works.
The detection of faults which are caused by semantic errors or wrong domain definitions are beyond the scope of this work.
See later sections for a further discussion of the problem of domain definitions.
In the literature, a number of formalizations for spreadsheets are proposed, e.g., in Abraham and Erwig (2006a) or Abraham and Erwig (2007), and some of them provide their own grammar for expressions or detailed mapping rules to a different representation (Abreu et al. 2012; Hofer et al. 2013). Since the specific mapping of the spreadsheet formulas to the CSP representation is not relevant for the subsequently described pruning approach, we will introduce a simple notation that focuses only on the relevant cell dependencies.
There also might be a wrong formula in D1 but this one will be independent of the other error and can only be identified through another test case.
The set of input cells that should be considered by the debugging to be potentially faulty could be determined interactively by the spreadsheet developer.
A discussion of the computational complexity of QuickXplain can be found in Junker (2004).
Usual depth limitations could be applied here.
We omitted these calls in Algorithm 1 for better readability.
Some redundant calculations of conflicts can however in theory occur depending on how the threads are scheduled.
We also made experiments in which we limited the thread-pool size to 1, thereby imitating a non-parallelized approach, leading to similar results. The additional overheads related to the creation of threads appear to be neglectable.
http://www.emn.fr/z-info/choco-solver/
Intel Core i7-3770K, 3.5 GHz, 16GB RAM, 8 hardware threads.
This is about twice the time needed for the single fault problem reported in Fig. 9.
We conducted the experiment in German and all material used in the study was prepared in German. Here the English translation is shown.
The value 1 corresponds to “never”; 5 means “very often”.
We made additional measurements regarding the time needed by the subjects to study the description of the intended semantics or go through the Exquisite tutorial. A detailed analysis of possible correlations of these measurements with the error-detection performance is however beyond the scope of this paper.
We encouraged the participants to comment on their strategies during the experiment in the sense of a think-aloud protocol.
Other types of errors could be semantic or structural errors.
Available at http://www.cril.univ-artois.fr/CPAI08/

References

Abraham, R., Erwig, M.: AutoTest: a tool for automatic test case generation in spreadsheets. In: Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2006), Brighton, United Kingdom, pp. 43–50 (2006)
Abraham, R., Erwig, M.: Inferring templates from spreadsheets. In: Proceedings of the 28th International Conference on Software Engineering (ICSE 2006), Shanghai, China, pp. 182–191 (2006)
Abraham, R., Erwig, M.: GoalDebug: a spreadsheet debugger for end users. In: Proceedings of the 29th International Conference on Software Engineering (ICSE 2007), Minneapolis, MN, USA, pp. 251–260 (2007)
Abraham, R., Erwig, M.: Mutation operators for spreadsheets. IEEE Trans. Softw. Eng. 35(1), 94–108 (2009)
Article Google Scholar
Abreu, R., Zoeteweij, P., van Gemund, A.J.C.: An observation-based model for fault localization. In: Proceedings of the 6th International Workshop on Dynamic Analysis (WODA 2008), New York, NY, USA, pp. 64–70 (2008)
Abreu, R., Riboira, A., Wotawa, F.: Constraint-based debugging of spreadsheets. In: Proceedings of the 15th Ibero-American Conference on Software Engineering (CIbSE 2012), Buenos Aires, Argentina, pp. 1–14 (2012)
Asavametha, A.: Detecting bad smells in spreadsheets. Master’s thesis, School of Electrical Engineering and Computer Science, Oregon State University, USA (2012)
Brown, P.S., Gould, J.D.: An experimental study of people creating spreadsheets. ACM Trans. Inf. Syst. 5(3), 258–272 (1987)
Article Google Scholar
Burnett, M., Sheretov, A., Ren, B., Rothermel, G.: Testing homogeneous spreadsheet grids with the “what you see Is what you test” methodology. IEEE Trans. Softw. Eng. 28(6), 576–594 (2002)
Article Google Scholar
Burnett, M., Cook, C., Pendse, O., Rothermel, G., Summet, J., Wallace, C.: End-user software engineering with assertions in the spreadsheet paradigm. In: Proceedings of the 25th International Conference on Software Engineering (ICSE 2003), Portland, Oregon, pp. 93–103 (2003)
Campos, J., Riboira, A., Perez, A., Abreu, R.: Gzoltar: an eclipse plug-in for testing and debugging. In: Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering (ASE 2012), Essen, Germany, pp. 378–381 (2012)
Chadwick, D., Knight, B., Rajalingham, K.: Quality control in spreadsheets: a visual approach using color codings to reduce errors in formulae. Softw. Qual. Control 9(2), 133–143 (2001)
Article Google Scholar
Chittaro, L., Ranon, R.: Hierarchical model-based diagnosis based on structural abstraction. Artif. Intell. 155(1–2), 147–182 (2004)
Article MATH MathSciNet Google Scholar
Console, L., Friedrich, G., Dupré, D.T.: Model-based diagnosis meets error diagnosis in logic programs. In: Proceedings of the 13th International Joiunt Conference on Artificial Intelligence (IJCAI 1993), Chambéry, France, pp. 1494–1499 (1993)
Creeth, R.: Micro-computer spreadsheets: their uses and abuses. J. Account. 159(6), 90–93 (1985)
Google Scholar
Cunha, J., Fernandes, JaP., Ribeiro, H., Saraiva, Ja.: Towards a catalog of spreadsheet smells. In: Proceedings of the 12th International Conference on Computational Science and Its Applications (ICCSA 2012), Salvador de Bahia, Brazil, pp. 202–216 (2012)
de Kleer, J.: Using crude probability estimates to guide diagnosis. Artif. Intell. 45(3), 381–391 (1990)
Article Google Scholar
de Kleer, J., Williams, B.C.: Diagnosing multiple faults. Artif. Intell. 32(1), 97–130 (1987)
Article MATH Google Scholar
Ditlea, S.: Spreadsheets can be hazardous to your health. Pers. Comput. 11(1), 60–69 (1987)
Google Scholar
Felfernig, A., Schubert, M.: FastDiag: a diagnosis algorithm for inconsistent constraint sets. In: Proceedings of the 21st International Workshop on the Principles of Diagnosis (DX 2010), Portland, OR, USA, pp. 31–38 (2010)
Felfernig, A., Friedrich, G., Jannach, D., Stumptner, M., Zanker, M.: Hierarchical diagnosis of large configurator knowledge bases. In: Proceedings of the Joint German/Austrian Conference on Artificial Intelligence (KI 2001), Vienna, Austria, pp. 185–197 (2001)
Felfernig, A., Friedrich, G., Jannach, D., Stumptner, M.: Consistency-based diagnosis of configuration knowledge bases. Artif. Intell. 152(2), 213–234 (2004)
Article MATH MathSciNet Google Scholar
Filby, G. (ed.): Spreadsheets in Science and Engineering. Springer, Berlin (1998)
Freuder, E.C., Wallace, R.J., Heffernan, R.: Partial constraint satisfaction. Artif. Intell. 58(1–3), 278–283 (1992)
Google Scholar
Friedrich, G., Nejdl, W.: Choosing observations and actions in model-based diagnosis/repair systems. In: Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning (KR 1992), Cambridge, MA, USA, pp. 489–498 (1992)
Friedrich, G., Stumptner, M., Wotawa, F.: Model-based diagnosis of hardware designs. Artif. Intell. 111(1–2), 3–39 (1999)
Article MATH MathSciNet Google Scholar
Galletta, D.F., Abraham, D., Louadi, M.E., Lekse, W., Pollalis, Y.A., Sampler, J.L.: An empirical study of spreadsheet error-finding performance. Account. Manag. Inf. Technol. 3(2), 79–95 (1993)
Article Google Scholar
Greiner, R., Smith, B.A., Wilkerson, R.W.: A correction to the algorithm in Reiter’s theory of diagnosis. Artif. Intell. 41(1), 79–88 (1989)
Article MATH MathSciNet Google Scholar
Hermans, F.: Improving spreadsheet test practices. In: Proceedings of the 23rd Annual International Conference on Computer Science and Software Engineering (CASCON 2013), Markham, Ontario, Canada, pp. 56–69 (2013)
Hermans, F., Pinzger, M., van Deursen, A.: Detecting and visualizing inter-worksheet smells in spreadsheets. In: Proceedings of the 34th International Conference on Software Engineering (ICSE 2012), Zurich, Switzerland, pp. 441–451 (2012)
Hermans, F., Pinzger, M., van Deursen, A.: Detecting code smells in spreadsheet formulas. In: Proceedings of the 28th IEEE International Conference on Software Maintenance (ICSM 2012), pp. 409–418. Trento, Riva del Garda (2012)
Herndon, T., Ash, M., Pollin, R.: Does high public debt consistently stifle economic growth? A critique of Reinhart and Rogoff. Working paper 322, Political Economy Research Institute, University of Massachusetts, Amherst (2013)
Hofer, B., Riboira, A., Wotawa, F., Abreu, R., Getzner, E.: On the empirical evaluation of fault localization techniques for spreadsheets. In: Proceedings of the 16th International Conference on Fundamental Approaches to Software Engineering (FASE 2013), Rome, Italy, pp. 68–82 (2013)
Jannach, D., Engler, U.: Toward model-based debugging of spreadsheet programs. In: Proceedings of the 9th Joint Conference on Knowledge-Based Software Engineering (JCKBSE 2010), Kaunas, Lithuania, pp. 252–264 (2010)
Jannach, D., Baharloo, A., Williamson, D.: Toward an integrated framework for declarative and interactive spreadsheet debugging. In: Procedings of the 8th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2013), Angers, France, pp. 117–124 (2013)
Jones, J.A., Harrold, M.J., Stasko, J.: Visualization of test information to assist fault localization. In: Proceedings of the 24th International Conference on Software Engineering (ICSE 2002), Orlando, FL, USA, pp. 467–477 (2002)
Junker, U.: QUICKXPLAIN: preferred explanations and relaxations for over-constrained problems. In: Proceedings of the 19th National Conference on Artificial Intelligence (AAAI 2004), San Jose, CA, USA, pp. 167–172 (2004)
Larrosa, J., Meseguer, P., Schiex, T.: Maintaining reversible DAC for Max-CSP. Artif. Intell. 107(1), 149–163 (1999)
Google Scholar
Mateis, C., Stumptner, M., Wieland, D., Wotawa, F.: Model-based debugging of Java Programs. In: Proceedings of the Fourth International Workshop on Automated Debugging (AADEBUG 2000), Munich, Germany (2000)
Metodi, A., Stern, R., Kalech, M., Codish, M.: Compiling model-based diagnosis to Boolean satisfaction. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI 2012), Toronto, Canada (2012)
Panko, R.R.: What we know about spreadsheet errors. J. End User Comput. 10(2), 15–21 (1998)
Article Google Scholar
Panko, R.R., Halverson, R.P.: Spreadsheets on trial: a survey of research on spreadsheet risks. In: Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS 1996), Wailea, HI, USA, pp. 326–335 (1996)
Panko, R.R., Port, D.N.: End user computing: the dark matter (and dark energy) of corporate IT. In: Proceedings of the 45th Hawaii International Conference on System Sciences (HICSS 2012), Wailea, HI, USA, pp. 4603–4612 (2012)
Parnin, C., Orso, A.: Are automated debugging techniques actually helping programmers?. In: Proceedings of the 2011 International Symposium on Software Testing and Analysis (ISSTA 2011), Toronto, Canada, pp. 199–209 (2011)
Pemberton, J., Robson, A.: Spreadsheets in business. Ind. Manag. Data Syst. 100(8), 379–388 (2000)
Article Google Scholar
Powell, S.G., Baker, K.R., Lawson, B.: A critical review of the literature on spreadsheet errors. Decis. Support Syst. 46(1), 128–138 (2008)
Article Google Scholar
Reichwein, J., Rothermel, G., Burnett, M.: Slicing spreadsheets: an integrated methodology for spreadsheet testing and debugging. In: Proceedings of the 2nd Conference on Domain-Specific Languages (DSL 1999), Austin, Texas, pp. 25–38 (1999)
Reinhart, C.M., Rogoff, K.S.: Growth in a time of debt. Am. Econ. Rev. 100(2), 573–578 (2010)
Article Google Scholar
Reiter, R.: A theory of diagnosis from first principles. Artif. Intell. 32(1), 57–95 (1987)
Article MATH MathSciNet Google Scholar
Rothermel, G., Li, L., Dupuis, C., Burnett, M.: What you see is what you test: a methodology for testing form-based visual programs. In: Proceedings of the 20th International Conference on Software Engineering (ICSE 1998), Kyoto, Japan, pp. 198–207 (1998)
Shchekotykhin, K., Friedrich, G., Fleiss, P., Rodler, P.: Interactive ontology debugging: two query strategies for efficient fault localization. J. Web Semant. 12–13, 88–103 (2012)
Article Google Scholar
Smith, B.M.: Locating the phase transition in binary constraint satisfaction problems. Artif. Intell. 81, 155–181 (1994)
Article Google Scholar
Stumptner, M., Wotawa, F.: Model-based program debugging and repair. In: Proceedings of the 9th International Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA/AIE 1996), Fukuoka, Japan, pp. 155–160 (1996)
Stumptner, M., Wotawa, F.: Debugging functional programs. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), Stockholm, Sweden, pp. 1074–1079 (1999)
Stumptner, M., Wotawa, F.: Diagnosing tree-structured systems. Artif. Intell. 127(1), 1–29 (2001)
Article MATH MathSciNet Google Scholar
Tsang, E.: Foundations of Constraint Satisfaction. Academic Press, London (1993)
Google Scholar
Wotawa, F., Pill, I.: On classification and modeling issues in distributed model-based diagnosis. AI Commun. 26(1), 133–143 (2013)
MathSciNet Google Scholar
Wotawa, F., Weber, J.: Challenges of distributed model-based diagnosis. In: Trends in Applied Intelligent Systems, Springer, Lecture Notes in Computer Science, pp. 711–720 (2010)

Download references

Acknowledgments

This work was partially supported by the European Union through the programme “Europäischer Fonds für regionale Entwicklung—Investition in unsere Zukunft” under contract number 300251802.

Author information

Authors and Affiliations

TU Dortmund, Dortmund, Germany
Dietmar Jannach & Thomas Schmitz

Authors

Dietmar Jannach
View author publications
You can also search for this author inPubMed Google Scholar
Thomas Schmitz
View author publications
You can also search for this author inPubMed Google Scholar

Corresponding author

Correspondence to Dietmar Jannach.

Appendices

Appendix 1: Performance measurements for regular CSPs

The parallelization approach described in Sect. 3 is not limited to spreadsheet debugging problems but can also be applied to general CSPs. In order to demonstrate the generalizability of the approach we have conducted a series of experiments to measure the performance gains that can be obtained for a number of benchmark problems taken from the 2008 CSP Solver competition^{Footnote 28}.

1.1 Problem selection and experimental procedure

When selecting CSP instances from the solver competition, we only included solvable instances and picked constraint problems that could be solved in relatively short time (i.e., below 2 s), since the diagnosis process can require a number of solution searches and we had to repeat all experiments several times to factor out random effects.

For each tested problem instance, we first generated a number of random solutions. From each solution, we randomly picked a fraction of the variables (e.g. 10 %) and stored their values. These variable/value-combinations thus represent our partial test cases to be used by both algorithm variants. Next, we manually inserted errors (mutations) in the constraint problem formulations, e.g., by changing a $\le $ operator to a $<$ operator, which corresponds to a mutation-based approach as proposed for example in Abraham and Erwig (2009).

The different scenarios are summarized in Fig. 13, in which we give details about the constraint problem, the number and characteristics of diagnoses induced by the test cases as well as information about the number of corresponding test cases. “TC fail” is the number of test cases which were inconsistent with the mutated CSP. “TC ok” is the number of non-failing partial test cases.

To measure the performance gains for parallelization, we used the absolute running times (“wall time”) to compute all diagnoses up to a certain search depth. The experiments were repeated 10 times and average running times were reported. Experiments with more runs did not lead to different results.

1.2 Effects of Parallelization

The results of our first experiment using the scenario costasArray-13 shows how the computation time depends on the number of parallel threads. For these experiments, we used a laptop computer with an Intel Core i7-3610QM processor with 8 hardware threads.

Figure 14 shows running times for the CostasArray-13 scenario with a varying number of threads. On the left hand part of the figure, we used the original CSP instance and observed that the running times strongly decrease for this problem setting when more than one thread is used. After about 4–5 threads, however, no further improvement is obtained and performance slightly degrades when the number of threads is higher than the number of CPU threads. Overall, the running times can be decreased by more than 50 % from 2.36 s to about 1 s. Since on average it only takes a few milliseconds to find a solution for this problem, we made another experiment shown on the right hand side of the figure. In this experiment, we artificially extended the search time by keeping the CPU busy in a no-operation loop for ten milliseconds before actually starting search. The numbers show that the same trend can be observed also when the solution search lasts longer. The achieved improvement here was even at about 60 %, which might be caused by the fact that the additional thread management is neglectable when compared with the search time in this case.

The relative performance improvements for the other analyzed scenarios from Fig. 13 using 6 parallel threads are given in Fig. 15 below. Note that some of them were comparably hard, with running times above one minute to compute the diagnoses. A detailed analysis of the relation of problem characteristics and the observed run-time improvements is part of our current work. Obviously, however, the size of the conflicts and thus the breadth of the search tree can be considered as one of the most relevant factors.

Appendix 2: Detailed questionnaire items used in user study

The post-experiment questionnaire to be filled out by the participants of the user study consisted of the following items. The questions and answers are translated into English; the original questionnaire was in German. A 5-point rating scale was used for all questions ranging from “completely agree/yes” (5) to “completely disagree/no” (1).

On debugging tools

Do you think that better tool support for fault localization is on principle desirable for spreadsheets?
Do you think that the Exquisite tool is helpful for fault localization?
Do you think that faults can be located faster with this tool support?
Would you use the tool (again) to find an error in a spreadsheet application?

On testing

Do you think that relying on explicit test cases is advantageous for the detection of faults?
Do you think that the usage of test cases through the Exquisite tool is helpful for the detection of faults?
Would you be willing to define test cases, if you can detect faults with them?

Tool assessment

Do you think it is important that an error-detection tool is embedded within a system like MS Excel?
Was it clear to you how to use the Exquisite tool?

Rights and permissions

Reprints and permissions

About this article

Cite this article

Jannach, D., Schmitz, T. Model-based diagnosis of spreadsheet programs: a constraint-based debugging approach. Autom Softw Eng 23, 105–144 (2016). https://doi.org/10.1007/s10515-014-0141-7

Download citation

Received: 13 June 2013
Accepted: 21 January 2014
Published: 12 February 2014
Issue Date: March 2016
DOI: https://doi.org/10.1007/s10515-014-0141-7

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Model-based diagnosis of spreadsheet programs: a constraint-based debugging approach

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

AI for Localizing Faults in Spreadsheets

Fragment-Based Diagnosis of Spreadsheets

Fragment-based spreadsheet debugging

Explore related subjects

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix 1: Performance measurements for regular CSPs

1.1 Problem selection and experimental procedure

1.2 Effects of Parallelization

Appendix 2: Detailed questionnaire items used in user study

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now