Skip to main content

Spreadsheet Engineering

  • Chapter
  • First Online:
Central European Functional Programming School (CEFP 2013)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8606))

Included in the following conference series:

Abstract

These tutorial notes present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells/columns/rows), which are later used to infer a model defining the business logic of the spreadsheet. Such a model of a spreadsheet data is a visual domain specific language that we embed in a well-known spreadsheet system.

The embedded model is the building block to define techniques for model-driven spreadsheet development, where advanced techniques are used to guarantee the model-instance synchronization. In this model-driven environment, any user data update has to follow the model-instance conformance relation, thus, guiding spreadsheet users to introduce correct data. Data refinement techniques are used to synchronize models and instances after users update/evolve the model.

These notes briefly describe our model-driven spreadsheet environment, the MDSheet environment, that implements the presented methodology. To evaluate both proposed techniques and the MDSheet tool, we have conducted, in laboratory sessions, an empirical study with the summer school participants. The results of this study are presented in these notes.

This work is part funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within projects FCOMP-01-0124-FEDER-010048, and FCOMP-01-0124-FEDER-020532. The first author was funded by FCT grant SFRH/BPD/73358/2010.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    Microsoft Excel: http://office.microsoft.com/en-us/excel.

  2. 2.

    Gnumeric: http://projects.gnome.org/gnumeric.

  3. 3.

    OpenOffice: http://www.openoffice.org.

  4. 4.

    LibreOffice: http://www.libreoffice.org.

  5. 5.

    Google Drive: http://drive.google.com.

  6. 6.

    Microsoft Office 365: http://www.microsoft.com/en-us/office365/online-software.aspx.

  7. 7.

    ZoHo Sheet: http://sheet.zoho.com.

  8. 8.

    This list of horror stories is available at: http://www.eusprig.org/horror-stories.htm.

  9. 9.

    HaExcel can be found at http://ssaapp.di.uminho.pt.

  10. 10.

    We omit here the column labels, whose names depend on the number of columns in the generated table.

  11. 11.

    We assume colors are visible in the digital version of this paper.

  12. 12.

    “It allows to assign more precise types to data constructors by restricting the variables of the datatype in the constructors’ result types.”

  13. 13.

    http://ssaapp.di.uminho.pt.

References

  1. Abraham, R., Erwig, M.: Header and unit inference for spreadsheets through spatial analyses. In: 2004 IEEE Symposium on Visual Languages and Human Centric Computing, pp. 165–172, September 2004

    Google Scholar 

  2. Abraham, R., Erwig, M.: UCheck: a spreadsheet type checker for end users. J. Vis. Lang. Comput. 18(1), 71–95 (2007)

    Article  Google Scholar 

  3. Abraham, R., Erwig, M.: Goal-directed debugging of spreadsheets. In: VL/HCC, pp. 37–44. IEEE Computer Society (2005)

    Google Scholar 

  4. Abraham, R., Erwig, M.: Autotest: a tool for automatic test case generation in spreadsheets. In: Proceedings of the 2006 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2006), pp. 43–50. IEEE Computer Society (2006)

    Google Scholar 

  5. Abraham, R., Erwig, M.: Inferring templates from spreadsheets. In: Proceedings of the 28th International Conference on Software Engineering, pp. 182–191. ACM, New York (2006)

    Google Scholar 

  6. Abraham, R., Erwig, M.: Type inference for spreadsheets. In: Bossi, A., Maher, M.J. (eds.) Proceedings of the 8th International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming, Venice, Italy, 10–12 July 2006, pp. 73–84. ACM (2006)

    Google Scholar 

  7. Abraham, R., Erwig, M.: Goaldebug: a spreadsheet debugger for end users. In: ICSE 2007: Proceedings of the 29th International Conference on Software Engineering, pp. 251–260. IEEE Computer Society, Washington, DC (2007)

    Google Scholar 

  8. Abraham, R., Erwig, M.: Mutation operators for spreadsheets. IEEE Trans. Softw. Eng. 35(1), 94–108 (2009)

    Article  Google Scholar 

  9. Abraham, R., Erwig, M., Kollmansberger, S., Seifert, E.: Visual specifications of correct spreadsheets. In: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2005, pp. 189–196. IEEE Computer Society (2005)

    Google Scholar 

  10. Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques and Tools. Addison Wesley, Reading (1986)

    Google Scholar 

  11. Alhajj, R.: Extracting the extended entity-relationship model from a legacy relational database. Inf. Syst. 28(6), 597–618 (2003)

    Article  MATH  Google Scholar 

  12. Alves, T.L., Silva, P.F., Visser, J.: Constraint-aware schema transformation. Electron. Notes Theor. Comput. Sci. 290, 3–18 (2012)

    Article  Google Scholar 

  13. Bricklin, D.: VisiCalc: Information from its creators, Dan Bricklin and Bob Frankston. http://www.bricklin.com/visicalc.htm. Accessed 5 Dec 2013

  14. Bruins, E.: On Plimpton 322. Pythagorean numbers in Babylonian mathematics. Koninklijke Nederlandse Akademie van Wetenschappen 52, 629–632 (1949)

    MATH  MathSciNet  Google Scholar 

  15. Burnett, M., Cook, C., Pendse, O., Rothermel, G., Summet, J., Wallace, C.: End-user software engineering with assertions in the spreadsheet paradigm. In: Proceedings of the 25th International Conference on Software Engineering, ICSE 2003, pp. 93–103. IEEE Computer Society (2003)

    Google Scholar 

  16. Campbell-Kelly, M., Croarken, M., Flood, R., Robson, E.: The History of Mathematical Tables: From Sumer to Spreadsheets. Oxford University Press, Oxford (2003)

    Book  Google Scholar 

  17. Codd, E.F.: A relational model of data for large shared data banks. Commun. ACM 13(6), 377–387 (1970)

    Article  MATH  Google Scholar 

  18. Cunha, A., Oliveira, J.N., Visser, J.: Type-safe two-level data transformation. In: Misra, J., Nipkow, T., Sekerinski, E. (eds.) FM 2006. LNCS, vol. 4085, pp. 284–299. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  19. Cunha, J., Erwig, M., Saraiva, J.: Automatically inferring classsheet models from spreadsheets. In: IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2010, pp. 93–100. IEEE Computer Society (2010)

    Google Scholar 

  20. Cunha, J., Fernandes, J., Mendes, J., Saraiva, J.: Embedding, evolution, and validation of model-driven spreadsheets. IEEE Trans. Software Eng. PP(99), 1 (2014)

    Google Scholar 

  21. Cunha, J., Fernandes, J.P., Ribeiro, H., Saraiva, J.: Towards a catalog of spreadsheet smells. In: Murgante, B., Gervasi, O., Misra, S., Nedjah, N., Rocha, A.M.A.C., Taniar, D., Apduhan, B.O. (eds.) ICCSA 2012, Part IV. LNCS, vol. 7336, pp. 202–216. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  22. Cunha, J., Fernandes, J.P., Mendes, J., Martins, P., Saraiva, J.: Smellsheet detective: a tool for detecting bad smells in spreadsheets. In: Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, VLHCC 2012, pp. 243–244. IEEE Computer Society, Washington, DC (2012)

    Google Scholar 

  23. Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: Extension and implementation of ClassSheet models. In: Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, VLHCC 2012, pp. 19–22. IEEE Computer Society (2012)

    Google Scholar 

  24. Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: MDSheet: a framework for model-driven spreadsheet engineering. In: Proceedings of the 34th International Conference on Software Engineering, ICSE 2012, pp. 1412–1415. ACM (2012)

    Google Scholar 

  25. Cunha, J., Fernandes, J.P., Saraiva, J.: From relational ClassSheets to UML+OCL. In: Proceedings of the Software Engineering Track at the 27th Annual ACM Symposium on Applied Computing, pp. 1151–1158. ACM (2012)

    Google Scholar 

  26. Cunha, J., Mendes, J., Fernandes, J.P., Saraiva, J.: Embedding and evolution of spreadsheet models in spreadsheet systems. In: Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, VLHCC 2011, pp. 186–201. IEEE (2011)

    Google Scholar 

  27. Cunha, J., Saraiva, J., Visser, J.: Model-based programming environments for spreadsheets. Sci. Comput. Program. (SCP) 96, 254–275 (2014)

    Article  Google Scholar 

  28. Cunha, J., Visser, J., Alves, T., Saraiva, J.: Type-safe evolution of spreadsheets. In: Giannakopoulou, D., Orejas, F. (eds.) FASE 2011. LNCS, vol. 6603, pp. 186–201. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  29. Engels, G., Erwig, M.: ClassSheets: automatic generation of spreadsheet applications from object-oriented specifications. In: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, pp. 124–133. ACM (2005)

    Google Scholar 

  30. Erdweg, S., et al.: The state of the art in language workbenches. In: Erwig, M., Paige, R.F., Van Wyk, E. (eds.) SLE 2013. LNCS, vol. 8225, pp. 197–217. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  31. Erwig, M.: Software engineering for spreadsheets. IEEE Softw. 29(5), 25–30 (2009)

    Article  Google Scholar 

  32. Erwig, M., Abraham, R., Cooperstein, I., Kollmansberger, S.: Automatic generation and maintenance of correct spreadsheets. In: Proceedings of the 27th International Conference on Software Engineering, pp. 136–145. ACM (2005)

    Google Scholar 

  33. Erwig, M., Abraham, R., Kollmansberger, S., Cooperstein, I.: Gencel: a program generator for correct spreadsheets. J. Funct. Program. 16(3), 293–325 (2006)

    Article  MATH  Google Scholar 

  34. Erwig, M., Burnett, M.: Adding apples and oranges. In: Adsul, B., Ramakrishnan, C.R. (eds.) PADL 2002. LNCS, vol. 2257, pp. 173–191. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  35. Fisher II, M., Cao, M., Rothermel, G., Cook, C., Burnett, M.: Automated test case generation for spreadsheets. In: Proceedings of the 24th International Conference on Software Engineering (ICSE 2002), pp. 141–154. ACM Press, New York, 19–25 May 2002

    Google Scholar 

  36. Fisher II, M., Rothermel, G., Brown, D., Cao, M., Cook, C., Burnett, M.: Integrating automated test generation into the WYSIWYT spreadsheet testing methdology. ACM Trans. Softw. Eng. Methodol. 15(2), 150–194 (2006)

    Article  Google Scholar 

  37. Fisher II, M., Rothermel, G., Creelan, T., Burnett, M.: Scaling a dataflow testing methodology to the multiparadigm world of commercial spreadsheets. In: Proceedings of the 17th IEEE International Symposium on Software Reliability Engineering, Raleigh, NC, USA, pp. 13–22, November 2006

    Google Scholar 

  38. Gibbons, J.: Functional programming for domain-specific languages. In: Zsok, V. (ed.) Central European Functional Programming - Summer School on Domain-Specific Languages, July 2013

    Google Scholar 

  39. Hermans, F., Pinzger, M., van Deursen, A.: Automatically extracting class diagrams from spreadsheets. In: D’Hondt, T. (ed.) ECOOP 2010. LNCS, vol. 6183, pp. 52–75. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  40. Hermans, F., Pinzger, M., van Deursen, A.: Supporting professional spreadsheet users by generating leveled dataflow diagrams. In: Proceedings of the 33rd International Conference on Software Engineering, ICSE 2011, pp. 451–460. ACM (2011)

    Google Scholar 

  41. Hermans, F., Pinzger, M., van Deursen, A.: Detecting and visualizing inter-worksheet smells in spreadsheets. In: Proceedings of the 2012 International Conference on Software Engineering, ICSE 2012, pp. 441–451. IEEE Press (2012)

    Google Scholar 

  42. Hermans, F., Pinzger, M., van Deursen, A.: Detecting code smells in spreadsheet formulas. In: ICSM, pp. 409–418 (2012)

    Google Scholar 

  43. Hinze, R., Löh, A., Oliveira, B.C.S.: “Scrap your boilerplate” reloaded. In: Hagiya, M. (ed.) FLOPS 2006. LNCS, vol. 3945, pp. 13–29. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  44. Hudak, P.: Building domain-specific embedded languages. ACM Comput. Surv. 28(4es), 196 (1996)

    Article  Google Scholar 

  45. Jones, S.P., Blackwell, A., Burnett, M.: A user-centred approach to functions in excel. In: Proceedings of the 8th ACM SIGPLAN International Conference on Functional Programming, ICFP 2003, pp. 165–176. ACM (2003)

    Google Scholar 

  46. Kankuzi, B., Sajaniemi, J.: An empirical study of spreadsheet authors’ mental models in explaining and debugging tasks. In: 2013 IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2013, pp. 15–18 (2013)

    Google Scholar 

  47. Kuiper, M., Saraiva, J.: Lrc - a generator for incremental language-oriented tools. In: Koskimies, K. (ed.) CC 1998. LNCS, vol. 1383, pp. 298–301. Springer, Heidelberg (1998)

    Chapter  Google Scholar 

  48. Lämmel, R., Visser, J.: A Strafunski application letter. In: Dahl, V. (ed.) PADL 2003. LNCS, vol. 2562, pp. 357–375. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  49. Lämmel, R., Saraiva, J., Visser, J. (eds.): GTTSE 2005. LNCS, vol. 4143. Springer, Heidelberg (2006)

    Google Scholar 

  50. Luckey, M., Erwig, M., Engels, G.: Systematic evolution of model-based spreadsheet applications. J. Vis. Lang. Comput. 23(5), 267–286 (2012)

    Article  Google Scholar 

  51. Maier, D.: The Theory of Relational Databases. Computer Science Press, Rockville (1983)

    MATH  Google Scholar 

  52. Morgan, C., Gardiner, P.: Data refinement by calculation. Acta Inform. 27, 481–503 (1990)

    Article  MATH  MathSciNet  Google Scholar 

  53. Nardi, B.A.: A Small Matter of Programming: Perspectives on End User Computing, 1st edn. MIT Press, Cambridge (1993)

    Google Scholar 

  54. Oliveira, J.: A reification calculus for model-oriented software specification. Form. Asp. Comput. 2(1), 1–23 (1990)

    Article  MATH  Google Scholar 

  55. Oliveira, J.N.: Transforming data by calculation. In: Lämmel, R., Visser, J., Saraiva, J. (eds.) Generative and Transformational Techniques in Software Engineering II. LNCS, vol. 5235, pp. 134–195. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  56. Panko, R.R.: What we know about spreadsheet errors. J. End User Comput. (Special issue on Scaling Up End User Development) 10(2), 15–21 (1998)

    Google Scholar 

  57. Panko, R.R.: Spreadsheet errors: what we know. what we think we can do. In: Proceedings of the European Spreadsheet Risks Interest Group (EuSpRIG) (2000)

    Google Scholar 

  58. Panko, R.R.: Facing the problem of spreadsheet errors. Decis. Line 37(5), 8–10 (2006)

    Google Scholar 

  59. Panko, R.R., Aurigemma, S.: Revising the panko-halverson taxonomy of spreadsheet errors. Decis. Support Syst. 49(2), 235–244 (2010)

    Article  Google Scholar 

  60. Panko, R.R., Ordway, N.: Sarbanes-Oxley: What About all the Spreadsheets? CoRR abs/0804.0797 (2008)

    Google Scholar 

  61. Peyton Jones, S., Washburn, G., Weirich, S.: Wobbly types: type inference for generalised algebraic data types. Technical report, MS-CIS-05-26, University of Pennsylvania, July 2004

    Google Scholar 

  62. Powell, S.G., Baker, K.R., Lawson, B.: A critical review of the literature on spreadsheet errors. Decis. Support Syst. 46(1), 128–138 (2008)

    Article  Google Scholar 

  63. Rajalingham, K., Chadwick, D.R., Knight, B.: Classification of spreadsheet errors. In: Proceedings of the 2001 European Spreadsheet Risks Interest Group, EuSpRIG 2001, Amsterdam (2001)

    Google Scholar 

  64. Reinhart, C.M., Rogoff, K.S.: Growth in a time of debt. Am. Econ. Rev. 100(2), 573–578 (2010)

    Article  Google Scholar 

  65. Robson, E.: Neither Sherlock Holmes nor Babylon: a reassessment of Plimpton 322. Historia Mathematica 28(3), 167–206 (2001)

    Article  MATH  MathSciNet  Google Scholar 

  66. Rothermel, G., Burnett, M., Li, L., Sheretov, A.: A methodology for testing spreadsheets. ACM Trans. Softw. Eng. Methodol. 10, 110–147 (2001)

    Article  Google Scholar 

  67. Ruthruff, J., Creswick, E., Burnett, M., Cook, C., Prabhakararao, S., Fisher II, M., Main, M.: End-user software visualizations for fault localization. In: Proceedings of the ACM Symposium on Software Visualization, San Diego, CA, USA, pp. 123–132, June 2003

    Google Scholar 

  68. Scaffidi, C., Shaw, M., Myers, B.: Estimating the numbers of end users and end user programmers. In: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, pp. 207–214 (2005)

    Google Scholar 

  69. Stevens, P., Whittle, J., Booch, G. (eds.): UML 2003. LNCS, vol. 2863. Springer, Heidelberg (2003)

    Google Scholar 

  70. Swierstra, D., Azero, P., Saraiva, J.: Designing and implementing combinator languages. In: Swierstra, S.D., Oliveira, J.N. (eds.) AFP 1998. LNCS, vol. 1608, pp. 150–206. Springer, Heidelberg (1999)

    Chapter  Google Scholar 

  71. Ullman, J.D., Widom, J.: A First Course in Database Systems. Prentice Hall, Upper Saddle River (1997)

    Google Scholar 

  72. Ullman, J.: Principles of Database and Knowledge-Base Systems, vol. I. Computer Science Press, Rockville (1988)

    Google Scholar 

  73. Visser, E.: A survey of strategies in rule-based program transformation systems. J. Symbolic Comput. 40, 831–873 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  74. Visser, J., Saraiva, J.: Tutorial on strategic programming across programming paradigms. In: 8th Brazilian Symposium on Programming Languages, Niteroi, Brazil, May 2004

    Google Scholar 

Download references

Acknowledgments

The theories, techniques and tools presented in this tutorial paper were developed under the project SSaaPP - SpreadSheets as a Programming Paradigm: a research project funded by the Portuguese Science Foundation (contract number FCOMP-01-0124-FEDER-010048). We would like to thank the members and consultants of this project who made important contributions for the results presented in this document, namely: Rui Maranhão Abreu, Tiago Alves, Laura Beckwith, Orlando Belo, Martin Erwig, Pedro Martins, Hugo Pacheco, Christophe Peixoto, Rui Pereira, Alexandre Perez, Hugo Ribeiro, André Riboira, André Silva, and Joost Visser.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to João Saraiva .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J. (2015). Spreadsheet Engineering. In: Zsók, V., Horváth, Z., Csató, L. (eds) Central European Functional Programming School. CEFP 2013. Lecture Notes in Computer Science(), vol 8606. Springer, Cham. https://doi.org/10.1007/978-3-319-15940-9_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-15940-9_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15939-3

  • Online ISBN: 978-3-319-15940-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics