Abstract
Model-finders such as SAT-solvers are attractive for producing concrete models, either as sample instances or as counterexamples when properties fail. However, the generated model is arbitrary. To address this, several research efforts have proposed principled forms of output from model-finders. These include minimal and maximal models, unsat cores, and proof-based provenance of facts.
While these methods enjoy elegant mathematical foundations, they have not been subjected to rigorous evaluation on users to assess their utility. This paper presents user studies of these three forms of output performed on advanced students. We find that most of the output forms fail to be effective, and in some cases even actively mislead users. To make such studies feasible to run frequently and at scale, we also show how we can pose such studies on the crowdsourcing site Mechanical Turk.
Keywords
This is a preview of subscription content, log in via an institution.
Notes
- 1.
We tried to conduct a study at the ABZ conference (which has exactly the expertise we need), handing out well over a hundred brief surveys on paper and electronically over several days. Sadly, we received only two responses.
- 2.
Here, “coding” denotes classifying responses, not the colloquial term for programming.
- 3.
Only one author coded the free-form explanations into the 0–3 possible categories; thus, no inter-coder-reliability is reported. This is reasonable because the objective nature of having students give explanations along the different blame categories suggests a low likelihood of inaccurate coding.
- 4.
We did try to find Alloy users on MTurk. However, in twice the time it took to complete the studies of this section, we received at most 8 valid responses.
References
Aitken, S., Gray, P., Melham, T., Thomas, M.: Interactive theorem proving: an empirical study of user activity. J. Symb. Comput. 25(2), 263–284 (1998)
Akhawe, D., Barth, A., Lam, P., Mitchell, J., Song, D.: Towards a formal foundation of web security. In: IEEE Computer Security Foundations Symposium (2010)
Beckert, B., Grebing, S., Böhl, F.: How to put usability into focus: using focus groups to evaluate the usability of interactive theorem provers. In: Workshop on User Interfaces for Theorem Provers (UITP) (2014)
Beckert, B., Grebing, S., Böhl, F.: A usability evaluation of interactive theorem provers using focus groups. In: Workshop on Human Oriented Formal Methods (HOFM) (2014)
Bry, F., Yahya, A.: Positive unit hyperresolution tableaux and their application to minimal model generation. J. Autom. Reason. 25(1), 35–82 (2000)
Cunha, A., Macedo, N., Guimarães, T.: Target oriented relational model finding. In: Gnesi, S., Rensink, A. (eds.) FASE 2014. LNCS, vol. 8411, pp. 17–31. Springer, Heidelberg (2014). doi:10.1007/978-3-642-54804-8_2
D’Antoni, L., Kini, D., Alur, R., Gulwani, S., Viswanathan, M., Hartmann, B.: How can automatic feedback help students construct automata? Trans. Comput. Hum. Interact. 22(2), March 2015
DeOrio, A., Bertacco, V.: Human computing for EDA. In: Proceedings of the 46th Annual Design Automation Conference, pp. 621–622 (2009)
Doghmi, S.F., Guttman, J.D., Thayer, F.J.: Searching for shapes in cryptographic protocols. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 523–537. Springer, Heidelberg (2007). doi:10.1007/978-3-540-71209-1_41
Fagin, R., Ullman, J.D., Vardi, M.Y.: On the semantics of updates in databases. In: Principles of Database Systems (PODS), pp. 352–365. ACM (1983)
Fu, Z., Malik, S.: On solving the partial MAX-SAT problem. In: Biere, A., Gomes, C.P. (eds.) SAT 2006. LNCS, vol. 4121, pp. 252–265. Springer, Heidelberg (2006). doi:10.1007/11814948_25
Ghoniem, M., Fekete, J.D., Castagliola, P.: A comparison of the readability of graphs using node-link and matrix-based representations. In: Information Visualization (INFOVIS) (2004)
Gould, S., Cox, A.L., Brumby, D.P.: Diminished control in crowdsourcing: an investigation of crowdworker multitasking behavior. Trans. Comput. Hum. Interact. 23, 19:1–19:29 (2016)
Hentschel, M., Hähnle, R., Bubel, R.: An empirical evaluation of two user interfaces of an interactive program verifier. In: International Conference on Automated Software Engineering (2016)
Herman, G.L., Kaczmarczyk, L.C., Loui, M.C., Zilles, C.B.: Proof by incomplete enumeration and other logical misconceptions. In: International Computing Education Research Workshop, ICER, pp. 59–70 (2008)
Jackson, D.: Software Abstractions: Logic, Language, and Analysis. MIT Press, Cambridge (2012)
Janota, M.: SAT solving in interactive configuration. Ph.D. thesis, University College Dublin (2010)
Kittur, A., Chi, E.H., Suh, B.: Crowdsourcing user studies with Mechanical Turk. In: Conference on Human Factors in Computing Systems (CHI) (2008)
Koshimura, M., Nabeshima, H., Fujita, H., Hasegawa, R.: Minimal model generation with respect to an atom set. In: First-Order Theorem Proving (FTP), p. 49 (2009)
Maldonado-Lopez, F.A., Chavarriaga, J., Donoso, Y.: Detecting network policy conflicts using Alloy. In: International Conference on Abstract State Machines, Alloy, B, and Z (2014)
Maoz, S., Ringert, J.O., Rumpe, B.: CD2Alloy: class diagrams analysis using Alloy revisited. In: Model Driven Engineering Languages and Systems (2011)
Maoz, S., Ringert, J.O., Rumpe, B.: CDDiff: semantic differencing for class diagrams. In: European Conference on Object Oriented Programming (2011)
Mason, W., Suri, S.: Conducting behavioral research on Amazon’s Mechanical Turk. Behav. Res. Methods 44(1), 1–23 (2012)
McCune, W.: Mace4 reference manual and guide. arXiv preprint cs/0310055 (2003)
Munzner, T.: Visualization Analysis and Design. CRC Press (2014)
Nelson, T., Danas, N., Dougherty, D.J., Krishnamurthi, S.: The power of “why” and “why not”: enriching scenario exploration with provenance. In: Foundations of Software Engineering (2017)
Nelson, T., Saghafi, S., Dougherty, D.J., Fisler, K., Krishnamurthi, S.: Aluminum: principled scenario exploration through minimality. In: ICSE, pp. 232–241 (2013)
Nelson, T., Barratt, C., Dougherty, D.J., Fisler, K., Krishnamurthi, S.: The Margrave tool for firewall analysis. In: Large Installation System Administration Conference (2010)
Niemelä, I.: A tableau calculus for minimal model reasoning. In: Miglioli, P., Moscato, U., Mundici, D., Ornaghi, M. (eds.) TABLEAUX 1996. LNCS, vol. 1071, pp. 278–294. Springer, Heidelberg (1996). doi:10.1007/3-540-61208-4_18
Ottley, A., Peck, E.M., Harrison, L.T., Afergan, D., Ziemkiewicz, C., Taylor, H.A., Han, P.K., Chang, R.: Improving Bayesian reasoning: the effects of phrasing, visualization, and spatial ability. Vis. Comput. Graph. 22(1), 529–538 (2016)
Peer, E., Vosgerau, J., Acquisti, A.: Reputation as a sufficient condition for data quality on Amazon Mechanical Turk. Behav. Res. Methods 46(4), 1023–1031 (2014)
Robinson, A., Voronkov, A.: Handbook of Automated Reasoning, vol. 1. Elsevier, Amsterdam (2001)
Ruchansky, N., Proserpio, D.: A (not) NICE way to verify the OpenFlow switch specification: formal modelling of the OpenFlow switch using Alloy. ACM Comput. Commun. Rev. 43(4), 527–528 (2013)
Saghafi, S., Danas, R., Dougherty, D.J.: Exploring theories with a model-finding assistant. In: Felty, A.P., Middeldorp, A. (eds.) CADE 2015. LNCS, vol. 9195, pp. 434–449. Springer, Cham (2015). doi:10.1007/978-3-319-21401-6_30
Simons, D.J.: Current approaches to change blindness. Vis. Cogn. 7(1–3), 1–15 (2000)
Torlak, E., Chang, F.S.H., Jackson, D.: Finding minimal unsatisfiable cores of declarative specifications. In: International Symposium on Formal Methods (FM) (2008)
Wills, G.J.: Visual exploration of large structured datasets. In: Proceedings of New Techniques and Trends in Statistics (NTTS), pp. 237–246 (1997)
Acknowledgment
This work is partially supported by the US National Science Foundation.
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Danas, N., Nelson, T., Harrison, L., Krishnamurthi, S., Dougherty, D.J. (2017). User Studies of Principled Model Finder Output. In: Cimatti, A., Sirjani, M. (eds) Software Engineering and Formal Methods. SEFM 2017. Lecture Notes in Computer Science(), vol 10469. Springer, Cham. https://doi.org/10.1007/978-3-319-66197-1_11
Download citation
DOI: https://doi.org/10.1007/978-3-319-66197-1_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66196-4
Online ISBN: 978-3-319-66197-1
eBook Packages: Computer ScienceComputer Science (R0)