Utility and accuracy of smell-driven performance analysis for end-user programmers☆
Introduction
Professional software engineers have long known that the performance of a program can have a powerful effect on its value. Some end-user programmers also share this concern about performance. For example, empirical studies have revealed that scientists tend to rely on textual general-purpose languages such as C++ or Fortran for high-performance computing, despite the availability of domain-specific languages designed for high usability [2], [10], [22].
LabVIEW is one example of a programming language that has high usability but sometimes leads to programs with inadequate performance. Its maker, National Instruments, claims LabVIEW is the “most widely used development environment for instrument connectivity and [hardware] test application,” particularly among engineers and scientists [12]. An independent survey of LabVIEW users investigated the reasons for this high level of adoption, finding that users appreciated LabVIEW primarily due to its visual dataflow language and secondarily due to its support for code reuse [26]. One respondent summarized, “The development time for LabVIEW is less than half that for C,” while another claimed to be “3X more productive than programming in C”. Yet at the same time, survey respondents sometimes found LabVIEW performance to be inadequate. They expressed concerns that LabVIEW offered no way to optimize usage of registers, memory, disk, and CPU. The researchers conducting the survey noted, “LabVIEW solves these problems by allowing the programmer to call code written in other languages”—in other words, the “solution” to the problem is to step outside LabVIEW.
These challenges are not unique to LabVIEW. Nardi lists HP VEE and Prograph as two other canonical examples of visual dataflow languages [16], where data input/output nodes are connected to one another via computation nodes and virtual dataflow “wires.” Scientists and engineers using these languages (e.g., [11], [23]) also encountered performance problems. To solve such a problem, “Parts of the application were written in the C language and linked to the VEE application to obtain higher throughput” [11].
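To make the setting concrete, a visual dataflow program of the kind described above can be modeled as a directed graph of nodes joined by wires. The sketch below is purely illustrative — the `Node`, `Wire`, and `Diagram` classes and their fields are our own invention for exposition, not part of LabVIEW, VEE, or Prograph:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A data input/output or computation node in a dataflow program."""
    name: str
    kind: str  # e.g. "input", "compute", "output"

@dataclass
class Wire:
    """A directed dataflow 'wire' carrying data between two nodes."""
    src: Node
    dst: Node

@dataclass
class Diagram:
    """A dataflow program: a directed graph of nodes and wires."""
    nodes: list = field(default_factory=list)
    wires: list = field(default_factory=list)

    def downstream(self, node):
        """Nodes that directly consume this node's output."""
        return [w.dst for w in self.wires if w.src is node]

# Wiring a sensor input through a filter node to a display node:
sensor = Node("sensor", "input")
filt = Node("lowpass", "compute")
disp = Node("chart", "output")
d = Diagram([sensor, filt, disp], [Wire(sensor, filt), Wire(filt, disp)])
assert [n.name for n in d.downstream(sensor)] == ["lowpass"]
```

Analyses like those discussed below operate over exactly this kind of graph structure, rather than over textual source code.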
In short, these empirical studies reveal a way in which visual dataflow languages can present a “low ceiling” over what users can accomplish [15]. To date, researchers have mainly considered a language's ceiling from the standpoint of functionality. For example, one paper discussed how end-user programmers might encounter a ceiling when they try to create custom widgets [15], while another discussed the ceiling that people encounter when they move from animating 2D images to animating 3D images [20]. In contrast, the empirical studies above reveal a ceiling in terms of how well programs carry out functions, rather than in terms of what functions are carried out. Helping end-user programmers to overcome performance problems and break through this ceiling – without leaving the visual dataflow language – is an unexplored challenge.
In this paper, we propose a new technique called smell-driven performance analysis (SDPA) aimed at meeting this challenge. The technique builds on the established concept of a “bad smell,” a heuristic for finding sections of code that function correctly but have poor quality [5], and applies this concept to the visual dataflow languages favored by some end-user programmer populations. Specifically, smell-driven performance analysis involves (1) statically analyzing programs to heuristically detect areas with potential performance problems; (2) optionally combining the static analysis results with a profiling-based dynamic analysis (smell-driven profiling) that identifies areas of the code consuming significant execution time (“hotspots”); (3) alerting end-user programmers about the problems; and (4) advising on how to fix them. The feedback is provided through situated explanations inside the language's IDE, so users need not switch windows to find the information and can remain in the visual language and IDE they are comfortable working in.
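As a rough illustration of how the static and dynamic passes of SDPA could fit together, consider the hypothetical sketch below. The node encoding, the two smell rules, and the 10% CPU threshold are all assumptions made for this example; the actual smell catalog and detection rules are grounded in the formative study described later in the paper.

```python
# Illustrative sketch of smell-driven performance analysis (SDPA):
# a static pass flags smelly nodes; an optional dynamic pass keeps
# only flags whose nodes are runtime hotspots. The node format,
# smell rules, and threshold are assumptions, not the paper's exact rules.

def detect_smells(nodes):
    """Static pass: flag nodes matching heuristic smell rules."""
    smells = []
    for node in nodes:
        # Example rule: growing an array inside a loop forces
        # repeated reallocation and copying of memory.
        if node["op"] == "build-array" and node["in_loop"]:
            smells.append((node["id"], "array grown inside loop"))
        # Example rule: an unthrottled loop can saturate a CPU core.
        if node["op"] == "while-loop" and not node.get("has_wait"):
            smells.append((node["id"], "loop without wait primitive"))
    return smells

def smell_driven_profiling(smells, profile, cpu_threshold=0.10):
    """Dynamic pass: keep only smells located at runtime hotspots,
    filtering out likely false positives from the static pass."""
    return [(nid, msg) for nid, msg in smells
            if profile.get(nid, 0.0) >= cpu_threshold]

nodes = [
    {"id": "n1", "op": "build-array", "in_loop": True},
    {"id": "n2", "op": "while-loop", "in_loop": False, "has_wait": True},
    {"id": "n3", "op": "build-array", "in_loop": False},
]
profile = {"n1": 0.62, "n2": 0.05}  # fraction of CPU time per node

flags = detect_smells(nodes)                      # static flag on n1 only
confirmed = smell_driven_profiling(flags, profile)
assert confirmed == [("n1", "array grown inside loop")]
```

The key design point is that the static pass alone can report a smell even in code that never runs hot, while the profiling pass trades that completeness for precision by demanding runtime evidence.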
To test this approach, we created a proof-of-concept tool that applies our proposed technique to LabVIEW. To assess utility, we conducted a user study with skilled LabVIEW programmers, involving troubleshooting tasks derived directly from real-world programs posted in online forums. The experiment showed that participants were more successful at finding performance problems with our prototype than without it and, having found a problem, were faster at finding the correct solution. To assess the accuracy of this approach at finding non-trivial performance problems, we retrieved a corpus of real programs that users had posted to the LabVIEW online forums with requests for help in improving performance. We found that most of the issues our prototype identified did, in fact, indicate real cases where users' code structures caused non-trivial impacts on performance in terms of CPU or memory usage. Activating the dynamic analysis enhancement of our technique further improved accuracy.
The remainder of this paper is organized as follows. Section 2 presents a formative study aimed at grounding our approach. Section 3 outlines the techniques of our approach in general terms, and Section 4 describes the specific implementation we created for evaluating the approach with LabVIEW. Section 5 presents the user study that showed high potential utility for users, while Section 6 discusses a second study showing that the approach accurately finds performance problems in a wide range of real user programs. Sections 7, 8, and 9 cover threats to validity, related work, and conclusions with future work, respectively.
Section snippets
Preliminary formative study
In our formative study, we interviewed LabVIEW technical support personnel to investigate the problems that LabVIEW users encounter. We were particularly interested in whether performance problems primarily result from how end-user programmers implement their programs.
Smell-driven performance analysis
The formative study revealed that many performance problems do result from end-user programmers’ code. Below, we describe our technique at the conceptual level, from the standpoint of an end-user programmer who needs help with solving a performance problem. We defer discussion of technical implementation details to the next section.
Prototype tool for LabVIEW
We have implemented a prototype tool to apply our technique on LabVIEW code. This tool is fully integrated with the version of the LabVIEW environment currently under development at National Instruments. This integration enabled us to evaluate the effectiveness of our technique in a realistic end-user programming environment. Even though we used LabVIEW for this work, we see no reason why we could not create similar tools for other visual dataflow languages, although presumably the specific…
Evaluation of utility
We performed a laboratory study to evaluate the utility of the static SDPA technique implemented in our prototype, as well as to uncover opportunities for future prototype enhancements. During this within-subjects study, 13 people performed tasks with our tool, and the same 13 people performed comparable tasks without the tool. The tasks and tools were counterbalanced to cancel any learning- and task-related effects.
Evaluation of accuracy
To assess how accurately the approach identifies performance problems in general, we gathered a corpus of real-world LabVIEW programs on which to run the tool. The source for these programs was the LabVIEW online forums (http://forums.ni.com/t5/LabVIEW/bd-p/170), where users post questions about how to create or improve programs. While many LabVIEW programs are available on the forums, the parameters of this study required a focused search for programs that had…
Threats to validity
The primary threat to validity is that the participants in our evaluation may have been atypical of most LabVIEW programmers, or that our programming tasks may have been atypical of real-world tasks. We intentionally recruited from a sample frame of people (AEs) who were likely to be able to find and solve performance problems on their own, without our tool. Yet even they found the tool helpful, so we anticipate that typical LabVIEW programmers would benefit even more from our tool than these participants did.
Related work
Our technique fits within the well-established area of tool support for software performance engineering [28], which includes performance testing and optimization. The novelty of our work is in its orientation toward the needs of end-user programmers, with a particular emphasis on providing situated explanations within a visual dataflow language. We grounded our technique in empirical formative research that led to our technique for identifying bad smells through static analysis. Other tools…
Conclusions and future work
In this paper, we have presented the technique of smell-driven performance analysis, which helps end-user programmers to discover and solve performance problems in the source code of a visual dataflow language. The approach accomplishes this by providing situated explanations within the visual dataflow language’s IDE. We have developed a prototype implementation for LabVIEW and integrated it with the existing LabVIEW IDE.
We performed a formative study that confirmed our first hypothesis that…
Acknowledgments
We thank National Instruments for funding this research, helping to recruit study participants, and providing access to the latest version of the LabVIEW development environment. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of National Instruments.
References (30)
- et al., Usability analysis of visual programming environments: a ‘cognitive dimensions’ framework, J. Visual Lang. Comput. (1996)
- et al., Visual programming in the wild: a survey of LabVIEW programmers, J. Visual Lang. Comput. (2001)
- Bodik, P., Goldszmidt, M., Fox, A., Woodard, D., Andersen, H. (2010) Fingerprinting the datacenter: automated...
- Carver, J., Kendall, R., Squires, S., Post, D. (2007) Software development environments for scientific and engineering...
- Dorn, B. (2011) ScriptABLE: supporting informal learning with cases, in: Proceedings of the Seventh International...
- Fischer, G. (1987) A critic for LISP, in: 10th International Joint Conference on Artificial Intelligence,...
- et al., Refactoring: Improving the Design of Existing Code (1999)
- Han, S., Dang, Y., Ge, S., Zhang, D., Xie, T. (2012) Performance debugging in the large via mining millions of stack...
- Hassan, O., Ramaswamy, L., Miller, J. (2010) Enhancing scalability and performance of mashups through merging and...
- et al., Visualizing the performance of parallel programs, IEEE Softw. (1991)
- Obstacles and opportunities with using visual and domain-specific languages in scientific programming, IEEE Symp. Visual Lang. Hum. Centr. Comput.
- RF monitoring system in the Injector Linac, Int. Conf. Accel. Large Exp. Phys. Control Syst.
- Comparing LabVIEW graphical code to text-based alternatives for use in test applications, IEEE Autotestcon
- Content Analysis: An Introduction to Its Methodology
☆ An earlier version of this paper appeared as the following: Chambers, C., and Scaffidi, C. (2013) Smell-driven performance analysis for end-user programmers, IEEE Symposium on Visual Languages and Human-Centric Computing.
The current paper expands on the earlier paper by providing (1) an enhanced detection method called smell-driven profiling that also incorporates runtime analysis in addition to the static analysis presented in the earlier work, and (2) a new study assessing how accurately the original and enhanced methods can identify non-trivial performance problems.