Utility and accuracy of smell-driven performance analysis for end-user programmers☆
Introduction
Professional software engineers have long known that the performance of a program can have a powerful effect on its value. Some end-user programmers also share this concern about performance. For example, empirical studies have revealed that scientists tend to rely on textual general-purpose languages such as C++ or Fortran for high-performance computing, despite the availability of domain-specific languages designed for high usability [2], [10], [22].
LabVIEW is one example of a programming language that has high usability but sometimes leads to programs with inadequate performance. Its maker, National Instruments, claims LabVIEW is the “most widely used development environment for instrument connectivity and [hardware] test application,” particularly among engineers and scientists [12]. An independent survey of LabVIEW users investigated the reasons for this high level of adoption, finding that users appreciated LabVIEW primarily due to its visual dataflow language and secondarily due to its support for code reuse [26]. One respondent summarized, “The development time for LabVIEW is less than half that for C,” while another claimed to be “3X more productive than programming in C”. Yet at the same time, survey respondents sometimes found LabVIEW performance to be inadequate. They expressed concerns that LabVIEW offered no way to optimize usage of registers, memory, disk, and CPU. The researchers conducting the survey noted, “LabVIEW solves these problems by allowing the programmer to call code written in other languages”—in other words, the “solution” to the problem is to step outside LabVIEW.
These challenges are not unique to LabVIEW. Nardi lists HP VEE and Prograph as two other canonical examples of visual dataflow languages [16], where data input/output nodes are connected to one another via computation nodes and virtual dataflow “wires.” Scientists and engineers using these languages (e.g., [11], [23]) also encountered performance problems. To solve such a problem, “Parts of the application were written in the C language and linked to the VEE application to obtain higher throughput” [11].
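To make the setting concrete, a visual dataflow program of the kind described above can be modeled as a directed graph of nodes joined by wires. The sketch below is purely illustrative — the `Node`, `Wire`, and `Diagram` classes and their fields are our own invention for exposition, not part of LabVIEW, VEE, or Prograph:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A data input/output or computation node in a dataflow program."""
    name: str
    kind: str  # e.g. "input", "compute", "output"

@dataclass
class Wire:
    """A directed dataflow 'wire' carrying data between two nodes."""
    src: Node
    dst: Node

@dataclass
class Diagram:
    """A dataflow program: a directed graph of nodes and wires."""
    nodes: list = field(default_factory=list)
    wires: list = field(default_factory=list)

    def downstream(self, node):
        """Nodes that directly consume this node's output."""
        return [w.dst for w in self.wires if w.src is node]

# Wiring a sensor input through a filter node to a display node:
sensor = Node("sensor", "input")
filt = Node("lowpass", "compute")
disp = Node("chart", "output")
d = Diagram([sensor, filt, disp], [Wire(sensor, filt), Wire(filt, disp)])
assert [n.name for n in d.downstream(sensor)] == ["lowpass"]
```

Analyses like those discussed below operate over exactly this kind of graph structure, rather than over textual source code.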
In short, these empirical studies reveal a way in which visual dataflow languages can present a “low ceiling” over what users can accomplish [15]. To date, researchers have mainly considered a language's ceiling from the standpoint of functionality. For example, one paper discussed how end-user programmers might encounter a ceiling when they try to create custom widgets [15], while another discussed the ceiling that people encounter when they move from animating 2D images to animating 3D images [20]. In contrast, the empirical studies above reveal a ceiling in terms of how well programs carry out functions, rather than in terms of what functions are carried out. Helping end-user programmers to overcome performance problems and break through this ceiling – without leaving the visual dataflow language – is an unexplored challenge.
In this paper, we propose a new technique called smell-driven performance analysis (SDPA) aimed at meeting this challenge. The technique builds on the established concept of a “bad smell,” a heuristic for finding sections of code that function correctly but have poor quality [5], and applies this concept to the visual dataflow languages favored by some end-user programmer populations. Specifically, smell-driven performance analysis involves (1) statically analyzing programs to heuristically detect areas with potential performance problems; (2) optionally combining the static analysis results with a profiling-based dynamic analysis (smell-driven profiling) that identifies areas of the code consuming significant execution time (“hotspots”); (3) alerting end-user programmers about the problems; and (4) advising on how to fix them. The feedback is provided through situated explanations inside the language's IDE, so users need not switch windows to find the information and can remain in the visual language and IDE they are comfortable working in.
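As a rough illustration of how the static and dynamic passes of SDPA could fit together, consider the hypothetical sketch below. The node encoding, the two smell rules, and the 10% CPU threshold are all assumptions made for this example; the actual smell catalog and detection rules are grounded in the formative study described later in the paper.

```python
# Illustrative sketch of smell-driven performance analysis (SDPA):
# a static pass flags smelly nodes; an optional dynamic pass keeps
# only flags whose nodes are runtime hotspots. The node format,
# smell rules, and threshold are assumptions, not the paper's exact rules.

def detect_smells(nodes):
    """Static pass: flag nodes matching heuristic smell rules."""
    smells = []
    for node in nodes:
        # Example rule: growing an array inside a loop forces
        # repeated reallocation and copying of memory.
        if node["op"] == "build-array" and node["in_loop"]:
            smells.append((node["id"], "array grown inside loop"))
        # Example rule: an unthrottled loop can saturate a CPU core.
        if node["op"] == "while-loop" and not node.get("has_wait"):
            smells.append((node["id"], "loop without wait primitive"))
    return smells

def smell_driven_profiling(smells, profile, cpu_threshold=0.10):
    """Dynamic pass: keep only smells located at runtime hotspots,
    filtering out likely false positives from the static pass."""
    return [(nid, msg) for nid, msg in smells
            if profile.get(nid, 0.0) >= cpu_threshold]

nodes = [
    {"id": "n1", "op": "build-array", "in_loop": True},
    {"id": "n2", "op": "while-loop", "in_loop": False, "has_wait": True},
    {"id": "n3", "op": "build-array", "in_loop": False},
]
profile = {"n1": 0.62, "n2": 0.05}  # fraction of CPU time per node

flags = detect_smells(nodes)                      # static flag on n1 only
confirmed = smell_driven_profiling(flags, profile)
assert confirmed == [("n1", "array grown inside loop")]
```

The key design point is that the static pass alone can report a smell even in code that never runs hot, while the profiling pass trades that completeness for precision by demanding runtime evidence.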
To test this approach, we created a proof-of-concept tool that applies our proposed technique to LabVIEW. To assess utility, we conducted a user study with skilled LabVIEW programmers, involving troubleshooting tasks derived directly from real-world programs posted in online forums. The experiment showed that participants were more successful at finding performance problems with our prototype than without it and, having found a problem, were faster at finding the correct solution. To assess the accuracy of this approach at finding non-trivial performance problems, we retrieved a corpus of real programs that users had posted to the LabVIEW online forums with requests for help in improving performance. We found that most of the issues our prototype identified did, in fact, indicate real cases where users' code structures caused non-trivial impacts on performance in terms of CPU or memory usage. Activating the dynamic analysis enhancement of our technique further improved accuracy.
The remainder of this paper is organized as follows. Section 2 presents a formative study aimed at grounding our approach. Section 3 outlines the techniques of our approach in general terms, and Section 4 describes the specific implementation we created for evaluating the approach with LabVIEW. Section 5 presents the user study that showed high potential utility for users, while Section 6 discusses a second study showing that the approach accurately finds performance problems in a wide range of real user programs. Sections 7, 8, and 9 cover threats to validity, related work, and conclusions with future work, respectively.
Section snippets
Preliminary formative study
In our formative study, we interviewed LabVIEW technical support personnel to investigate the problems that LabVIEW users encounter. We were particularly interested in whether performance problems primarily result from how end-user programmers implement their programs.
Smell-driven performance analysis
The formative study revealed that many performance problems do result from end-user programmers’ code. Below, we describe our technique at the conceptual level, from the standpoint of an end-user programmer who needs help with solving a performance problem. We defer discussion of technical implementation details to the next section.
Prototype tool for LabVIEW
We have implemented a prototype tool to apply our technique on LabVIEW code. This tool is fully integrated with the version of the LabVIEW environment currently under development at National Instruments. This integration enabled us to evaluate the effectiveness of our technique in a realistic end-user programming environment. Even though we used LabVIEW for this work, we see no reason why we could not create similar tools for other visual dataflow languages, although presumably the specific…
Evaluation of utility
We performed a laboratory study to evaluate the utility of the static SDPA technique implemented in our prototype, as well as to uncover opportunities for future prototype enhancements. During this within-subjects study, 13 people performed tasks with our tool, and the same 13 people performed comparable tasks without the tool. The tasks and tools were counterbalanced to cancel any learning- and task-related effects.
Evaluation of accuracy
To assess how accurately the approach identifies performance problems in general, we gathered a corpus of real-world LabVIEW programs on which to run the tool. The source for these programs was the LabVIEW online forums (http://forums.ni.com/t5/LabVIEW/bd-p/170), where users post questions about how to create or improve programs. While many LabVIEW programs are available on the forums, the parameters of this study required a focused search for programs that had…
Threats to validity
The primary threat to validity is that the participants in our evaluation may have been atypical of most LabVIEW programmers, or that our programming tasks may have been atypical of real-world tasks. We intentionally recruited from a sample frame of people (AEs) who were likely to be able to find and solve performance problems on their own, without our tool. Yet even they found the tool helpful, so we anticipate that typical LabVIEW programmers would benefit even more from our tool than these participants did.
Related work
Our technique fits within the well-established area of tool support for software performance engineering [28], which includes performance testing and optimization. The novelty of our work is in its orientation toward the needs of end-user programmers, with a particular emphasis on providing situated explanations within a visual dataflow language. We grounded our technique in empirical formative research that led to our technique for identifying bad smells through static analysis. Other tools…
Conclusions and future work
In this paper, we have presented the technique of smell-driven performance analysis, which helps end-user programmers to discover and solve performance problems in the source code of a visual dataflow language. The approach accomplishes this by providing situated explanations within the visual dataflow language’s IDE. We have developed a prototype implementation for LabVIEW and integrated it with the existing LabVIEW IDE.
We performed a formative study that confirmed our first hypothesis that…
Acknowledgments
We thank National Instruments for funding this research, helping to recruit study participants, and providing access to the latest version of the LabVIEW development environment. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of National Instruments.
References (30)
- et al., Usability analysis of visual programming environments: a ‘cognitive dimensions’ framework, J. Visual Lang. Comput. (1996)
- et al., Visual programming in the wild: a survey of LabVIEW programmers, J. Visual Lang. Comput. (2001)
- Bodik, P., Goldszmidt, M., Fox, A., Woodard, D., Andersen, H. (2010) Fingerprinting the datacenter: automated...
- Carver, J., Kendall, R., Squires, S., Post, D. (2007) Software development environments for scientific and engineering...
- Dorn, B. (2011) ScriptABLE: supporting informal learning with cases, in: Proceedings of the Seventh International...
- Fischer, G. (1987) A critic for LISP, in: 10th International Joint Conference on Artificial Intelligence,...
- et al., Refactoring: Improving the Design of Existing Code (1999)
- Han, S., Dang, Y., Ge, S., Zhang, D., Xie, T. (2012) Performance debugging in the large via mining millions of stack...
- Hassan, O., Ramaswamy, L., Miller, J. (2010) Enhancing scalability and performance of mashups through merging and...
- et al., Visualizing the performance of parallel programs, IEEE Softw. (1991)
- Obstacles and opportunities with using visual and domain-specific languages in scientific programming, IEEE Symp. Visual Lang. Hum. Centr. Comput.
- RF monitoring system in the Injector Linac, Int. Conf. Accel. Large Exp. Phys. Control Syst.
- Comparing LabVIEW graphical code to text-based alternatives for use in test applications, IEEE Autotestcon
- Content Analysis: An Introduction to Its Methodology
☆ An earlier version of this paper appeared as the following: Chambers, C., and Scaffidi, C. (2013) Smell-driven performance analysis for end-user programmers, IEEE Symposium on Visual Languages and Human-Centric Computing.
The current paper expands on the earlier paper by providing (1) an enhanced detection method called smell-driven profiling that also incorporates runtime analysis in addition to the static analysis presented in the earlier work, and (2) a new study assessing how accurately the original and enhanced methods can identify non-trivial performance problems.