Comparing manual and automated feature location in conceptual models: A controlled experiment
Introduction
Feature Location (FL) has been recognized as one of the most common activities undertaken by software developers [1]. During maintenance activities, developers need to identify where and how a feature (i.e., particular functionality) is realized in software artifacts in order to fix bugs, introduce new features, and adapt or enhance existing features.
FL is considered an important support activity during the development, management, and maintenance of software, since it is helpful for a number of software tasks such as feature coverage, software reuse, program comprehension, or impact analysis. These tasks are considered a good practice by numerous major software standards such as CMMI or ISO 15504 [2], and can be critical to the success of a project [3], since they lead to increased maintainability and reliability of complex software systems [4] and decrease the expected defect rate in developed software [5].
Furthermore, as reflected in a recent survey [6], FL is gaining momentum in the research community since it helps initiate Software Product Lines (SPLs) from already existing software systems. SPLs enable the systematic reuse of variants to tailor different products. Savings of $584 million in development costs, a 2x-4x reduction in time to market, and a reduction in maintenance costs of around 60% are among the documented real-world examples of the benefits of SPLs [7]. Hence, there is a need to adopt SPLs in companies that deal with complex software systems such as automotive, cyber-physical, and robotics systems [8]. To adopt an SPL, the located features are used to formalize the commonalities and variabilities across the product family; feature modeling [9] can be used for this formalization. In spite of the utility of FL, manual FL is a challenging activity in complex and large repositories of software artifacts that have been developed over several years by different developers [2], [10], [11]. In this context, FL activities become time-consuming and error-prone [12], [13], [14], [15].
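As a minimal illustration of what feature modeling captures, the sketch below encodes a toy feature model with mandatory and optional features and checks whether a product configuration is valid. The feature names are invented for illustration; real feature models also support alternative groups and cross-tree constraints, which are omitted here.

```python
# Hypothetical (invented) feature model for an induction-hob product family:
# the root feature has mandatory children (present in every product) and
# optional children (variability points across the family).
FEATURE_MODEL = {
    "InductionHob": {"mandatory": ["Inverter", "UserInterface"],
                     "optional": ["PowerBoost"]},
}

def valid_configuration(model, root, selected):
    """A configuration is valid if it contains the root, every mandatory
    child of the root, and no feature outside the model."""
    node = model[root]
    allowed = {root, *node["mandatory"], *node["optional"]}
    has_mandatory = all(f in selected for f in node["mandatory"])
    return root in selected and has_mandatory and set(selected) <= allowed

# Every product must include the inverter and the user interface.
ok = valid_configuration(FEATURE_MODEL, "InductionHob",
                         {"InductionHob", "Inverter", "UserInterface"})
bad = valid_configuration(FEATURE_MODEL, "InductionHob",
                          {"InductionHob", "Inverter"})  # missing UserInterface
print(ok, bad)
```

A configuration selecting only the mandatory features is valid; one that omits a mandatory feature is not.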
In order to reduce the effort of developers during manual FL, researchers have presented several techniques that provide automated assistance to locate features. A compendium of the most well-known techniques can be found in the survey by Julia Rubin and Marsha Chechik [16]. In the survey, the techniques are classified as static or dynamic (depending on whether they involve program execution information) and sub-classified as plain or guided (depending on whether they produce an output automatically or semi-automatically with user guidance). Most of the techniques focus on FL in source code and rely on Information Retrieval (IR) techniques to locate the features [1], [17], [18]. Among the many IR techniques, most research efforts report the best results when applying Latent Semantic Indexing (LSI) [1], [19], [20].
Despite the importance of FL and the existence of techniques for automated assistance, research efforts tend to compare automated techniques against each other, without comparing them against manual FL. In addition, automated techniques focus on code, neglecting other software artifacts such as models (which have been shown to increase efficiency and effectiveness in software development [21]). Thus, several important questions remain unanswered with regard to the differences in performance, productivity, and satisfaction when the manual and automated FL treatments are used to locate features in models.
To answer these questions, we conducted an experiment to compare manual FL against automated FL in models. Specifically, we recruited 18 subjects (5 experts and 13 non-experts) in the domain of our industrial partner, BSH, who has manufactured induction hobs (under the Siemens and Bosch brands, among others) for more than 15 years. For the manual FL treatment, the subjects manually located the model elements that realize a set of features using the name of each feature and models as the search space. For the automated FL treatment, we used an algorithm that leverages LSI to obtain the model elements that realize a feature description, provided by the subjects.
The experiment evaluated performance (recall, precision, and F-measure), productivity (the ratio between F-measure and time spent), and satisfaction (perceived ease of use, perceived usefulness, and intention to use). Manual FL obtains average values of 44.42% recall, 42.36% precision, 41.49% F-measure, 3.43%/min productivity, 3.42 Perceived Ease of Use (PEOU), 3.47 Perceived Usefulness (PU), and 3.22 Intention to Use (ITU). Automated FL obtains average values of 76.60% recall, 14.57% precision, 22.44% F-measure, 1.21%/min productivity, 3.42 PEOU, 3.56 PU, and 3.33 ITU. After the experiment, we statistically analyzed the results of the manual and automated FL treatments to determine whether significant differences exist between them. The analysis shows that the differences in performance and productivity are statistically significant, while the differences in satisfaction are not.
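The performance and productivity measures above can be computed directly from the sets of located and relevant model elements. The sketch below shows one possible implementation; the element identifiers and the time value are invented and do not correspond to the experiment's data.

```python
def fl_scores(retrieved, relevant, minutes):
    """Precision, recall, and F-measure in percent, plus productivity
    (F-measure divided by the minutes spent on the task)."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)                     # correctly located elements
    precision = 100.0 * tp / len(retrieved) if retrieved else 0.0
    recall = 100.0 * tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1, f1 / minutes

# Invented example: 4 of 6 retrieved elements are correct, the ground truth
# has 8 relevant elements, and the task took 10 minutes.
p, r, f, prod = fl_scores({"e1", "e2", "e3", "e4", "e5", "e6"},
                          {"e1", "e2", "e3", "e4", "g1", "g2", "g3", "g4"},
                          minutes=10)
print(round(p, 2), round(r, 2), round(f, 2), round(prod, 3))
```

High recall with low precision (as observed for automated FL) drags the F-measure down, which is why automated FL scores worse overall despite retrieving more of the relevant elements.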
The results of our work suggest that (1) neither domain experts nor domain non-experts find the perfect solutions for the features, (2) manual FL outperforms automated FL, and (3) satisfaction results are very similar for both manual and automated FL. These issues and their causes have a number of implications that can be leveraged either to improve the results of manual and automated FL (by, for instance, pairing software engineers or designing complementary artifacts for automated approaches), or to design further experiments that tackle the novel research questions that arise from this work. Overall, the contributions of the paper can be summarized as follows:
- We propose an experiment for comparing manual and automated FL in models.
- We show that neither domain experts nor domain non-experts find the perfect solutions when locating features.
- We show that automated FL yields worse results than manual FL. This is a novel finding, since research efforts have so far compared assistance tools against other assistance tools in order to improve their results, instead of comparing them against humans.
- Our analysis suggests how to advance research on FL toward improving the results of assistance tools. In addition, the findings of our work have implications for the usage and possible combination of manual, automated, and guided FL techniques.
The rest of the paper is structured as follows: Section 2 provides the necessary background in FL in models, manual FL, and automated FL. Section 3 describes the design of the experiment. Section 4 presents the results and their statistical analysis. Section 5 discusses the outcomes of our work. Section 6 deals with the threats to the validity of our work. Section 7 reviews the related work. Finally, Section 8 concludes the paper.
Feature location in models
Feature Location (FL) is one of the most important and common activities performed by developers during software maintenance and evolution [22]. FL is the process of finding the set of software artifacts that realize a specific feature. FL can be performed either manually or in an automated fashion. Manual FL is a common practice but it can become error-prone and time-consuming [8], [10], [12], [13], [14], [15], [23], so automated FL has received much attention during recent years [6], [16],
Objective
The experiment for comparing manual FL with automated FL in models was designed following the Wohlin et al. guidelines [35]. The goal of our experiment was to analyze FL in models, for the purpose of filling in the gap in empirical evaluation on this topic, with respect to the different FL treatments, from the viewpoint of both experts and non-experts in a domain, in the context of software development for induction hobs.
The measures used in our research to achieve the determined goal are
Results
The findings of our work for each of the research questions under study can be summarized as follows:
RQ1: While automated FL obtains better recall values than manual FL, manual FL outperforms automated FL overall. Differences in the results are statistically significant.
RQ2: Manual FL obtains better productivity results than automated FL. Again, differences in the results are statistically significant.
RQ3: Automated FL obtains generally better results than manual FL regarding satisfaction.
Discussion
Table 3 provides an overview of the main discussion issues that arise from the results of our work. The rest of the section provides more details on each of the points outlined in the table.
The results of our work suggest that neither domain experts nor domain non-experts find the perfect solutions for the features. In addition, the performance results indicate that while automated FL outperforms manual FL by up to 32.18% in recall, manual FL outperforms automated FL by up to 27.79% in
Threats to validity
To describe the threats to the validity of our work, we use the classification in [54], which distinguishes four aspects of validity (construct validity, internal validity, external validity, and reliability).
Construct validity reflects the extent to which the operational measures that are studied represent what the researchers have in mind and what is investigated based on the research questions. There are six threats of this kind: author bias, task design, mono-method bias, hypothesis guessing,
Related work
Some works research how developers locate features. The work presented in [59] reports an exploratory study of FL, consisting of three experiments with six FL exercises. The study evaluates the quality of FL and the impact of explicit FL knowledge, also proposing a conceptual framework for understanding FL processes. In [60], the authors present an exploratory case study on identifying and manually locating features in Marlin, a variant-rich open-source embedded firmware. Another work [20]
Conclusions
Feature Location (FL) is one of the most frequent activities in software development, particularly during maintenance activities. However, in industrial environments, software artifacts are developed over long periods of time by different software engineers, resulting in complex and large repositories, and thus FL becomes a challenging, time-consuming activity that does not guarantee good results. To tackle this issue, researchers have proposed automated Information Retrieval techniques such as
CRediT authorship contribution statement
Francisca Pérez: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Data curation, Writing - original draft, Writing - review & editing, Supervision. Jorge Echeverría: Methodology, Formal analysis, Investigation, Resources, Data curation, Visualization. Raúl Lapeña: Software, Validation, Data curation, Writing - original draft, Writing - review & editing. Carlos Cetina: Conceptualization, Investigation, Resources, Writing - review & editing, Supervision, Project
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work has been partially supported by the Ministry of Economy and Competitiveness (MINECO) through the Spanish National R+D+i Plan and ERDF funds under the Project ALPS (RTI2018-096411-B-I00).
References (68)

- et al., Feature location benchmark for extractive software product line adoption research using realistic and synthetic Eclipse variants, Inf. Softw. Technol. (2018)
- et al., Where is my feature and what is it about? A case study on recovering feature facets, J. Syst. Softw. (2019)
- et al., In search of evidence for model-driven development claims, Inf. Softw. Technol. (2015)
- et al., Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power, Inf. Sci. (2010)
- et al., Understanding replication of experiments in software engineering: a classification, Inf. Softw. Technol. (2014)
- et al., A comparison of methods for locating features in legacy software, J. Syst. Softw. (2003)
- et al., Improving feature location in long-living model-based product families designed with sustainability goals, J. Syst. Softw. (2017)
- et al., Feature location using probabilistic ranking of methods based on execution scenarios and information retrieval, IEEE Trans. Softw. Eng. (2007)
- et al., On the equivalence of information retrieval methods for automated traceability link recovery, 2010 IEEE 18th International Conference on Program Comprehension (2010)
- et al., Why and how of requirements tracing, IEEE Softw. (1994)
- A research agenda for software reliability, IEEE Reliab. Soc. 2009 Ann. Technol. Rep.
- Preventing defects: the impact of requirements traceability completeness on software quality, IEEE Trans. Softw. Eng.
- Reengineering legacy applications into software product lines: a systematic mapping, Empir. Softw. Eng.
- The state of adoption and the challenges of systematic variability management in industry, Empir. Softw. Eng.
- Feature-Oriented Domain Analysis (FODA) feasibility study, Technical Report
- Features and how to find them: a survey of manual feature location, Software Engineering for Variability Intensive Systems - Foundations and Applications
- Ontological approach for the semantic recovery of traceability links between software artefacts, IET Softw.
- MAP - mining architectures for product line evaluations, Proceedings Working IEEE/IFIP Conference on Software Architecture
- Model-based customization and deployment of Eclipse-based tools: industrial experiences, 2009 IEEE/ACM International Conference on Automated Software Engineering
- Advancing candidate link generation for requirements tracing: the study of methods, IEEE Trans. Softw. Eng.
- An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks, IEEE Trans. Softw. Eng.
- A survey of feature location techniques, Domain Engineering
- Recovering documentation-to-source-code traceability links using latent semantic indexing, Proceedings of the 25th International Conference on Software Engineering
- SNIAFL: towards a static non-interactive approach to feature location, Proceedings of the 26th International Conference on Software Engineering
- Using data fusion and web mining to support feature location in software, IEEE 18th International Conference on Program Comprehension (ICPC)
- Feature location via information retrieval based filtering of a single scenario execution trace, Proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering
- Model-driven software engineering in practice, Synthesis Lect. Softw. Eng.
- Feature location in source code: a taxonomy and survey, J. Softw.
- Feature location in models through a genetic algorithm driven by information retrieval techniques, Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems
- Automating the extraction of model-based software product lines from model variants (T), 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)
- Domain-specific languages: an annotated bibliography, SIGPLAN Not.
- A survey of traceability in requirements engineering and model-driven development, Softw. Syst. Model. (SoSyM)