Towards semi-automated assignment of software change requests

doi:10.1016/j.jss.2016.01.038

Journal of Systems and Software

Volume 115, May 2016, Pages 82-101

https://doi.org/10.1016/j.jss.2016.01.038

Highlights

• We present a configurable approach for assigning Change Requests to software developers.
• It supports the contextual information needed in dynamic environments.
• The approach relies on a Rule-Based Expert System and machine learning techniques.
• It improves accuracy by up to 46.5% over other approaches.

Abstract

Change Requests (CRs) are key elements of software maintenance and evolution. Finding the appropriate developer for a CR is crucial for obtaining the lowest economically feasible fixing time. Nevertheless, assigning CRs is a labor-intensive and time-consuming task. In this paper, we report on a questionnaire-based survey with practitioners to understand the characteristics of CR assignment, and on a semi-automated approach for CR assignment that combines rule-based and machine learning techniques. In accordance with the results of the survey, the proposed approach emphasizes the use of contextual information, which is essential to effective assignments, and puts the development team in control of the assignment rules, making its adoption easier. The assignment rules can be either extracted from the assignment history or created from scratch. An empirical validation was performed through an offline experiment with CRs from a large software project. The results indicate that the approach is up to 46.5% more accurate than other approaches that rely solely on machine learning techniques. This indicates that a rule-based approach is a viable and simple method to improve CR assignment.

Introduction

Change Requests (CRs) are software artifacts that describe defects to be fixed or enhancements to be implemented in a software system (Cavalcanti et al., 2013a). CRs are managed with the support of CR repository software, such as Bugzilla (Bugzilla, 2013) and Mantis (Mantis Bug Tracker, 2013). These repositories play a fundamental role in the software maintenance process, being a common place for communication and coordination among different stakeholders (Bertram et al., 2010). Indeed, the CR artifact is the primary unit of work in many software development projects (Anvik and Murphy, 2007).

The task of assigning a CR, also known as CR triage, consists of selecting the most suitable software developer to handle a given CR. Generally, such a developer is the one who has enough expertise to handle the issues reported in the CR (Aljarah et al., 2011). In addition, the assignment decision must take into account the developer’s workload, availability, and the CR priority, in order to obtain the lowest economically feasible time to fix (Di Lucca, Di Penta, Gradara, 2002; Hosseini, Nguyen, Godfrey, 2012; Cavalcanti, Neto, Machado, de Almeida, de Lemos Meira, 2013). Thus, this task requires considerable knowledge of the project, and good communication skills to negotiate with the involved stakeholders (Cavalcanti et al., 2013c).

Assigning CRs to developers is both labor-intensive and time-consuming, as it is usually handled manually (Anvik, Hiew, Murphy, 2006; Jeong, Kim, Zimmermann, 2009). Depending on the software project, the number of new CRs can vary from dozens to hundreds in a single day (Cavalcanti et al., 2013a). As a consequence, the more CRs are opened, the more complex the problem becomes.

Several automated approaches have been proposed to overcome the problem of CR assignment by using machine learning techniques. Some of these approaches are based on the hypothesis that the most suitable developer for a new CR is the one who has already solved similar CRs in the past (Di Lucca, Di Penta, Gradara, 2002; Cubranic, Murphy, 2004; Anvik, Hiew, Murphy, 2006; Ahsan, Ferzund, Wotawa, 2009b; Jeong, Kim, Zimmermann, 2009; Lin, Shu, Yang, Hu, Wang, 2009; Rahman, Ruhe, Zimmermann, 2009). Other approaches consider that an appropriate developer can be found by looking at past CRs and data from version control systems (Canfora, Cerulo, 2006; Ahsan, Ferzund, Wotawa, 2009a; Matter, Kuhn, Nierstrasz, 2009; Kagdi, Gethers, Poshyvanyk, Hammad, 2012) or source code (Linares-Vásquez et al., 2012). In general, these approaches use machine learning techniques to automatically suggest a list of appropriate developers for a new incoming CR.

Despite the number of proposals, there is no empirical evidence about their applicability to real-world environments. To the best of our knowledge, most practitioners are still assigning CRs manually. Current approaches have not been adopted because of two main problems, as follows (Cavalcanti et al., 2014):

• They were designed to be autonomous, so software analysts have no control over the approach; that is, they cannot modify its behavior. Without such control, the approach cannot be properly calibrated. As a consequence, if its performance is not satisfactory, it is simply discarded.
• These approaches lack the contextual information necessary to assign CRs properly. Software development companies can be highly dynamic in terms of staff: developers move from project to project, they can be hired or fired during project development, and they can take a vacation or a day off. These dynamics influence the assignment of CRs. Thus, contextual information impacts the performance of automated approaches.

In this paper, we present a configurable approach for assigning CRs that enables software analysts to control its behavior and provides a means to support the contextual information necessary to perform effective assignments in dynamic environments. The approach relies on a Rule-Based Expert System (RBES) and machine learning techniques.
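To make the idea of analyst-controlled rules combined with a learned recommender more concrete, the following is a minimal sketch, not the architecture described in the paper. It assumes that rules are checked first and that a machine learning recommender is consulted only when no rule matches; the `Rule` class, the CR field names (`component`, `summary`), the developer names, and the `ml_recommender` stub are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    """An analyst-editable rule: if `condition` holds for a CR, suggest `developer`."""
    name: str
    condition: Callable[[dict], bool]
    developer: str


def assign(cr: dict, rules: List[Rule], fallback: Callable[[dict], str]) -> Tuple[str, str]:
    """Return (developer, source). The first matching rule wins; otherwise
    the fallback recommender decides (assumed combination strategy)."""
    for rule in rules:
        if rule.condition(cr):
            return rule.developer, f"rule:{rule.name}"
    return fallback(cr), "ml"


# Hypothetical rules an analyst might write or extract from assignment history.
rules = [
    Rule("ui-component", lambda cr: cr.get("component") == "ui", "alice"),
    Rule("crash-keyword", lambda cr: "crash" in cr.get("summary", "").lower(), "bob"),
]


def ml_recommender(cr: dict) -> str:
    # Stand-in for a trained classifier; always returns the same developer here.
    return "carol"


print(assign({"component": "ui", "summary": "Button misaligned"}, rules, ml_recommender))
print(assign({"component": "core", "summary": "Slow query"}, rules, ml_recommender))
```

Because the rules are plain data, an analyst can add, remove, or reorder them without retraining anything, which is the kind of control the text argues autonomous approaches lack.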

The main ideas for this work come from our past three publications (Cavalcanti, da Mota Silveira Neto, Machado, Vale, de Almeida, de Lemos Meira, 2013; Cavalcanti, Neto, Machado, de Almeida, de Lemos Meira, 2013; Cavalcanti, Machado, da Mota Silveira Neto, de Almeida, de Lemos Meira, 2014). In Cavalcanti et al. (2014), our approach was introduced, but in less detail. Here we add more information about the approach, such as its architecture, implementation, and machine learning techniques. From Cavalcanti et al. (2013c), which is a survey with software developers, we selected the specific results that helped us to propose the semi-automated solution. Then, we used results from Cavalcanti et al. (2013b), which is an extensive mapping study on CR repository issues, to elaborate the related work specific to the topic of assigning CRs.

Besides putting these works together, we also provide an extended experimental study of the proposed approach. In the experiment, which compared our approach against another solution based solely on a machine learning algorithm, we observed that ours improved the accuracy of assignments by up to 46.5%.

The remainder of this paper is organized as follows: Section 2 provides some background on CR management; in Section 3 we present the questionnaire-based survey; Section 4 presents the proposed approach to semi-automate the assignment of CRs; Section 5 describes the empirical validation performed to evaluate the proposed approach; Section 6 describes related work; and Section 7 concludes this work.

Section snippets

Change request management

A CR is a software artifact that describes a defect to be fixed, an adaptive or perfective change, or a new functionality to be implemented in a software system (Cavalcanti et al., 2013a). CRs are managed with the support of specific software systems, which we simply refer to as CR repositories. Examples of such repositories are Bugzilla (Bugzilla, 2013), Mantis (Mantis Bug Tracker, 2013), RedMine (Redmine, 2013), and Trac (The Trac Project, 2013). The CR repositories play a fundamental role in

Understanding change request assignment

Although many automated approaches for CR assignment have been proposed, there is a lack of research investigating the characteristics of the activity itself. This kind of investigation would help drive more effective solutions, since the specific aspects of the task can be understood and properly handled. In this sense, this section presents a questionnaire-based survey with the objective of understanding the impact of CR assignment on software development (Cavalcanti et al.,

Semi-automated approach to change request assignment

According to the results presented in the previous section, we can conclude that CR assignment takes place in complex and highly dynamic environments. In these environments, analysts using an automated approach to assign CRs need to be able to intervene in the approach and change its behavior so that it reflects new facts in the environment.

For instance, if a developer leaves a project, then the CRs that would have been assigned to that developer should be routed to some other developer, also respecting
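The rerouting scenario above can be sketched as a context check applied to a ranked list of candidate developers. This is only an illustration under stated assumptions: the `DeveloperContext` fields, the workload cap `max_open_crs`, and the developer names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DeveloperContext:
    """Hypothetical contextual facts an analyst keeps up to date."""
    on_project: bool = True
    on_leave: bool = False
    open_crs: int = 0


def reroute(ranked_candidates: List[str],
            context: Dict[str, DeveloperContext],
            max_open_crs: int = 5) -> Optional[str]:
    """Return the first candidate who is still on the project, not on leave,
    and below the workload cap; None if nobody qualifies."""
    for dev in ranked_candidates:
        ctx = context.get(dev)
        if ctx is None:
            continue  # unknown developer: treated as unavailable
        if ctx.on_project and not ctx.on_leave and ctx.open_crs < max_open_crs:
            return dev
    return None


context = {
    "alice": DeveloperContext(on_project=False),  # moved to another project
    "bob": DeveloperContext(on_leave=True),       # on vacation
    "carol": DeveloperContext(open_crs=2),
}
print(reroute(["alice", "bob", "carol"], context))
```

Keeping the context separate from the ranking means a staffing change only requires updating one record, not rebuilding the recommender.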

Empirical validation

To validate the proposed approach, we compare it with other approaches proposed in the literature. Machine learning-based approaches have proved to be the best choice to assign CRs, especially when applying the SVM algorithm (Cavalcanti et al., 2013b). Accordingly, this experiment compared the proposed approach with a pure machine learning approach that uses the SVM algorithm.

The experiment was performed as an off-line experiment, in which we used

Related work

Most of the research found in the literature addresses the problem of CR assignment by using machine learning models and techniques with the objective of providing automated solutions. When applying machine learning models to the CR assignment problem, the content of an incoming CR is used to query the database of the CRs already fixed. Then, a list of potential developers to be assigned is retrieved from the CRs in the query results and suggested to the analyst. The analyst, in turn, selects
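The query-and-suggest workflow described above can be sketched as follows. The sketch assumes raw term counts and cosine similarity as a stand-in for whichever retrieval or learning model a given approach uses; the function names, the sample history, and the developer names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest_developers(new_cr_text: str,
                       fixed_crs: List[Tuple[str, str]],
                       top_k: int = 3) -> List[str]:
    """Rank developers by the most similar CR each has fixed.
    Developers with no textual overlap are not suggested."""
    query = Counter(tokenize(new_cr_text))
    best = {}
    for text, developer in fixed_crs:
        score = cosine(query, Counter(tokenize(text)))
        if score > best.get(developer, 0.0):
            best[developer] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [developer for developer, _ in ranked[:top_k]]


# Hypothetical history of fixed CRs as (description, developer who fixed it).
history = [
    ("Crash when saving file to disk", "alice"),
    ("Login page layout broken on mobile", "bob"),
    ("File save dialog freezes", "alice"),
    ("Typo in installation guide", "carol"),
]
print(suggest_developers("Application crash while saving a file", history))
```

The returned list is only a suggestion; in the workflow described above, the final choice remains with the analyst.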

Conclusion and future work

CRs are key elements of software maintenance; however, assigning CRs to developers is expensive. In order to overcome this, researchers have proposed semi-automated approaches for CR assignment. Although they represent advances in the area, to the best of our knowledge they have not been adopted in practice. This is mainly because these approaches lack control mechanisms for changing their behavior, and they do not consider contextual information that may influence CR assignment, such as

Yguaratã Cerqueira Cavalcanti received his Ph.D. degree in Computer Science from the Federal University of Pernambuco. He is experienced in Software Engineering, working mainly with development of reusable software and maintenance and evolution of software. In his research, he has applied techniques for mining software repositories in order to improve the practice of Software Engineering. He is a researcher in the Reuse in Software Engineering (RiSE) group and the National Institute for

References (53)

Ahsan S.N. et al. Automatic classification of software change request using multi-label machine learning methods. Proceedings of the 2009 33rd Annual IEEE Software Engineering Workshop (SEW’2009) (2009)
Ahsan S.N. et al. Automatic software bug triage system (BTS) based on latent semantic indexing and support vector machine. Proceedings of the 2009 Fourth International Conference on Software Engineering Advances (ICSEA’09) (2009)
Aljarah I. et al. Selecting discriminating terms for bug assignment: a formal analysis. Proceedings of the 7th International Conference on Predictive Models in Software Engineering (2011)
Anvik J. et al. Who should fix this bug? Proceedings of the 28th International Conference on Software Engineering (ICSE’2006) (2006)
Anvik J. et al. Determining implementation expertise from bug reports. Proceedings of the Fourth International Workshop on Mining Software Repositories (MSR’2007) (2007)
Anvik J. et al. Reducing the effort of bug report triage: Recommenders for development-oriented decisions. ACM Trans. Softw. Eng. Methodol. (2011)
Apache, 2013....
Baeza-Yates R.A. et al. Modern Information Retrieval (1999)
Basili V. et al. Experimentation in software engineering. IEEE Trans. Softw. Eng. (1986)
Bertram D. et al. Communication, collaboration, and bugs: the social nature of issue tracking in small, collocated teams. Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW’2010) (2010)
Bettenburg N. et al. Duplicate bug reports considered harmful... Really? Proceedings of the 24th IEEE International Conference on Software Maintenance (ICSM’2008) (2008)
Bhattacharya P. et al. Automated, highly-accurate, bug assignment using machine learning and tossing graphs. J. Syst. Softw. (2012)
Bird C. et al. Fair and balanced? Bias in bug-fix datasets. Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE’2009) (2009)
Bugzilla, 2013. URL:...
Byström K. et al. Task complexity affects information seeking and use. Inf. Process. Manag. (1995)
Caglayan B. et al. Issue ownership activity in two large software projects. ACM SIGSOFT Softw. Eng. Notes (2012)
Canfora G. et al. A taxonomy of information retrieval models and tools. Comput. Inf. Technol. (2004)
Canfora G. et al. Supporting change request assignment in open source development. Proceedings of the ACM Symposium on Applied Computing (SAC’2006) (2006)
Casavant T.L. et al. A taxonomy of scheduling in general-purpose distributed computing systems. IEEE Trans. Softw. Eng. (1988)
Cavalcanti Y.C. et al. Combining Rule-based and Information Retrieval Techniques to assign Software Change Requests. Proceedings of The 29th IEEE/ACM International Conference on Automated Software Engineering (ASE’2014) (2014)
Cavalcanti Y.C. et al. The bug report duplication problem: an exploratory study. Softw. Qual. J. (2013)
Cavalcanti Y.C. et al. Challenges and opportunities for software change request repositories: a systematic mapping study. J. Softw.: Evolut. Process (2013)
Cavalcanti Y.C. et al. Towards understanding software change request assignment: a survey with practitioners. Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering (EASE’2013) (2013)
Chen L. et al. An approach to improving bug assignment with bug tossing graphs and bug similarities. J. Softw. (2011)
Crowston K. et al. Coordination practices within FLOSS development teams: the bug fixing process. Proceedings of the 1st International Workshop on Computer Supported Activity Coordination (CSAC’2004) (2004)
Cubranic D. et al. Automatic bug triage using text categorization. Proceedings of the 16th International Conference on Software Engineering & Knowledge Engineering (SEKE’2004) (2004)


Ivan do Carmo Machado is a post-doctoral associate at the Computer Science Department at the Federal University of Bahia, Brazil. He received a Ph.D. in Computer Science from the Federal University of Bahia in 2014. His research interests include empirical and evidence-based software engineering, software product lines, software testing, and software architecture. He is a member of the ACM and the Brazilian Computer Society.

Paulo Anselmo da Mota Silveira Neto has a Bachelor of Computer Science degree from the Catholic University of Pernambuco (UNICAP), a Specialist degree in Software Engineering from the University of Pernambuco (UPE), and a Master of Science degree in Computer Science (Software Engineering) from the Federal University of Pernambuco (UFPE). He is currently a Ph.D. candidate in Computer Science at the Federal University of Pernambuco and a member of the Reuse in Software Engineering (RiSE) Group, which conducts research on Software Product Lines (SPL) Testing, SPL Architecture Evaluation, Test Selection Techniques and Regression Testing. He also participates in important research projects in the Software Engineering area, such as the National Institute of Science and Technology for Software Engineering (I.N.E.S.).

Eduardo Santana de Almeida is an assistant professor at the Federal University of Bahia and head of the Reuse in Software Engineering (RiSE) Labs. He has more than 200 papers published in the main conferences and journals related to Software Engineering and has chaired several national and international conferences and workshops. His research areas include methods, processes, tools and metrics to develop reusable software.
