Authors:
Andrada-Mihaela-Nicoleta Moldovan
and
Andreea Vescan
Affiliation:
Computer Science Department, Faculty of Mathematics and Computer Science, Babeş-Bolyai University, Cluj-Napoca, Romania
Keyword(s):
Outlier Detection, Anomaly Detection, Software Defect Prediction.
Abstract:
Regression testing becomes costly in time when changes are made frequently. To simplify
testing, supervised or unsupervised binary classification techniques for Software Defect Prediction (SDP) may rule
out non-defective components or highlight those components that are most prone to defects. In this paper,
outlier detection methods for SDP are investigated. The novelty of this approach is that outlier detection has not previously
been used for this particular task. Two approaches are implemented: the plain Connectivity-based Outlier Factor (COF),
a local outlier factor based on connectivity, and COF + Pareto, which refines COF with the Pareto rule
(samples with the 20% highest outlier scores produced by the algorithm are considered
outliers). The solutions were evaluated on 12 projects from the NASA and PROMISE datasets.
The results obtained are comparable to state-of-the-art solutions; for some projects, they range from
acceptable to good compared to the results of other studies.
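
The Pareto step described in the abstract can be illustrated with a minimal sketch. It assumes the outlier scores have already been computed by some COF implementation, which is not shown here. The function name `pareto_outliers`, the `fraction` parameter, and the rounding and tie-breaking choices are illustrative assumptions, not details taken from the paper.

```python
def pareto_outliers(scores, fraction=0.2):
    """Flag the samples with the highest outlier scores.

    `scores` is assumed to hold one outlier score per sample (for example,
    produced by a COF implementation); higher means more anomalous.
    Returns a list of booleans, True for samples treated as outliers.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(scores)
    if n == 0:
        return []
    # Number of samples to flag. Rounding to the nearest integer and
    # flagging at least one sample are assumptions of this sketch.
    k = max(1, round(fraction * n))
    # Sample indices ordered from highest to lowest score. Python's sort is
    # stable, so tied scores keep their original order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:k])
    return [i in flagged for i in range(n)]


# Hypothetical scores for ten software components (made-up values).
example_scores = [1.0, 1.1, 0.9, 3.5, 1.2, 1.0, 2.8, 1.05, 0.95, 1.1]
example_flags = pareto_outliers(example_scores)
```

With ten samples and the default fraction, the two highest-scoring components are flagged as likely defective and the remaining eight are treated as non-defective.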