
How Far Have We Progressed in Identifying Self-admitted Technical Debts? A Comprehensive Empirical Study

Published: 23 July 2021

Abstract

Background. Self-admitted technical debt (SATD) is a special kind of technical debt that is intentionally introduced and documented in code comments. Such debts reduce the quality of software and increase the cost of subsequent maintenance, so it is necessary to identify and resolve them in a timely manner. Recently, many automatic approaches have been proposed to identify SATD.

Problem. Popular IDEs support a number of predefined task annotation tags for indicating SATD in comments, and these tags are used in many projects. However, existing SATD identification approaches neglect this clear prior knowledge.

Objective. We aim to investigate how far we have really progressed in the field of SATD identification by comparing existing approaches with a simple approach that leverages the predefined task tags to identify SATD.

Method. We first propose a simple heuristic approach that fuzzily Matches task Annotation Tags (MAT) in comments to identify SATD. By nature, MAT is an unsupervised approach: it needs no data to train a prediction model and is easy to understand. We then examine the real progress in SATD identification by comparing MAT against existing approaches.

Result. The experimental results reveal that: (1) MAT achieves similar or even superior SATD identification performance compared with existing approaches, regardless of whether non-effort-aware or effort-aware evaluation indicators are used; (2) the SATDs (and non-SATDs) correctly identified by existing approaches overlap heavily with those identified by MAT; and (3) supervised approaches misclassify many SATDs marked with task tags as non-SATDs, which can easily be corrected by combining those approaches with MAT.

Conclusion. It appears that the problem of SATD identification has been (unintentionally) over-complicated by our community; that is, real progress in identifying SATD comments has not been achieved to the extent that might have been envisaged. We therefore suggest that, when many task tags are used in the comments of a target project, future SATD identification studies use MAT as an easy-to-implement baseline to demonstrate the usefulness of any newly proposed approach.
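To make the Method concrete, the sketch below shows a minimal MAT-style matcher in Python. It assumes the four task annotation tags commonly predefined by IDEs (TODO, FIXME, HACK, XXX) and a simple tokenization rule; the exact tag set and fuzzy-matching details of the authors' released implementation may differ. The model_says_satd argument of is_satd_combined is a hypothetical stand-in for any trained classifier's output, illustrating the combination with MAT mentioned in result (3).

    import re

    # Task annotation tags predefined by popular IDEs (Eclipse, IntelliJ IDEA,
    # NetBeans, Visual Studio); a MAT-style matcher flags comments containing
    # them as SATD. The tag set here is an illustrative assumption.
    TASK_TAGS = {"todo", "fixme", "hack", "xxx"}

    def is_satd_mat(comment: str) -> bool:
        """Fuzzily match task annotation tags in a code comment.

        A comment is flagged as SATD if any of its word tokens equals a task
        tag, ignoring case and surrounding punctuation (so "TODO:" and
        "//FIXME," both match). The fuzzy-matching rule is an assumption.
        """
        # Split on non-alphanumeric characters so punctuation attached to a
        # tag still yields the bare tag token.
        tokens = re.split(r"[^a-z0-9]+", comment.lower())
        return any(token in TASK_TAGS for token in tokens)

    def is_satd_combined(comment: str, model_says_satd: bool) -> bool:
        """OR-combine MAT with a supervised classifier's prediction.

        model_says_satd stands in for the output of any trained SATD model;
        the combination rescues tagged comments the model misclassified.
        """
        return model_says_satd or is_satd_mat(comment)

    if __name__ == "__main__":
        examples = [
            "// TODO: handle the timeout case properly",   # SATD
            "# fixme - temporary workaround",               # SATD
            "/* computes the checksum of the buffer */",    # not SATD
        ]
        for comment in examples:
            print(is_satd_mat(comment), comment)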




    Published In

    ACM Transactions on Software Engineering and Methodology, Volume 30, Issue 4
    Continuous Special Section: AI and SE
    October 2021
    613 pages
    ISSN: 1049-331X
    EISSN: 1557-7392
    DOI: 10.1145/3461694
    Editor: Mauro Pezzè

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 July 2021
    Accepted: 01 January 2021
    Revised: 01 December 2020
    Received: 01 June 2020
    Published in TOSEM Volume 30, Issue 4


    Author Tags

    1. Self-admitted technical debt
    2. baseline
    3. code comment
    4. match
    5. task annotation tag

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Key Basic Research and Development Program of China
    • National Natural Science Foundation of China


    Cited By

    • (2025) Refining Code-Line-Level Bugginess Identification: Getting the Best of Both Worlds by Fusing Syntactic and Semantic Features. ACM Transactions on Software Engineering and Methodology 34, 3 (2025), 1–43. DOI: 10.1145/3707456
    • (2025) Negativity in self-admitted technical debt: How sentiment influences prioritization. Empirical Software Engineering 30, 2 (2025). DOI: 10.1007/s10664-024-10611-z
    • (2025) Efficient feature envy detection and refactoring based on graph neural network. Automated Software Engineering 32, 1 (2025). DOI: 10.1007/s10515-024-00476-3
    • (2024) Automating TODO-missed Methods Detection and Patching. ACM Transactions on Software Engineering and Methodology 34, 1 (2024), 1–28. DOI: 10.1145/3700793
    • (2024) Just-In-Time TODO-Missed Commits Detection. IEEE Transactions on Software Engineering 50, 11 (2024), 2732–2752. DOI: 10.1109/TSE.2024.3405005
    • (2024) Self-Admitted Technical Debts Identification: How Far Are We? In Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER'24), 804–815. DOI: 10.1109/SANER60148.2024.00087
    • (2024) Why and how bug blocking relations are breakable. Information and Software Technology 166 (2024). DOI: 10.1016/j.infsof.2023.107354
    • (2024) An empirical study on the effectiveness of large language models for SATD identification and classification. Empirical Software Engineering 29, 6 (2024). DOI: 10.1007/s10664-024-10548-3
    • (2024) AI and society: A virtue ethics approach. AI & Society 39, 3 (2024), 1127–1140. DOI: 10.1007/s00146-022-01545-5
    • (2023) Code-line-level Bugginess Identification: How Far have We Come, and How Far have We Yet to Go? ACM Transactions on Software Engineering and Methodology 32, 4 (2023), 1–55. DOI: 10.1145/3582572
