Cost-aware triage ranking algorithms for bug reporting systems

Park, Jin-woo; Lee, Mu-Woong; Kim, Jinhan; Hwang, Seung-won; Kim, Sunghun

doi:10.1007/s10115-015-0893-9

Regular Paper
Published: 13 October 2015

Volume 48, pages 679–705, (2016)
Knowledge and Information Systems

Abstract

Bug triaging, the task of deciding who should fix a bug, has been studied actively. However, existing work does not consider that the cost of the same bug varies across developers with diverse backgrounds and experience. In contrast, we argue that the “cost” of a bug can be low for one developer but high for another. Based on this view, we study an automatic triaging system that considers both accuracy and cost. Our preliminary solution, CosTriage, models user-specific experience and estimated cost for each bug category, obtained from topic modeling, and assigns the bug to a developer who not only can fix it but is also expected to fix it fast. For user-specific cost modeling, we are inspired by recommender system work that estimates user-specific ratings of items, e.g., movies. From this view, existing triaging work, which categorizes bugs and assigns developers with experience in the category, falls into content-based recommendation (CBR). However, CBR is well known to cause overspecialization because it recommends only the types of bugs that each developer has solved before. This problem is critical because experienced developers can become overloaded with bugs they hate to fix, even though there are other categories they can fix faster. CosTriage adopts content-boosted collaborative filtering (CBCF), which considers not only similar bugs (content-based) but also similar developers (collaborative) when estimating user-specific cost. In this paper, we extend CosTriage to cover special scenarios. First, a bug may have no textual report (e.g., a crash report), or its textual report may lack a topic word (e.g., 1957 of 48,424 Mozilla reports). Second, in some scenarios, developer profiles may change over time. For these scenarios, we extend CosTriage to support non-textual descriptions and dynamic profiles; we denote the extension CosTriage+. Our experimental evaluation shows that our solution reduces cost by 30% without seriously compromising accuracy, compared with a baseline that considers only accuracy.
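The idea of ranking developers by both accuracy and developer-specific cost can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the similarity measure, the linear weighting with a hypothetical parameter `alpha`, the function names, and the toy data below are all assumptions. Missing costs are filled with plain user-based collaborative filtering as a simplified stand-in for CBCF.

```python
# Illustrative sketch only: names, formulas, and data are assumptions,
# not taken from the paper.

def similarity(a, b):
    """Similarity of two developers' cost profiles (topic -> fix time).

    Assumed measure: 1 / (1 + mean absolute cost gap) over shared topics.
    """
    shared = [t for t in a if t in b]
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[t] - b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)


def estimate_cost(profiles, dev, topic):
    """Return the developer's observed cost for a topic, or a
    similarity-weighted average of other developers' costs if unobserved."""
    own = profiles[dev]
    if topic in own:
        return own[topic]
    num = den = 0.0
    for other, costs in profiles.items():
        if other == dev or topic not in costs:
            continue
        w = similarity(own, costs)
        num += w * costs[topic]
        den += w
    return num / den if den > 0 else None  # None: nothing to estimate from


def rank_developers(profiles, accuracy, topic, alpha=0.5):
    """Rank developers by alpha * accuracy + (1 - alpha) * inverse cost,
    with both terms scaled to [0, 1] by their maximum."""
    inv = {}
    for dev in accuracy:
        cost = estimate_cost(profiles, dev, topic)
        # A missing (or zero) cost contributes no cost bonus in this sketch.
        inv[dev] = 1.0 / cost if cost else 0.0
    max_acc = max(accuracy.values()) or 1.0
    max_inv = max(inv.values()) or 1.0

    def score(dev):
        return (alpha * accuracy[dev] / max_acc
                + (1 - alpha) * inv[dev] / max_inv)

    return sorted(accuracy, key=score, reverse=True)


# Toy data: fix times in days per topic id, and classifier confidence
# that each developer can fix a new bug of topic 2.
PROFILES = {
    "alice": {0: 2.0, 1: 10.0},
    "bob": {0: 2.0, 1: 8.0, 2: 4.0},
    "carol": {0: 9.0, 2: 20.0},
}
ACCURACY = {"alice": 0.6, "bob": 0.5, "carol": 0.9}

if __name__ == "__main__":
    print(rank_developers(PROFILES, ACCURACY, topic=2, alpha=1.0))
    print(rank_developers(PROFILES, ACCURACY, topic=2, alpha=0.5))
```

With `alpha=1.0` the ranking uses accuracy alone, which corresponds to the accuracy-only baseline mentioned in the abstract; lowering `alpha` shifts weight toward developers whose estimated fix time for the bug's topic is short.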

Notes

Tai’s algorithm has $O(V_1\times V_2\times D_1^2\times D_2^2)$ time complexity, where $V_i$ is the number of nodes in $T(\mathcal {I}_{\mathcal {B}_i})$, and $D_i$ is the depth of $T(\mathcal {I}_{\mathcal {B}_i})$.
Zhang’s algorithm has $O(V_1\times V_2\times \min (L_1,D_1)\times \min (L_2,D_2))$ time complexity, where $L_i$ denotes the number of leaves in $T_i$. An implementation is available at http://web.science.mq.edu.au/~swan/howtos/treedistance/.
Apache, https://issues.apache.org/bugzilla/.
Eclipse, https://bugs.eclipse.org/bugs/.
Linux kernel, https://bugzilla.kernel.org/.
Mozilla, https://bugzilla.mozilla.org/.
http://svmlight.joachims.org/.
Since the test set of Apache has only 131 bugs, several bug types have few or no bugs. In Fig. 6, we show only the bug types with more than 2.5%.

References

Park J, Lee M-W, Kim J, Hwang S, Kim S (2011) Costriage: a cost-aware triage algorithm for bug reporting systems. In: AAAI
Anvik J (2007) Assisting bug report triage through recommendation. PhD thesis, University of British Columbia
Jeong G, Kim S, Zimmermann T (2009) Improving bug triage with bug tossing graphs. In: ESEC/FSE
Guo PJ, Zimmermann T, Nagappan N, Murphy B (2010) Characterizing and predicting which bugs get fixed: an empirical study of microsoft windows. In: ICSE
Anvik J, Hiew L, Murphy GC (2006) Who should fix this bug? In: ICSE
Anvik J, Murphy GC (2011) Reducing the effort of bug report triage: recommenders for development-oriented decisions. ACM Trans Softw Eng Methodol 20(3):10
Čubranić D (2004) Automatic bug triage using text categorization. In: SEKE
Canfora G, Cerulo L (2006) Supporting change request assignment in open source development. In: Proceedings of the 2006 ACM symposium on applied computing
Canfora G, Cerulo L (2005) How software repositories can help in resolving a new change request. In: Workshop on empirical studies in reverse engineering
di Lucca G (2002) An approach to classify software maintenance requests. In: ICSM
Matter D, Kuhn A, Nierstrasz O (2009) Assigning bug reports using a vocabulary-based expertise model of developers. In: MSR
Tamrawi A, Nguyen TT, Al-Kofahi JM, Nguyen TN (2011) Fuzzy set and cache-based approach for bug triaging. In: ESEC/FSE
Kim S, Whitehead EJ Jr (2006) How long did it take to fix bugs? In: MSR
Weiss C, Premraj R, Zimmermann T, Zeller A (2007) How long will it take to fix this bug? In: MSR
Rahman MM, Ruhe G, Zimmermann T (2009) Optimized assignment of developers for fixing bugs: an initial evaluation for eclipse projects. In: ESEM
Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022
Bettenburg N, Premraj R, Zimmermann T, Kim S (2008) Duplicate bug reports considered harmful... really? In: ICSM
Chen L, Wang X, Liu C (2011) An approach to improving bug assignment with bug tossing graphs and bug similarities. J Softw 6(3):421–427
Xuan J, Jiang H, Ren Z, Yan J, Luo Z (2010) Automatic bug triage using semi-supervised text classification. In: SEKE
Bhattacharya P, Neamtiu I (2010) Fine-grained incremental learning and multi-feature tossing graphs to improve bug triaging. In: ICSM
Lin Z, Shu F, Yang Y, Hu C, Wang Q (2009) An empirical study on bug assignment automation using chinese bug data. In: ESEM
Kim J, Lee S, Hwang S, Kim S (2009) Adding examples into java documents. In: ASE
Kim J, Lee S, Hwang S, Kim S (2010) Towards an intelligent code search engine. In: AAAI
Kim J, Lee S, Hwang S, Kim S (2013) Enriching documents with examples: a corpus mining approach. ACM Trans Inf Syst 31(1):1
Lee M-W, Roh J-W, Hwang S, Kim S (2010) Instant code clone search. In: FSE
Park J, Lee M-W, Roh J-W, Hwang S, Kim S (2014) Surfacing code in the dark: an instant clone search approach. Knowl Inf Syst 41(3):727–759
Melville P, Mooney RJ, Nagarajan R (2002) Content-boosted collaborative filtering for improved recommendations. In: AAAI
Arun R, Suresh V, Veni Madhavan CE, Narasimha Murthy MN (2010) On finding the natural number of topics with latent dirichlet allocation: some observations. In: PAKDD
Cao J, Xia T, Li J, Zhang Y, Tang S (2009) A density-based method for adaptive lda model selection. Neurocomputing 72(7–9):1775–1781
Zavitsanos E, Petridis S, Paliouras G, Vouros GA (2008) Determining automatically the size of learned ontologies. In: ECAI
Herlocker J, Konstan JA, Riedl J (2002) An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf Retr 5(4):287–310
Ma H, King I, Lyu MR (2007) Effective missing data prediction for collaborative filtering. In: SIGIR
Allan J (1996) Incremental relevance feedback for information filtering. In: SIGIR
Chen Z, Jiang Y, Zhao Y (2010) A collaborative filtering recommendation algorithm based on user interest change and trust evaluation. In: JDCTA
Lathia N, Hailes S, Capra L, Amatriain X (2010) Temporal diversity in recommender systems. In: SIGIR
Cavnar WB, Trenkle JM (1994) N-gram-based text categorization. In: SDAIR
Bettenburg N, Premraj R, Zimmermann T, Kim S (2008) Extracting structural information from bug reports. In: MSR
Microsoft (2010) Windows error reporting: getting started. http://www.microsoft.com/whdc/winlogo/maintain/StartWER.mspx
Mozilla (2010) Crash stats. http://crash-stats.mozilla.com
Apple (2010) Technical note TN2123: CrashReporter
Tai K-C (1979) The tree-to-tree correction problem. J Assoc Comput Mach
Chen W (2001) New algorithm for ordered tree-to-tree correction problem. J Algorithms 40(2):135–158
Demaine ED, Mozes S, Rossman B, Weimann O (2009) An optimal decomposition algorithm for tree edit distance. ACM Trans Algorithms 6(1):2
Dulucq S, Touzet H (2003) Analysis of tree edit distance algorithms. In: CPM
Klein PN (1998) Computing the edit-distance between unrooted ordered trees. In: Proceedings of the 6th annual European symposium on algorithms
Zhang K, Shasha D (1989) Simple fast algorithms for the editing distance between trees and related problems. SIAM J Comput 18(6):1245–1262
Bremner D, Demaine E, Erickson J, Iacono J, Langerman S, Morin P, Toussaint G (2005) Output-sensitive algorithms for computing nearest-neighbour decision boundaries. In: Algorithms and Data Structures. Proceedings of 8th International Workshop, WADS 2003, Ottawa, Ontario, Canada, July 30-August 1,2003. Springer, Heidelberg, pp 451–461
Coomans D, Massart DL (1982) Alternative k-nearest neighbour rules in supervised pattern recognition: Part 1. k-Nearest neighbour classification by using alternative voting rules. Anal Chim Acta
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
Brown RG (1963) Smoothing, forecasting and prediction of discrete time series. Prentice-Hall, Englewood Cliffs
Weron R, Weron K, Weron A (1999) A conditionally exponential decay approach to scaling in finance. Phys A Stat Theor Phys 264(3–4):551–561
Han J, Kamber M (2006) Data mining: concepts and techniques
Jiang L, Misherghi G, Su Z, Glondu S (2007) Deckard: scalable and accurate tree-based detection of code clones. In: ICSE
Bettenburg N, Just S, Schröter A, Weiss C, Premraj R, Zimmermann T (2008) What makes a good bug report? In: SIGSOFT FSE
Hooimeijer P, Weimer W (2007) Modeling bug report quality. In: ASE
Aranda J, Venolia G (2009) The secret life of bugs: going past the errors and omissions in software repositories. In: ICSE
Giger E, Pinzger M, Gall H (2010) Predicting the fix time of bugs. In: RSSE

Acknowledgments

This work was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. 10041244, SmartTV 2.0 Software Platform).

Author information

Authors and Affiliations

Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea
Jin-woo Park, Mu-Woong Lee & Jinhan Kim
Yonsei University, Seoul, Republic of Korea
Seung-won Hwang
Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Hong Kong
Sunghun Kim

Corresponding author

Correspondence to Seung-won Hwang.

Additional information

This work builds on and extends our preliminary work [1].

About this article

Cite this article

Park, Jw., Lee, MW., Kim, J. et al. Cost-aware triage ranking algorithms for bug reporting systems. Knowl Inf Syst 48, 679–705 (2016). https://doi.org/10.1007/s10115-015-0893-9

Received: 04 November 2013
Revised: 22 July 2015
Accepted: 05 October 2015
Published: 13 October 2015
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10115-015-0893-9
