Towards prioritizing user-related issue reports of mobile applications

Noei, Ehsan; Zhang, Feng; Wang, Shaohua; Zou, Ying

doi:10.1007/s10664-019-09684-y

Published: 29 January 2019

Volume 24, pages 1964–1996, (2019)

Empirical Software Engineering

Ehsan Noei ORCID: orcid.org/0000-0001-7192-4604¹,
Feng Zhang²,
Shaohua Wang³ &
Ying Zou¹

Abstract

The competitive market for mobile applications (apps) has driven app developers to pay more attention to addressing the issues of their apps. Prior studies have shown that addressing the issues reported in user-reviews shares a statistically significant relationship with star-ratings. However, despite the prevalence and importance of both user-reviews and issue report prioritization, no prior research has analyzed the relationship between issue report prioritization and star-ratings. In this paper, we integrate user-reviews into the process of issue report prioritization. We propose an approach that maps issue reports recorded in issue tracking systems to user-reviews. In an empirical study of 326 open-source Android apps, our approach achieves a precision of 79% in matching user-reviews with issue reports. Moreover, we observe that prioritizing the issue reports that are related to user-reviews shares a significant positive relationship with star-ratings. Furthermore, we use the top apps, in terms of star-ratings, to train a model for prioritizing issue reports. Learning from the top apps is a reasonable practice because there is no well-established approach for prioritizing issue reports. The results show that mobile apps whose prioritization approach is similar to our trained model achieve higher star-ratings.
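
The abstract does not say how user-reviews are matched with issue reports, so the following is only a rough illustration of the general idea, not the authors' method. It assumes a plain TF-IDF vector space model with cosine similarity; the function names, the smoothed IDF formula, and the 0.3 threshold are all hypothetical choices made for this sketch. The `precision` helper shows how a figure such as the reported 79% would be computed from a manually labeled set of matches.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight) per document.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_reviews_to_issues(reviews, issues, threshold=0.3):
    """Pair each review with its most similar issue report.

    A pair (review_index, issue_index) is kept only if the similarity
    reaches the threshold (a hypothetical value for this sketch).
    """
    vectors = tfidf_vectors(reviews + issues)
    review_vecs = vectors[:len(reviews)]
    issue_vecs = vectors[len(reviews):]
    matches = []
    for r_idx, r_vec in enumerate(review_vecs):
        best_idx, best_score = None, 0.0
        for i_idx, i_vec in enumerate(issue_vecs):
            score = cosine(r_vec, i_vec)
            if score > best_score:
                best_idx, best_score = i_idx, score
        if best_idx is not None and best_score >= threshold:
            matches.append((r_idx, best_idx))
    return matches


def precision(predicted, relevant):
    """Fraction of predicted pairs that are correct: TP / (TP + FP)."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(relevant)) / len(predicted)


if __name__ == "__main__":
    reviews = ["App crashes when opening the camera", "Love the dark theme"]
    issues = ["Crash when opening camera screen", "Add export to PDF"]
    print(match_reviews_to_issues(reviews, issues))
```

A real pipeline would likely add preprocessing such as stemming, spell correction, and language filtering before computing similarity; those steps are omitted here to keep the sketch short.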

Acknowledgments

We thank the anonymous reviewers who reviewed our paper and the associate editor for their valuable feedback.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Queen’s University, Kingston, Canada
Ehsan Noei & Ying Zou
School of Computing, Queen’s University, Kingston, Canada
Feng Zhang
Department of Informatics, New Jersey Institute of Technology, Newark, NJ, USA
Shaohua Wang

Corresponding author

Correspondence to Ehsan Noei.

Additional information

Communicated by: Miryung Kim

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Noei, E., Zhang, F., Wang, S. et al. Towards prioritizing user-related issue reports of mobile applications. Empir Software Eng 24, 1964–1996 (2019). https://doi.org/10.1007/s10664-019-09684-y

Published: 29 January 2019
Issue Date: 15 August 2019
DOI: https://doi.org/10.1007/s10664-019-09684-y
