DOI: 10.1145/3084226.3084241
research-article

Automatic Classification of Non-Functional Requirements from Augmented App User Reviews

Published: 15 June 2017

Abstract

Context: The leading App distribution platforms (Apple App Store, Google Play, and Windows Phone Store) have over 4 million Apps. Research shows that user reviews contain abundant useful information that may help developers improve their Apps. Non-Functional Requirements (NFRs) describe the quality attributes wanted for an App and are hidden in user reviews; extracting and considering them can help developers deliver a product that meets users' expectations.

Objective: Developers need to be aware of the NFRs contained in massive numbers of user reviews during software maintenance and evolution. Automatic classification of user reviews based on an NFR standard provides a feasible way to achieve this goal.

Method: In this paper, user reviews were automatically classified into four types of NFRs (reliability, usability, portability, and performance), Functional Requirements (FRs), and Others. We combined four classification techniques (BoW, TF-IDF, CHI2, and AUR-BoW, the last of which is proposed in this work) with three machine learning algorithms (Naïve Bayes, J48, and Bagging) to classify user reviews. We conducted experiments to compare the F-measures of the classification results across all combinations of the techniques and algorithms.

Results: We found that the combination of AUR-BoW with Bagging achieves the best result (a precision of 71.4%, a recall of 72.3%, and an F-measure of 71.8%) among all the combinations.

Conclusion: Our findings show that augmented user reviews can lead to better classification results, and that the machine learning algorithm Bagging is more suitable for NFR classification from user reviews than Naïve Bayes and J48.
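To make the Method and Results concrete, here is a minimal sketch of one of the simpler combinations the abstract lists: a bag-of-words (BoW) representation fed to a Naïve Bayes classifier. It is not the authors' implementation and does not cover AUR-BoW or Bagging. The toy reviews, their labels, and the names `bag_of_words`, `NaiveBayes`, and `f_measure` are all hypothetical, chosen only for illustration. The abstract does not state which F-measure variant was used; assuming the usual balanced F-measure (the harmonic mean of precision and recall), the reported 71.4% precision and 72.3% recall do give the reported 71.8%.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """Return a BoW representation of a review: token -> occurrence count."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reviews, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(reviews, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word, n in bag_of_words(text).items():
                if word in self.vocab:  # words never seen in training are ignored
                    score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy reviews (not from the paper's dataset), covering three of
# the NFR categories named in the abstract.
TRAIN = [
    ("the app crashes every time i open it", "reliability"),
    ("keeps freezing and crashes after the update", "reliability"),
    ("the menu is confusing and hard to use", "usability"),
    ("buttons are too small and hard to tap", "usability"),
    ("very slow to load and drains battery", "performance"),
    ("loading takes forever and the app is slow", "performance"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
predicted = model.predict("app crashes after the latest update")
print(predicted)

# Consistency check of the figures reported in Results.
print(round(f_measure(0.714, 0.723), 3))
```

The other techniques named in the Method would replace `bag_of_words` with a different feature step (for example TF-IDF weighting or chi-squared feature selection), and Bagging would train several classifiers on bootstrap resamples of the training data and combine their votes.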

Published In

EASE '17: Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering
June 2017
405 pages
ISBN: 9781450348041
DOI: 10.1145/3084226
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

  • School of Computing, BTH: Blekinge Institute of Technology - School of Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Automatic Classification
  2. Non-Functional Requirements
  3. Textual Semantics
  4. User Reviews

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EASE'17

Acceptance Rates

Overall Acceptance Rate 71 of 232 submissions, 31%
