ABSTRACT
Requirements engineering (RE) is the process of defining, documenting, and maintaining software requirements. Crowd-based RE (CrowdRE) involves large-scale user participation in requirements engineering tasks. It improves the quality of software requirements and helps reduce cost. Manually extracting useful insights from the large body of unstructured and noisy natural language data produced during CrowdRE is expensive, error-prone, and time-consuming. Automated techniques are therefore required for processing CrowdRE data. We focus on the problem of automatically classifying crowd-based software requirements into sectors and propose three approaches, based on supervised machine learning (ML) models, neural networks, and bidirectional encoder representations from transformers (BERT), respectively. We apply these approaches to a large requirements document: a CrowdRE dataset of around 3,000 crowd-generated requirements for smart home applications. To evaluate the quality of our classification algorithms, we compute precision, recall, and F-score against the publicly available ground truth data. We compare the performance of several classification algorithms, and our detailed experiments indicate that these algorithms can be very useful for categorizing crowd-based requirements into sectors.
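As an illustration of the evaluation step described above, the following minimal Python sketch computes per-sector precision, recall, and F-score by comparing predicted labels against ground truth labels, then macro-averages them. It is not the authors' evaluation code: the sector names, the sample labels, the function names `per_class_scores` and `macro_average`, and the choice of macro-averaging are all assumptions made for this example.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for every label seen."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for sector g
        else:
            fp[p] += 1   # sector p was predicted but is wrong
            fn[g] += 1   # sector g was missed

    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 over all labels."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical sector labels, for illustration only.
gold = ["energy", "health", "safety", "energy", "health", "safety"]
predicted = ["energy", "health", "energy", "energy", "safety", "safety"]

scores = per_class_scores(gold, predicted)
macro_p, macro_r, macro_f1 = macro_average(scores)

for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"macro: precision={macro_p:.3f} recall={macro_r:.3f} f1={macro_f1:.3f}")
```

Macro-averaging weights every sector equally regardless of how many requirements it contains; a weighted or micro average may be more appropriate when sectors are imbalanced.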
Index Terms
- Sector classification for crowd-based software requirements