DOI: 10.1145/3350768.3350776
research-article

Software Engineering Repositories: Expanding the PROMISE Database

Published: 23 September 2019

Abstract

Defining and classifying software requirements are critical tasks for determining a system's functionality and overall architecture. Accordingly, several lines of research aim to automate the classification of software requirements using machine learning algorithms. The feasibility of such studies, however, is limited by the lack of a public dataset that is adequate in both the quantity and the quality of its sample requirements. The PROMISE corpus is widely used for this task, but its number of requirements is considered low for practical machine learning applications. This research presents an expansion of the PROMISE corpus. New software requirements were incorporated, and the resulting dataset was evaluated with well-known machine learning algorithms. We observed some improvement in the performance of these algorithms in identifying some types of software requirements.
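To make the classification task concrete, the following is a minimal sketch of a bag-of-words multinomial naive Bayes classifier that labels a requirement as functional ("F") or non-functional ("NF"). It is not the authors' pipeline: the abstract does not name the algorithms used, the six training sentences are hypothetical and are not taken from the PROMISE corpus, and the names `TRAIN`, `tokenize`, `train`, and `predict` are introduced here purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy requirements; NOT taken from the PROMISE corpus.
TRAIN = [
    ("The system shall allow users to reset their password", "F"),
    ("The system shall export reports as PDF files", "F"),
    ("The system shall let administrators delete user accounts", "F"),
    ("The system shall respond to searches within two seconds", "NF"),
    ("The interface shall be easy to learn for new users", "NF"),
    ("The system shall be available 99 percent of the time", "NF"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(samples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior of the class.
        score = math.log(n_docs / total_docs)
        # Denominator for add-one (Laplace) smoothing.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
```

With so few training sentences the predictions are driven by a handful of words, which illustrates why a corpus with a low number of requirements is a practical limitation for this kind of model.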

    Published In

    SBES '19: Proceedings of the XXXIII Brazilian Symposium on Software Engineering
    September 2019
    583 pages
    ISBN: 9781450376518
    DOI: 10.1145/3350768
    In-Cooperation

    • SBC: Sociedade Brasileira de Computação

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. machine learning
    2. requirements classification
    3. software repositories

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
    • Fundação de Amparo à Pesquisa do Estado do Amazonas

    Conference

    SBES 2019

    Acceptance Rates

    SBES '19 Paper Acceptance Rate 67 of 153 submissions, 44%;
    Overall Acceptance Rate 147 of 427 submissions, 34%

    Cited By

    • (2025) Enhancing Software Sustainability: Leveraging Large Language Models to Evaluate Security Requirements Fulfillment in Requirements Engineering. Systems 13:2 (114). DOI: 10.3390/systems13020114. Online publication date: 12-Feb-2025
    • (2025) Machine Learning for Requirements Classification. Handbook on Natural Language Processing for Requirements Engineering (19-59). DOI: 10.1007/978-3-031-73143-3_2. Online publication date: 6-Mar-2025
    • (2024) Detecting Ambiguities in Requirement Documents Written in Arabic Using Machine Learning Algorithms. International Journal of Cloud Applications and Computing 14:1 (1-19). DOI: 10.4018/IJCAC.339563. Online publication date: 9-Apr-2024
    • (2024) Unveiling the Correlation between Nonfunctional Requirements and Sustainable Environmental Factors Using a Machine Learning Model. Sustainability 16:14 (5901). DOI: 10.3390/su16145901. Online publication date: 11-Jul-2024
    • (2024) Enhancing Software Requirements Classification with Semisupervised GAN‐BERT Technique. Journal of Electrical and Computer Engineering 2024:1. DOI: 10.1155/2024/4955691. Online publication date: 30-Jul-2024
    • (2024) Exploring the Use of Large Language Models in Requirements Engineering Education: An Experience Report with ChatGPT 3.5. Proceedings of the XXIII Brazilian Symposium on Software Quality (624-634). DOI: 10.1145/3701625.3701687. Online publication date: 5-Nov-2024
    • (2024) A Systematic Review of AI-Enabled Frameworks in Requirements Elicitation. IEEE Access 12 (154310-154336). DOI: 10.1109/ACCESS.2024.3475293. Online publication date: 2024
    • (2024) EnsCL-CatBoost: A Strategic Framework for Software Requirements Classification. IEEE Access 12 (127614-127628). DOI: 10.1109/ACCESS.2024.3452011. Online publication date: 2024
    • (2024) A deep learning framework for non-functional requirement classification. Scientific Reports 14:1. DOI: 10.1038/s41598-024-52802-0. Online publication date: 8-Feb-2024
    • (2024) Extracting goal models from natural language requirement specifications. Journal of Systems and Software 211 (111981). DOI: 10.1016/j.jss.2024.111981. Online publication date: May-2024