Efficiency of Large Language Models to scale up Ground Truth: Overview of the IRSE Track at Forum for Information Retrieval 2023

Authors:

Srijoni Majumdar,

Ayan Bandyopadhyay,

Samiran Chattopadhyay,

Prasenjit Majumder

FIRE '23: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation

Pages 16 - 18

https://doi.org/10.1145/3632754.3633480

Published: 12 February 2024

Abstract

The Information Retrieval in Software Engineering (IRSE) track aims to develop solutions for the automated evaluation of code comments within a machine learning framework, using labels generated by both humans and large language models. The track poses a binary classification task: labeling each comment as useful or not useful. The dataset comprises 9,048 pairs of code comments and surrounding code snippets drawn from open-source C-based projects on GitHub, together with an additional dataset generated by the participating teams using large language models. In total, 17 teams from various universities and software companies contributed 56 experiments. The experiments were assessed quantitatively, primarily by F1-score, and qualitatively, based on the features developed, the supervised learning models employed, and their hyperparameters. Notably, labels generated by large language models introduce bias into the prediction model but lead to less over-fitted results.
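As an illustration of the primary metric named above, the following is a minimal sketch of how an F1-score can be computed for a binary useful / not useful classifier. The label encoding (1 for useful, 0 for not useful), the function name, and the toy labels are assumptions made for this example; this is not the track's evaluation script and the numbers are not track results.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return the F1-score of the positive class for a binary task.

    Assumed encoding (not from the paper): 1 = useful, 0 = not useful.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # True positives: useful comments predicted as useful.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: not-useful comments predicted as useful.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: useful comments predicted as not useful.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # Precision or recall is zero (or undefined), so F1 is taken as 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six comments, purely for illustration.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 combines precision and recall, a system cannot score well simply by predicting the majority class for every comment.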

Published In

FIRE '23: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation

December 2023

170 pages

ISBN: 9798400716324

DOI: 10.1145/3632754

Copyright © 2023 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Extended-abstract
Research
Refereed limited

Conference

FIRE 2023

FIRE 2023: Forum for Information Retrieval Evaluation

December 15 - 18, 2023

Panjim, India

Acceptance Rates

Overall Acceptance Rate 19 of 64 submissions, 30%
