Code template inference using language models

Authors: Ferosh Jacob, Robert Tairas

ACMSE '10: Proceedings of the 48th annual ACM Southeast Conference

Article No.: 104, Pages 1 - 6

https://doi.org/10.1145/1900008.1900143

Published: 15 April 2010

Abstract

This paper investigates the use of a natural language processing technique that automatically detects project-specific code templates (i.e., frequently used code blocks), which can be made available to software developers within an integrated development environment. During software development, programmers often, and in some cases unknowingly, rewrite the same code block that represents some functionality. These frequently used code blocks can indicate the existence and possible use of code templates. Many existing code editors support code templates, but programmers are expected to define these templates manually and then add them to the editor. Furthermore, editor support for providing templates based on the editing context is still limited. The use of n-gram language models within the context of software development is described and evaluated as a way to overcome these restrictions. The technique can search for project-specific code templates and present these templates to the programmer based on the current editing context.
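As a rough illustration of the idea in the abstract, the following Python sketch counts token n-grams across a few source snippets, lists the most frequent ones as template candidates, and ranks likely next tokens for a given editing context. The abstract does not describe the paper's tokenization, n-gram order, smoothing, or thresholds, so everything here is an assumption made for illustration: the regular-expression tokenizer, the choice of trigrams, the maximum-likelihood ranking, and the function names `tokenize`, `count_ngrams`, `frequent_templates`, and `suggest_next` are all hypothetical.

```python
import re
from collections import Counter

# Assumed tokenizer: identifiers, integer literals, or any single
# non-space symbol. The paper's actual tokenization is not given.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Split source text into a flat list of tokens."""
    return TOKEN_RE.findall(source)


def count_ngrams(token_lists, n):
    """Count every run of n consecutive tokens across all token lists."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def frequent_templates(counts, min_count):
    """Return (ngram, count) pairs seen at least min_count times."""
    return [(ng, c) for ng, c in counts.most_common() if c >= min_count]


def suggest_next(counts, context):
    """Rank next tokens by relative frequency after the given context.

    `context` must hold n-1 tokens for n-grams of order n. This is a
    plain maximum-likelihood estimate with no smoothing.
    """
    context = tuple(context)
    followers = Counter()
    for ngram, c in counts.items():
        if ngram[:-1] == context:
            followers[ngram[-1]] += c
    total = sum(followers.values())
    return [(tok, c / total) for tok, c in followers.most_common()]


# Hypothetical "project" made of three similar loops.
corpus = [
    "for (int i = 0; i < n; i++) { sum += a[i]; }",
    "for (int i = 0; i < m; i++) { total += b[i]; }",
    "for (int j = 0; j < n; j++) { print(j); }",
]
counts = count_ngrams([tokenize(src) for src in corpus], n=3)

for ngram, c in frequent_templates(counts, min_count=3):
    print(c, " ".join(ngram))

print(suggest_next(counts, ["(", "int"]))
```

In this toy corpus the trigram `for ( int` occurs in all three snippets, so it surfaces as a candidate, and the context `( int` ranks `i` ahead of `j`. A practical tool would likely need longer n-grams, identifier normalization, and smoothing for unseen contexts, none of which are attempted here.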

Index Terms

Code template inference using language models
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
      2. Natural language generation
2. Software and its engineering
  1. Software notations and tools
    1. Development frameworks and environments

Published In

ACMSE '10: Proceedings of the 48th annual ACM Southeast Conference

April 2010

488 pages

ISBN: 9781450300643

DOI: 10.1145/1900008

Conference Chair: H. Conrad Cunningham (University of Mississippi)

Program Chairs: Paul Ruth, Nicholas A. Kraft

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

ACM SE '10: ACM Southeast Regional Conference

April 15 - 17, 2010

Oxford, Mississippi

Acceptance Rates

ACMSE '10 Paper Acceptance Rate 48 of 94 submissions, 51%;

Overall Acceptance Rate 502 of 1,023 submissions, 49%
