Code completion of multiple keywords from abbreviated input

Han, Sangmok; Wallace, David R.; Miller, Robert C.

doi:10.1007/s10515-011-0083-2

Published: 12 April 2011

Volume 18, pages 363–398, (2011)
Automated Software Engineering

Abstract

Abbreviation Completion is a novel technique that improves the efficiency of code writing by completing multiple keywords at once from abbreviated input that need not be predefined. This differs from conventional code completion, which finds one keyword at a time based on an exact character match. Abbreviated input, consisting of abbreviated keywords and the non-alphanumeric characters between them (e.g. pb st nm), is expanded into a full expression (e.g. public String name) by a Hidden Markov Model learned from a corpus of existing code and abbreviation examples. The technique does not require the user to memorize abbreviations and provides incremental feedback on the most likely completions.
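
To make the expansion step concrete, here is a minimal sketch of Viterbi decoding over a toy Hidden Markov Model, in which hidden states are keywords and observations are abbreviations. It is not the authors' implementation. The corpus, the subsequence-based `emission` score, the add-one smoothing, and all function names are assumptions made for illustration, and the sketch ignores the non-alphanumeric characters that the real model also handles.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; each code line is already split into keywords.
CORPUS = [
    ["public", "String", "name"],
    ["public", "String", "getName"],
    ["private", "String", "name"],
    ["public", "static", "int", "size"],
    ["private", "int", "number"],
]

def is_subsequence(abbr, word):
    """True if the letters of abbr occur in word in order (case-insensitive)."""
    it = iter(word.lower())
    return all(ch in it for ch in abbr.lower())

def emission(abbr, word):
    """Assumed emission score, not taken from the paper: zero unless abbr
    shares the first letter of word and is a subsequence of it; otherwise
    the fraction of the word's letters that the abbreviation covers."""
    if not abbr or abbr[0].lower() != word[0].lower():
        return 0.0
    if not is_subsequence(abbr, word):
        return 0.0
    return len(abbr) / len(word)

def train(corpus):
    """Estimate keyword-to-keyword transitions with add-one smoothing."""
    vocab = sorted({w for line in corpus for w in line})
    counts = defaultdict(Counter)
    for line in corpus:
        prev = "<s>"  # start-of-line marker
        for w in line:
            counts[prev][w] += 1
            prev = w

    def trans(prev, w):
        total = sum(counts[prev].values())
        return (counts[prev][w] + 1) / (total + len(vocab))

    return vocab, trans

def expand(abbrs, vocab, trans):
    """Viterbi search: most likely keyword sequence for the abbreviations."""
    # best maps the last keyword to (log score, path) of the best sequence.
    best = {"<s>": (0.0, [])}
    for abbr in abbrs:
        nxt = {}
        for w in vocab:
            e = emission(abbr, w)
            if e == 0.0:
                continue
            candidates = [
                (score + math.log(trans(prev, w)) + math.log(e), path + [w])
                for prev, (score, path) in best.items()
            ]
            nxt[w] = max(candidates, key=lambda c: c[0])
        if not nxt:
            return None  # no keyword in the vocabulary matches this abbreviation
        best = nxt
    return max(best.values(), key=lambda c: c[0])[1]

vocab, trans = train(CORPUS)
print(" ".join(expand(["pb", "st", "nm"], vocab, trans)))  # public String name
```

In this sketch "st" matches both String and static, and the transition statistics from the corpus break the tie in favor of String after public. Keeping one best path per ending keyword is what lets the search consider whole sequences rather than one keyword at a time.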

In addition to code completion by disabbreviation of multiple keywords, Abbreviation Completion supports prediction of the next keywords and non-alphanumeric characters of a code completion candidate, a technique called code completion by extrapolation. The system finds the most likely next keywords and non-alphanumeric characters using an n-gram model of the programming language. This enables a scenario in which a user first types a short abbreviated expression to complete the beginning of a desired full expression and then uses the extrapolation feature to complete the rest without further typing.
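
The extrapolation idea can be illustrated with a small bigram (2-gram) model over code tokens. This is only a sketch under stated assumptions: the toy corpus, the choice of n = 2, the greedy one-token-at-a-time extension, and the names `train_bigrams` and `extrapolate` are all hypothetical and are not drawn from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of tokenized code lines, including
# non-alphanumeric tokens such as "(" and ";".
CORPUS = [
    ["for", "(", "int", "i", "=", "0", ";"],
    ["for", "(", "int", "j", "=", "0", ";"],
    ["for", "(", "int", "i", "=", "1", ";"],
    ["int", "i", "=", "0", ";"],
]

def train_bigrams(corpus):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for line in corpus:
        for prev, nxt in zip(line, line[1:]):
            counts[prev][nxt] += 1
    return counts

def extrapolate(tokens, counts, steps=2):
    """Greedily append the most frequent follower of the last token."""
    out = list(tokens)
    if not out:
        return out
    for _ in range(steps):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last token was never seen with a follower
        out.append(followers.most_common(1)[0][0])
    return out

counts = train_bigrams(CORPUS)
print(extrapolate(["for", "("], counts, steps=2))
# ['for', '(', 'int', 'i']
```

A greedy bigram extension is the simplest possible choice. A longer context or a search over several candidate continuations would be natural refinements, at the cost of more bookkeeping.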

This paper presents the algorithm for abbreviation completion, integrated with a new user interface for multiple-keyword code completion. We tested the system by sampling 4919 code lines from open source projects and found that more than 99% of the code lines could be resolved from acronym-like abbreviations. The system could also extrapolate code completion candidates to complete the next one or two keywords with accuracies of 96% and 82%, respectively. A user study of code completion by disabbreviation found a 30% reduction in time and a 41% reduction in keystrokes compared with conventional code completion.

Author information

Authors and Affiliations

MIT, 77 Massachusetts Avenue, Cambridge, MA, 02139, USA
Sangmok Han, David R. Wallace & Robert C. Miller

Corresponding author

Correspondence to Sangmok Han.

Additional information

This research was supported by the Samsung Scholarship Foundation.

About this article

Cite this article

Han, S., Wallace, D.R. & Miller, R.C. Code completion of multiple keywords from abbreviated input. Autom Softw Eng 18, 363–398 (2011). https://doi.org/10.1007/s10515-011-0083-2

Received: 16 June 2010
Accepted: 23 March 2011
Published: 12 April 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s10515-011-0083-2
