Code completion of multiple keywords from abbreviated input

Automated Software Engineering

Abstract

Abbreviation completion is a novel technique that improves the efficiency of code writing by completing multiple keywords from non-predefined abbreviated input. This differs from conventional code completion, which finds one keyword at a time based on an exact character match. Abbreviated input, consisting of abbreviated keywords separated by non-alphanumeric characters (e.g. pb st nm), is expanded into a full expression (e.g. public String name) by a Hidden Markov Model learned from a corpus of existing code and abbreviation examples. The technique does not require the user to memorize abbreviations, and it provides incremental feedback on the most likely completions.
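The expansion step can be illustrated with a minimal sketch of Viterbi decoding over an HMM-style model, in which the hidden states are keywords and the observations are abbreviations. Everything below is an assumption made for illustration: the vocabulary, the transition probabilities, and the `emission` scoring rule (an unnormalized subsequence-match score) are hypothetical and hand-set, whereas the system described here learns its model from a corpus of code and abbreviation examples.

```python
import math

# Hypothetical transition probabilities P(next keyword | previous keyword).
# "<s>" marks the start of a line. The numbers are invented for illustration.
TRANSITIONS = {
    "<s>": {"public": 0.6, "private": 0.3, "String": 0.1},
    "public": {"String": 0.5, "static": 0.4, "name": 0.1},
    "private": {"String": 0.6, "static": 0.3, "name": 0.1},
    "static": {"String": 0.7, "name": 0.3},
    "String": {"name": 0.8, "static": 0.2},
    "name": {},
}


def is_subsequence(abbr, word):
    """True if the letters of abbr appear in word in the same order."""
    letters = iter(word.lower())
    return all(ch in letters for ch in abbr.lower())


def emission(abbr, word):
    """Hypothetical, unnormalized score for typing abbr when meaning word.

    Zero unless abbr starts with the word's first letter and is a
    subsequence of it; otherwise the fraction of the word that was typed.
    """
    if not abbr or abbr[0].lower() != word[0].lower():
        return 0.0
    if not is_subsequence(abbr, word):
        return 0.0
    return len(abbr) / len(word)


def expand(abbrs):
    """Return the most likely keyword sequence for a list of abbreviations."""
    # best maps the last keyword of a partial path to (log score, path).
    best = {"<s>": (0.0, [])}
    for abbr in abbrs:
        step = {}
        for prev, (score, path) in best.items():
            for word, p_trans in TRANSITIONS.get(prev, {}).items():
                p_emit = emission(abbr, word)
                if p_emit == 0.0:
                    continue
                cand = score + math.log(p_trans) + math.log(p_emit)
                if word not in step or cand > step[word][0]:
                    step[word] = (cand, path + [word])
        best = step
    if not best:
        return None  # no keyword sequence matches the input
    return max(best.values(), key=lambda entry: entry[0])[1]


print(expand(["pb", "st", "nm"]))
```

In this toy model, "st" matches both `String` and `static`; the transition probabilities are what break the tie, which is the role the language context plays in resolving ambiguous abbreviations.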

In addition to code completion by disabbreviation of multiple keywords, abbreviation completion supports prediction of the keywords and non-alphanumeric characters that follow a code completion candidate, a technique called code completion by extrapolation. The system finds the most likely next keywords and non-alphanumeric characters using an n-gram model of the programming language. This enables a code completion scenario in which a user first types a short abbreviated expression to complete the beginning of a desired full expression and then uses the extrapolation feature to complete the remainder without further typing.
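A rough sketch of extrapolation, assuming the simplest case of a bigram model (n = 2) with greedy selection of the most frequent follower. The corpus, the whitespace tokenization, and the function names `train_bigrams` and `extrapolate` are hypothetical and chosen only to illustrate the idea; they do not reproduce the model order or the candidate ranking used by the system.

```python
from collections import Counter, defaultdict

# Tiny hypothetical corpus of pre-tokenized code lines.
CORPUS = [
    "public String name ;",
    "public String getName ( )",
    "private String name ;",
    "private int count ;",
]


def train_bigrams(lines):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for line in lines:
        tokens = line.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def extrapolate(counts, prefix, steps=2):
    """Greedily extend prefix by up to `steps` most likely next tokens."""
    tokens = list(prefix)
    for _ in range(steps):
        followers = counts.get(tokens[-1]) if tokens else None
        if not followers:
            break  # nothing observed after the last token
        tokens.append(followers.most_common(1)[0][0])
    return tokens


counts = train_bigrams(CORPUS)
print(extrapolate(counts, ["public", "String"], steps=2))
```

Here the prefix `public String` stands in for the output of the disabbreviation step, and the two extrapolated tokens correspond to completing the next one or two keywords without further typing.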

This paper presents the algorithm for abbreviation completion, integrated with a new user interface for multiple-keyword code completion. We tested the system by sampling 4919 code lines from open source projects and found that more than 99% of the code lines could be resolved from acronym-like abbreviations. The system could also extrapolate code completion candidates to complete the next one or two keywords with accuracies of 96% and 82%, respectively. A user study of code completion by disabbreviation found a 30% reduction in time and a 41% reduction in keystrokes compared with conventional code completion.

Author information

Corresponding author

Correspondence to Sangmok Han.

Additional information

This research was supported by Samsung Scholarship Foundation.

About this article

Cite this article

Han, S., Wallace, D.R. & Miller, R.C. Code completion of multiple keywords from abbreviated input. Autom Softw Eng 18, 363–398 (2011). https://doi.org/10.1007/s10515-011-0083-2

  • DOI: https://doi.org/10.1007/s10515-011-0083-2
