Abstract
Abbreviation Completion is a novel technique that improves the efficiency of code writing by completing multiple keywords at once from non-predefined abbreviated input. This differs from conventional code completion, which finds one keyword at a time based on an exact character match. Abbreviated input, consisting of abbreviated keywords and the non-alphanumeric characters between them (e.g. pb st nm), is expanded into a full expression (e.g. public String name) by a Hidden Markov Model learned from a corpus of existing code and abbreviation examples. The technique does not require the user to memorize abbreviations and provides incremental feedback on the most likely completions.
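As a rough illustration of the expansion step, the sketch below decodes an abbreviated input with a toy Hidden Markov Model using Viterbi search. It is not the paper's implementation: the keyword list, the bigram counts, the `emission` scoring rule (a first-letter-anchored subsequence match) and the add-one smoothing are all assumptions made for this example, and the input is simply split on whitespace rather than on arbitrary non-alphanumeric characters.

```python
import math

# Hypothetical toy vocabulary and bigram counts (not taken from the paper's corpus).
KEYWORDS = ["public", "private", "String", "static", "name", "number"]
BIGRAMS = {
    ("<s>", "public"): 8, ("<s>", "private"): 4,
    ("public", "String"): 5, ("public", "static"): 3,
    ("private", "String"): 2, ("private", "static"): 1,
    ("String", "name"): 6, ("String", "number"): 1,
    ("static", "name"): 1, ("static", "number"): 1,
}


def is_subsequence(abbr, word):
    """True if the letters of abbr appear in word in the same order."""
    letters = iter(word.lower())
    return all(ch in letters for ch in abbr.lower())


def emission(abbr, word):
    """Assumed emission score: zero unless abbr starts with the keyword's
    first letter and is a subsequence of it; otherwise the length ratio."""
    if not abbr or abbr[0].lower() != word[0].lower():
        return 0.0
    if not is_subsequence(abbr, word):
        return 0.0
    return len(abbr) / len(word)


def transition(prev, word):
    """Bigram transition probability with add-one smoothing (an assumption)."""
    total = sum(count for (p, _), count in BIGRAMS.items() if p == prev)
    return (BIGRAMS.get((prev, word), 0) + 1) / (total + len(KEYWORDS))


def expand(abbreviated):
    """Viterbi decoding: return the most likely keyword sequence, or None."""
    best = {"<s>": (0.0, [])}  # last state -> (log score, best path so far)
    for abbr in abbreviated.split():
        candidates = {}
        for word in KEYWORDS:
            e = emission(abbr, word)
            if e == 0.0:
                continue
            for prev, (score, path) in best.items():
                s = score + math.log(transition(prev, word)) + math.log(e)
                if word not in candidates or s > candidates[word][0]:
                    candidates[word] = (s, path + [word])
        if not candidates:
            return None  # no keyword can explain this abbreviation
        best = candidates
    return " ".join(max(best.values(), key=lambda item: item[0])[1])


print(expand("pb st nm"))
```

With these toy counts, `st` resolves to `String` rather than `static` because `public String` is the more frequent bigram, which shows how the surrounding context disambiguates an abbreviation that matches several keywords.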
In addition to code completion by disabbreviation of multiple keywords, abbreviation completion supports prediction of the next keywords and non-alphanumeric characters of a code completion candidate, a technique called code completion by extrapolation. The system finds the most likely next keywords and non-alphanumeric characters using an n-gram model of the programming language. This enables a code completion scenario in which a user first types a short abbreviated expression to complete the beginning of a desired full expression and then uses the extrapolation feature to complete the remainder without further typing.
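Extrapolation can be sketched in a similar spirit. The example below trains a bigram model (the n = 2 case of an n-gram model) on a tiny hypothetical corpus and greedily appends the most frequent next token. The corpus, the function names `train_bigrams` and `extrapolate`, and the greedy strategy are assumptions for illustration only; the paper's system may rank and combine candidates differently.

```python
from collections import Counter, defaultdict

# Hypothetical training lines, already tokenized by whitespace.
CORPUS = [
    "public static void main ( String [ ] args )",
    "public static void main ( )",
    "public static int count ( )",
    "private static void run ( )",
]


def train_bigrams(lines):
    """Count how often each token follows each preceding token."""
    model = defaultdict(Counter)
    for line in lines:
        tokens = line.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def extrapolate(model, prefix, steps=2):
    """Greedily extend prefix by up to `steps` most likely next tokens."""
    tokens = prefix.split()
    if not tokens:
        return prefix
    for _ in range(steps):
        followers = model.get(tokens[-1])
        if not followers:
            break  # the last token was never seen with a successor
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


model = train_bigrams(CORPUS)
print(extrapolate(model, "public static", steps=2))
```

In the scenario described above, the prefix passed to `extrapolate` would be the expression already produced by disabbreviation, so the user extends it without typing further abbreviations.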
This paper presents the algorithm for abbreviation completion, integrated with a new user interface for multiple-keyword code completion. We tested the system by sampling 4919 lines of code from open-source projects and found that more than 99% of the lines could be resolved from acronym-like abbreviations. The system could also extrapolate code completion candidates to complete the next one or two keywords with accuracies of 96% and 82%, respectively. A user study of code completion by disabbreviation found a 30% reduction in time and a 41% reduction in keystrokes compared with conventional code completion.
Additional information
This research was supported by the Samsung Scholarship Foundation.
Cite this article
Han, S., Wallace, D.R. & Miller, R.C. Code completion of multiple keywords from abbreviated input. Autom Softw Eng 18, 363–398 (2011). https://doi.org/10.1007/s10515-011-0083-2