DOI: 10.1145/3358502.3361270

Ambiguous, informal, and unsound: metaprogramming for naturalness

Published: 20 October 2019

Abstract

Program code needs to be understood by both machines and programmers. While executing programs requires the unambiguity of a formal language, programmers use natural language within these formal constraints to explain implemented concepts to each other. This so-called naturalness, the property of programs to resemble human communication, has motivated many statistical and machine learning (ML) approaches that aim to improve software engineering activities.
The metaprogramming facilities of most programming environments model the formal elements of a program (meta-objects). When ML is used to support engineering or analysis tasks, three problems follow: complex infrastructure is needed to bridge the gap between meta-objects and ML models, program changes are not reflected in the ML model, and mapping an ML output back into the program’s meta-object domain is laborious.
In this work, we propose extending metaprogramming facilities to give tool developers access to the representations of program elements within an exchangeable ML model. We demonstrate the usefulness of this abstraction in two case studies on test prioritization and refactoring. We conclude that aligning ML representations with the program’s formal structure lowers the barrier to exploiting statistical properties in tool development.
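To make the proposed abstraction more concrete, the following is a minimal sketch, not the authors' implementation or API. It assumes a hypothetical `MethodMirror` meta-object that exposes its representation through a pluggable `RepresentationModel`, and it uses a simple bag of identifier parts as a stand-in for a real ML model. The `prioritize` helper is likewise hypothetical and only hints at how a lexical test prioritization tool might rank tests against a changed method.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Mapping, Protocol


class RepresentationModel(Protocol):
    """Hypothetical interface: any model that maps source text to a sparse vector."""

    def embed(self, source: str) -> Mapping[str, float]: ...


class IdentifierBagModel:
    """Stand-in model (an assumption): term frequencies of identifier parts."""

    def embed(self, source: str) -> Mapping[str, float]:
        # Split snake_case and camelCase names into lowercase word parts.
        parts = re.findall(r"[A-Za-z][a-z]*", source)
        return Counter(part.lower() for part in parts)


@dataclass
class MethodMirror:
    """Hypothetical meta-object for a method, aligned with an exchangeable model."""

    name: str
    source: str
    model: RepresentationModel

    def representation(self) -> Mapping[str, float]:
        # Computed on demand, so edits to `source` are reflected immediately.
        return self.model.embed(self.source)

    def similarity(self, other: MethodMirror) -> float:
        # Cosine similarity between the two sparse representations.
        a, b = self.representation(), other.representation()
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prioritize(changed: MethodMirror, tests: list[MethodMirror]) -> list[MethodMirror]:
    """Order tests so those most similar to the changed method come first."""
    return sorted(tests, key=changed.similarity, reverse=True)


model = IdentifierBagModel()
changed = MethodMirror(
    "parse_date",
    "def parse_date(text): return date_from_parts(text.split('-'))",
    model,
)
tests = [
    MethodMirror(
        "test_render_html",
        "def test_render_html(): assert render_html(page) == '<p>hi</p>'",
        model,
    ),
    MethodMirror(
        "test_parse_date",
        "def test_parse_date(): assert parse_date('2019-10-20').year == 2019",
        model,
    ),
]
print([t.name for t in prioritize(changed, tests)])
# → ['test_parse_date', 'test_render_html']
```

Because the tool code only talks to the mirror, the stand-in model could in principle be swapped for a topic model or an embedding model without changing `prioritize`, which is the kind of exchangeability the abstract describes.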

Cited By

  • A theory of dual channel constraints. 2020. Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results, 25–28. DOI: 10.1145/3377816.3381720. Online publication date: 27 June 2020.

Published In

META 2019: Proceedings of the 4th ACM SIGPLAN International Workshop on Meta-Programming Techniques and Reflection
October 2019
39 pages
ISBN:9781450369855
DOI:10.1145/3358502
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. machine learning
  2. meta-objects
  3. metaprogramming
  4. naturalness

Qualifiers

  • Research-article

Conference

SPLASH '19