DOI: 10.1145/3167110
research-article

Towards concept-aware programming environments for guiding software modularity

Published: 22 October 2017

Abstract

To design and implement a program, programmers choose analogies and metaphors to explain and understand programmatic concepts. In source code, these manifest themselves as a particular choice of names. During program comprehension, reading such names is an important starting point for understanding the meaning of modules and for guiding the exploration process.
On the one hand, understanding a program in depth by looking for names that suggest a particular analogy can be a time-consuming process. On the other hand, a lack of awareness of which concepts are present and which analogies have been chosen can lead to modularity issues, such as redundancy and architectural drift, if concepts are misaligned with the current module decomposition.
In this work-in-progress paper, we propose to integrate first-class concepts into the programming environment. We assign meaning to names by labeling each one with a color corresponding to the metaphor or analogy from which the name was derived. We hypothesize that (1) aggregating labels upwards along the module hierarchy helps programmers understand how concepts are distributed across the program, (2) collecting names belonging to a specific concept helps programmers recognize which metaphor has been chosen, and (3) presenting relations between concepts can summarize complex interactions between program parts. We argue that continuous feedback and awareness of how names are grouped into concepts and where they are located can help prevent modularity issues and ease program comprehension.
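The aggregation of labels along the module hierarchy can be illustrated with a minimal sketch. The `Module` class, the `aggregate` function, and the example concept labels are hypothetical names chosen for illustration, and modeling a concept label as a plain string is an assumption; the abstract does not prescribe a data model.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Module:
    """Hypothetical module node: one concept label per name defined in it."""
    name: str
    labels: List[str] = field(default_factory=list)
    children: List["Module"] = field(default_factory=list)


def aggregate(module: Module, prefix: str = "",
              out: Optional[Dict[str, Counter]] = None) -> Dict[str, Counter]:
    """Map every module path to the label counts of its whole subtree."""
    if out is None:
        out = {}
    path = f"{prefix}/{module.name}" if prefix else module.name
    total = Counter(module.labels)          # labels defined directly here
    for child in module.children:
        aggregate(child, path, out)         # fill in the child's subtree first
        total += out[f"{path}/{child.name}"]
    out[path] = total
    return out


# Assumed example data: three concepts spread over two submodules.
tree = Module("app", ["stream"], [
    Module("io", ["stream", "stream", "tree"]),
    Module("ui", ["tree", "desktop"]),
])
summary = aggregate(tree)
print(summary["app"].most_common())
```

In this sketch, a concept that dominates one subtree but also appears in distant modules would show up as scattered counts, which is the kind of misalignment the abstract associates with modularity issues.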
As a first step towards an implementation, we define criteria that help to detect names belonging to the same concept. We then investigate how techniques from natural language processing can be reused and modified to compute an initial concept allocation with respect to these criteria. Finally, we show design sketches of how we plan to arrange and present concepts to programmers through tools, and what kind of information these tools can provide to help programmers make informed implementation decisions.
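Natural language processing techniques such as topic models typically operate on word counts per document. One plausible preprocessing step, sketched below, splits identifiers into word tokens and builds one bag of words per module. This is an assumption for illustration only: the abstract does not state the paper's criteria or preprocessing, and `split_identifier` and `bag_of_words` are hypothetical helper names.

```python
import re
from collections import Counter
from typing import Dict, List

# Matches acronyms (HTTP), capitalized or lowercase words (Server, read), and digit runs.
_TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_identifier(name: str) -> List[str]:
    """Split a camelCase or snake_case identifier into lowercase word tokens."""
    return [part.lower() for part in _TOKEN.findall(name)]


def bag_of_words(names_by_module: Dict[str, List[str]]) -> Dict[str, Counter]:
    """Treat each module as a document and count the tokens of its names."""
    return {
        module: Counter(tok for name in names for tok in split_identifier(name))
        for module, names in names_by_module.items()
    }


# Assumed example input: names as they might appear in two modules.
bags = bag_of_words({
    "io": ["readInputStream", "stream_buffer"],
    "net": ["HTTPServer"],
})
print(split_identifier("readInputStream"))  # ['read', 'input', 'stream']
print(bags["io"]["stream"])                 # 2
```

The resulting per-module counts could serve as input to a topic model, whose topics would then be candidates for an initial concept allocation that programmers can review and correct.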


Cited By

  • (2019) Faster feedback through lexical test prioritization. Companion Proceedings of the 3rd International Conference on the Art, Science, and Engineering of Programming, 1-10. DOI: 10.1145/3328433.3328455. Online publication date: 1-Apr-2019


Published In

PX/17.2: Proceedings of the 3rd ACM SIGPLAN International Workshop on Programming Experience
October 2017
45 pages
ISBN:9781450355223
DOI:10.1145/3176645
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. analogies
  2. first-class concepts
  3. integrated development environments
  4. program comprehension
  5. topic models

Qualifiers

  • Research-article

Conference

SPLASH '17

