DOI: 10.1145/3510003.3510061
research-article
Artifacts Evaluated & Functional / v1.1

SnR: constraint-based type inference for incomplete Java code snippets

Published: 05 July 2022

ABSTRACT

Code snippets are prevalent on websites such as Stack Overflow and are effective in demonstrating API usage concisely. However, they are usually difficult to use directly: most snippets are not only syntactically incomplete but also lack dependency information, and thus do not compile. For example, Java snippets usually lack import statements and the names of required libraries; only 6.88% of Java snippets on Stack Overflow include the import statements necessary for compilation.

This paper proposes SnR, a precise, efficient, constraint-based technique that automatically infers the exact types used in code snippets and the libraries containing those types, so that the snippets can be compiled and therefore reused. SnR first builds a knowledge base of APIs, i.e., various facts about the available APIs, from a corpus of Java libraries. Given a code snippet with missing import statements, SnR automatically extracts typing constraints from the snippet, solves the constraints against the knowledge base, and returns a set of APIs that satisfies the constraints to be imported into the snippet.
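The pipeline described above (knowledge base, constraint extraction, constraint solving) can be illustrated with a toy sketch. This is not SnR's implementation: the knowledge base entries, the `Constraint` class, and the `solve` and `imports_for` helpers are all hypothetical names invented for illustration, and the brute-force lookup merely stands in for whatever solver SnR actually uses. Constraint extraction from source text is skipped; the constraints are written by hand.

```python
from dataclasses import dataclass

# Hypothetical knowledge base: fully qualified type name -> method names.
# The entries are illustrative, not facts extracted from a library corpus.
KNOWLEDGE_BASE = {
    "java.util.Date": {"getTime", "before", "after"},
    "java.sql.Date": {"getTime", "valueOf", "toLocalDate"},
    "org.joda.time.DateTime": {"getMillis", "withZone", "plusDays"},
    "java.util.List": {"add", "get", "size"},
    "java.awt.List": {"add", "getItem", "select"},
}


@dataclass(frozen=True)
class Constraint:
    """A simple type name used in a snippet and the methods called on it."""
    simple_name: str
    methods: frozenset


def simple_name_of(fqn):
    """Return the last dot-separated segment of a fully qualified name."""
    return fqn.rsplit(".", 1)[-1]


def solve(constraints, kb=KNOWLEDGE_BASE):
    """Map each simple name to the sorted fully qualified names that satisfy it.

    A candidate satisfies a constraint when its simple name matches and it
    declares every method the snippet calls on that type.
    """
    solution = {}
    for c in constraints:
        solution[c.simple_name] = sorted(
            fqn
            for fqn, methods in kb.items()
            if simple_name_of(fqn) == c.simple_name and c.methods <= methods
        )
    return solution


def imports_for(solution):
    """Emit an import line for every simple name with exactly one candidate."""
    return [
        f"import {candidates[0]};"
        for candidates in solution.values()
        if len(candidates) == 1
    ]


# Constraints as they might be extracted from a snippet that calls
# date.toLocalDate() and list.get(...) / list.size().
constraints = [
    Constraint("Date", frozenset({"toLocalDate"})),
    Constraint("List", frozenset({"get", "size"})),
]
solution = solve(constraints)
print(imports_for(solution))
```

In this sketch the method calls are what disambiguate `Date` and `List`: a constraint that only mentions `getTime` would leave both `java.sql.Date` and `java.util.Date` as candidates, and `imports_for` would emit nothing for it.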

We have evaluated SnR on a benchmark of 267 code snippets from Stack Overflow. SnR significantly outperforms the state-of-the-art tool Coster. SnR correctly infers 91.0% of the import statements, which makes 73.8% of the snippets compile; Coster correctly infers 36.0% of the import statements and makes 9.0% of the snippets compile.


Published in

ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN: 9781450392211
DOI: 10.1145/3510003

          Copyright © 2022 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 276 of 1,856 submissions, 15%
