CodeCatch: extracting source code snippets from online sources

Research article
DOI: 10.1145/3194104.3194107
Published: 28 May 2018

ABSTRACT

Developers increasingly rely on online sources to find example snippets that address the programming problems they are trying to solve. However, contemporary API usage mining methods are not suitable for locating easily reusable snippets: they provide usage examples for specific APIs, and thus require the developer to know beforehand which library to use. The approaches that do retrieve snippets from online sources usually output a flat list of examples, without helping the developer distinguish among different implementations and without offering any insight into the quality and reusability of the proposed snippets. In this work, we present CodeCatch, a system that receives queries in natural language and extracts snippets from multiple online sources. The snippets are assessed both for their quality and for their usefulness to, and preference by, developers, and they are clustered according to their API calls so that the developer can select among the different implementations. A preliminary evaluation of CodeCatch on a set of indicative programming problems suggests that it can be a useful tool for the developer.
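
To make the idea of clustering snippets by their API calls more concrete, the following is a minimal Python sketch. It is not CodeCatch's actual implementation, which the abstract does not describe. The regex-based call extraction, the Jaccard similarity measure, the greedy grouping, the 0.5 threshold, and the names `api_calls`, `jaccard`, and `cluster_by_api_calls` are all assumptions made for illustration.

```python
import re

# Assumption: "API calls" are approximated as constructor names (new Foo(...))
# and method names (.bar(...)) found with a regex. A real system would more
# likely parse the snippet; generic constructors such as new ArrayList<String>()
# are not matched by this simple pattern.
CALL_PATTERN = re.compile(r"new\s+([A-Z]\w*)\s*\(|\.(\w+)\s*\(")


def api_calls(snippet: str) -> frozenset:
    """Return the set of constructor and method names used in a snippet."""
    return frozenset(ctor or method for ctor, method in CALL_PATTERN.findall(snippet))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_by_api_calls(snippets: list, threshold: float = 0.5) -> list:
    """Greedily group snippet indices whose API call sets are similar.

    A snippet joins the first cluster containing a member whose similarity
    to it reaches the (hypothetical) threshold; otherwise it starts a new one.
    """
    call_sets = [api_calls(s) for s in snippets]
    clusters = []
    for i, calls in enumerate(call_sets):
        for cluster in clusters:
            if any(jaccard(calls, call_sets[j]) >= threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


SNIPPETS = [
    "BufferedReader br = new BufferedReader(new FileReader(path)); "
    "String line = br.readLine(); br.close();",
    "BufferedReader reader = new BufferedReader(new FileReader(file)); "
    "while (reader.readLine() != null) {} reader.close();",
    "List<String> lines = Files.readAllLines(Paths.get(path));",
]

if __name__ == "__main__":
    print(cluster_by_api_calls(SNIPPETS))
```

In this sketch the two `BufferedReader`-based snippets land in one group and the `Files.readAllLines` snippet in another, which mirrors the stated goal of letting the developer choose among different implementations of the same task.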

Published in

RAISE '18: Proceedings of the 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering
May 2018
67 pages
ISBN: 9781450357234
DOI: 10.1145/3194104

Copyright © 2018 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
