DOI: 10.1145/3540250.3549124
research-article

API recommendation for machine learning libraries: how far are we?

Published: 09 November 2022

ABSTRACT

Application Programming Interfaces (APIs) are designed to help developers build software more effectively. Recommending the right APIs for specific tasks is gaining increasing attention among researchers and developers. However, most existing approaches have been evaluated mainly on general programming tasks in statically typed programming languages such as Java. Little is known about their practical effectiveness and usefulness for machine learning (ML) programming tasks in dynamically typed programming languages such as Python, whose paradigms are fundamentally different from those of general programming tasks. Closing this gap is of great value considering the increasing popularity of ML and the large number of new questions appearing on question answering websites. In this work, we set out to investigate the effectiveness of existing API recommendation approaches for Python-based ML programming tasks from Stack Overflow (SO). Specifically, we conducted an empirical study of six widely used Python-based ML libraries using two state-of-the-art API recommendation approaches, i.e., BIKER and DeepAPI. We found that the existing approaches perform poorly for two main reasons: (1) Python-based ML tasks often require significantly long API sequences; and (2) there are common API usage patterns in Python-based ML programming tasks that existing approaches cannot handle. Inspired by our findings, we proposed FIMAX, a simple but effective approach based on frequent itemset mining that enhances existing API recommendation approaches for Python-based ML programming tasks by leveraging the common API usage information from SO questions. Our evaluation shows that FIMAX improves existing state-of-the-art API recommendation approaches by up to 54.3% and 57.4% in MRR and MAP, respectively. Our user study with 14 developers further demonstrates the practicality of FIMAX for API recommendation.
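
To make the idea of frequent itemset mining over API usage concrete, the following is a minimal sketch, not the FIMAX implementation, whose internals the abstract does not describe. It assumes each SO post has already been reduced to the set of APIs it uses. The function name `frequent_itemsets`, the `min_support` and `max_size` parameters, and the toy posts are all hypothetical. It counts itemsets by brute-force enumeration rather than with an optimized algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for every itemset of up to max_size APIs
    that occurs in at least min_support transactions.

    Brute-force enumeration for illustration only; this is an assumed
    stand-in, not the mining procedure used by FIMAX.
    """
    counts = Counter()
    for apis in transactions:
        unique = sorted(set(apis))  # ignore repeated APIs within one post
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy data: the APIs used in four imaginary SO posts.
posts = [
    ["pandas.read_csv", "pandas.DataFrame.dropna",
     "sklearn.model_selection.train_test_split"],
    ["pandas.read_csv", "sklearn.model_selection.train_test_split",
     "sklearn.linear_model.LogisticRegression.fit"],
    ["pandas.read_csv", "pandas.DataFrame.dropna"],
    ["numpy.array", "numpy.reshape"],
]

frequent = frequent_itemsets(posts, min_support=2)

# Print the frequent itemsets, most frequent first.
for itemset, count in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(count, sorted(itemset))
```

A recommender could, under these assumptions, use such co-occurring sets to extend a short recommended list with APIs that commonly appear alongside those already suggested.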


              Published in

              ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
              November 2022
              1822 pages
              ISBN: 9781450394130
              DOI: 10.1145/3540250

              Copyright © 2022 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

              Publisher

              Association for Computing Machinery

              New York, NY, United States




              Acceptance Rates

              Overall Acceptance Rate: 112 of 543 submissions, 21%
