Abstract
Searching application programming interfaces (APIs) is very important for developers to reuse software projects. Existing natural language based API search mainly faces the following challenges. 1) More accurate results are required as software projects evolve to be more heterogeneous and complex. 2) The semantic relationships between APIs (e.g., inheritances between classes, and invocations between methods) need to be illustrated so that developers can better understand their usage scenarios. To deal with these issues, we propose GeAPI, a novel graph embedding based approach for API graph search and recommendation in this paper. First, we build a software project’s API graph automatically from its source code and represent each API using graph embedding methods. Second, we search the API graph with a question in natural language, and return the corresponding subgraph that is composed of relevant code elements and their associated relationships, as the best answer of the question. In experiments, we select three well-known open source projects, JodaTime, Apache Lucene and POI, as examples to perform API search tasks. The experimental results show that our approach GeAPI improves F1-score by 10% compared with the existing shortest path based API search approach, while reduces the average response time about 60 times.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Moreno L, Bavota G, di Penta M et al. How can I use this method? In Proc. the 37th International Conference on Software Engineering, May 2015, pp.880-890.
Sirres R, Bissyandé T F, Kim D, Lo D, Klein J, Kim K, Traon L Y. Augmenting and structuring user queries to support efficient free-form code search. Empirical Softwarengineering, 2018, 23(5): 2622-2654.
Stylos J, Myers B A. Mica: A web-search tool for finding API components and examples. In Proc. the 2006 IEEE Symposium on Visual Languages and Human-Centric Computing, Sept. 2006, pp.195-202.
Linstead E, Bajracharya S, Ngo T, Rigor P, Lopes C, Baldi P. Sourcerer: Mining and searching internet-scale software repositories. Data Mining and Knowledge Discovery, 2009, 18(2): 300-336.
Baeza-Yates R, Ribeiro-Neto B. Modern Information Retrieval — The Concepts and Technology Behind Search (2nd edition). Addison-Wesley Professional, 2011.
Lv F, Zhang H, Lou J,Wang S, Zhang D, Zhao J. CodeHow: Effective code search based on API understanding and extended Boolean model (E). In Proc. the 30th IEEE/ACM International Conference on Automated Software Engineering, Nov. 2015, pp.260-270.
Hill E, Pollock L, Vijay-Shanker K. Improving source code search with natural language phrasal representations of method signatures. In Proc. the 26th IEEE/ACM International Conference on Automated Software Engineering, Nov. 2011, pp.524-527.
Rahman M M, Roy C K. QUICKAR: Automatic query reformulation for concept location using crowdsourced knowledge. In Proc. the 31st IEEE/ACM International Conference on Automated Software Engineering, Aug. 2016, pp.220-225.
McMillan C, Grechanik M, Poshyvanyk D, Xie Q, Fu C. Portfolio: Finding relevant functions and their usage. In Proc. the 33rd International Conference on Software Engineering, May 2011, pp.111-120.
Chan W K, Cheng H, Lo D. Searching connected API subgraph via text phrases. In Proc. the 20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Nov. 2012, Article No. 10.
Goyal P, Ferrara E. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 2018, 151: 78-94.
Belkin M, Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Proc. the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, Dec. 2002, pp.585-591.
Ahmed A, Shervashidze N, Narayanamurthy S, Josifovski V, Smola A J. Distributed large-scale natural graph factorization. In Proc. the 22nd International World Wide Web Conference, May 2013, pp.37-48.
Perozzi B, Al-Rfou R, Skiena S. DeepWalk: Online learning of social representations. In Proc. the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2014, pp.701-710.
Wang D, Cui P, Zhu W. Structural deep network embedding. In Proc. the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2016, pp.1225-1234.
Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. arXiv:1609.02907, 2016. https://arXiv.org/abs/1609.02907, July 2019.
Bordes A, Usunier N, Garcia-Durán A, Weston J, Yakhnenko O. Translating embeddings for modeling multirelational data. In Proc. the 27th Annual Conference on Neural Information Processing Systems, Dec. 2013, pp.2787-2795.
Lin Y, Liu Z, Sun M, Liu Y, Zhu X. Learning entity and relation embeddings for knowledge graph completion. In Proc. the 29th AAAI Conference on Artificial Intelligence, Jan. 2015, pp.2181-2187.
Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. LINE: Large-scale information network embedding. In Proc. the 24th International Conference on World Wide Web, May 2015, pp.1067-1077.
Cui P, Wang X, Pei J, Zhu W. A survey on network embedding. IEEE Transactions on Knowledge and Data Engineering, 2018, 31(5): 833-852.
Zou Y, Ling C, Lin Z, Xie B. Graph embedding based code search in software project. In Proc. the 10th Asia-Pacific Symposium on Internetware, Sept. 2018, Article No. 1.
Chatterjee S, Juvekar S, Sen K. SNIFF: A search engine for Java using free-form queries. In Proc. the 12th International Conference on Fundamental Approaches to Software Engineering, Mar. 2009, pp.385-400.
Tian Y, Lo D, Lawall J. Automated construction of a software specific word similarity database. In Proc. the 2014 IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering, Feb. 2014, pp.44-53.
Yang J, Tan L. Inferring semantically related words from software context. In Proc. the 9th IEEE Working Conference of Mining Software Repositories, June 2012, pp.161-170.
Sridhara G, Hill E, Pollock L, Vijay-Shanker K. Identifying word relations in software: A comparative study of semantic similarity tools. In Proc. the 16th IEEE International Conference on Program Comprehension, June 2008, pp.123-132.
Wang S, Lo D, Jiang L. Active code search: Incorporating user feedback to improve code search relevance. In Proc. the 29th IEEE/ACM International Conference on Automated Software Engineering, Sept. 2014, pp.677-682.
Haiduc S, Bavota G, Marcus A, Oliveto R, de Lucia A, Menzies T. Automatic query reformulations for text retrieval in software engineering. In Proc. the 35th International Conference on Software Engineering, May 2013, pp.842-851.
Jiang H, Nie L, Sun Z, Kong W, Zhang T, Luo X. ROSF: Leveraging information retrieval and supervised learning for recommending code snippets. IEEE Transactions on Services Computing, 2019, 12(1): 34-46.
Gu X, Zhang H, Zhang D, Kim S. Deep API learning. In Proc. the 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Nov. 2016, pp.631-642.
Richardson K, Kuhn J. Function assistant: A tool for NL querying of APIs. arXiv:1706.00468, 2017. https://arXiv.org/abs/-1706.00468, July 2019.
Nguyen T D, Nguyen A T, Phan H D, Nguyen T N. Exploring API embedding for API usages and applications. In Proc. the 39th IEEE/ACM International Conference on Software Engineering, May 2017, pp.438-449.
Huang Q, Xia X, Xing Z, Lo D, Wang X. API method recommendation without worrying about the Task-API knowledge gap. In Proc. the 33rd ACM/IEEE International Conference on Automated Software Engineering, Sept. 2018, pp.292-303.
Sillito J, Murphy G C, de Volder K. Asking and answering questions during a programming change task. IEEE Transactions on Software Engineering, 2008, 34(4): 434-451.
Li X, Wang Z, Wang Q, Yan S, Xie T, Mei H. Relationshipaware code search for JavaScript frameworks. In Proc. the 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, Nov. 2016, pp.690-701.
Fu K, Qian W Y, Peng X, Zhao W. Feature location method based on call chain analysis. Computer Science, 2014, 41(11): 36-39. (in Chinese)
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
ESM 1
(PDF 497 kb)
Rights and permissions
About this article
Cite this article
Ling, CY., Zou, YZ., Lin, ZQ. et al. Graph Embedding Based API Graph Search and Recommendation. J. Comput. Sci. Technol. 34, 993–1006 (2019). https://doi.org/10.1007/s11390-019-1956-2
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-019-1956-2