XCoS: Explainable Code Search Based on Query Scoping and Knowledge Graph

Authors:
Chong Wang, Fudan University (ORCID: 0000-0003-1424-6290)
Xin Peng, Fudan University (ORCID: 0000-0003-3376-2581)
Zhenchang Xing, CSIRO’s Data61 & Australian National University (ORCID: 0000-0001-7663-1421)
Yue Zhang, Fudan University (ORCID: 0009-0001-9261-5699)
Mingwei Liu, Fudan University (ORCID: 0000-0002-3462-997X)
Rong Luo, Fudan University (ORCID: 0009-0004-1764-0693)
Xiujie Meng, Fudan University (ORCID: 0009-0009-8778-7400)

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 6, Article No.: 140, pp 1–28
https://doi.org/10.1145/3593800

Published: 29 September 2023

Abstract

When searching code, developers may express additional constraints (e.g., functional constraints and nonfunctional constraints) on the implementations of desired functionalities in their queries. Existing code search tools treat each query as a whole and ignore the different implications of its different parts. Moreover, these tools usually return a ranked list of candidate code snippets without any explanation. As a result, developers often find it hard to choose the desired results and to build confidence in them. In this article, we conduct a developer survey to better understand these issues and derive insights from the survey results. Based on these insights, we propose XCoS, an explainable code search approach based on query scoping and a knowledge graph. XCoS extracts a background knowledge graph from general knowledge bases such as Wikidata and Wikipedia. Given a code search query, XCoS identifies its different parts (i.e., functionalities, functional constraints, and nonfunctional constraints) and uses the expressions of functionalities and functional constraints to search the codebase. It then links both the query and the candidate code snippets to concepts in the background knowledge graph and generates explanations based on the association paths between these two sets of concepts, together with relevant descriptions. XCoS provides an interactive user interface that allows the user to better understand the associations between candidate code snippets and the query from different aspects and to choose the desired results. Our evaluation shows that the quality of the extracted background knowledge and of the concept links in the codebase is generally high. Furthermore, the generated explanations are considered complete, concise, and readable, and the approach can help developers find the desired code snippets more accurately and confidently.
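The pipeline described in the abstract (scope the query, link query and snippet to concepts, explain via association paths) can be illustrated with a toy sketch. This is not the XCoS implementation: the triples, the cue words, the `using` split rule, the substring-based linking, and every function name below are hypothetical stand-ins chosen only to keep the example self-contained.

```python
from collections import deque

# Hypothetical toy knowledge graph; XCoS extracts its graph from
# Wikidata and Wikipedia. Each triple is (head, relation, tail).
TRIPLES = [
    ("quicksort", "instance of", "sorting algorithm"),
    ("merge sort", "instance of", "sorting algorithm"),
    ("sorting algorithm", "subclass of", "algorithm"),
]

# Hypothetical cue words marking nonfunctional constraints.
NONFUNCTIONAL_CUES = {"efficiently", "fast", "thread-safe"}


def build_graph(triples):
    """Build an adjacency list with forward and inverse edges."""
    graph = {}
    for head, rel, tail in triples:
        graph.setdefault(head, []).append((rel, tail))
        graph.setdefault(tail, []).append((f"inverse of {rel}", head))
    return graph


def scope_query(query):
    """Naive query scoping: cue words and the word 'using' split the query."""
    words = query.lower().split()
    nonfunctional = [w for w in words if w in NONFUNCTIONAL_CUES]
    rest = [w for w in words if w not in NONFUNCTIONAL_CUES]
    if "using" in rest:
        i = rest.index("using")
        functionality, constraint = rest[:i], rest[i + 1:]
    else:
        functionality, constraint = rest, []
    return {
        "functionality": " ".join(functionality),
        "functional_constraints": " ".join(constraint),
        "nonfunctional_constraints": " ".join(nonfunctional),
    }


def link_concepts(text, graph):
    """Naive concept linking: graph concepts that appear verbatim in text."""
    lowered = text.lower()
    return sorted(c for c in graph if c in lowered)


def find_path(graph, start, goal):
    """Breadth-first search for a shortest association path, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


def explain(query, snippet_description, graph):
    """Render one explanation per connected (query, snippet) concept pair."""
    parts = scope_query(query)
    search_text = f"{parts['functionality']} {parts['functional_constraints']}"
    explanations = []
    for q_concept in link_concepts(search_text, graph):
        for s_concept in link_concepts(snippet_description, graph):
            path = find_path(graph, q_concept, s_concept)
            if path is None:
                continue
            if not path:
                explanations.append(f"{q_concept} (mentioned by both)")
            else:
                explanations.append(
                    "; ".join(f"{h} --{r}--> {t}" for h, r, t in path)
                )
    return explanations


GRAPH = build_graph(TRIPLES)
```

Calling `explain("sort a list using merge sort efficiently", "Quicksort implementation for integer arrays", GRAPH)` searches with the functionality and functional-constraint parts only, mirroring the abstract, and reports the path that connects the query concept to the snippet concept through their shared parent.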

Index Terms

1. Software and its engineering
  1. Software creation and management
    1. Software development techniques

Published in
ACM Transactions on Software Engineering and Methodology Volume 32, Issue 6
November 2023
949 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/3625557
Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 September 2023
- Online AM: 22 April 2023
- Accepted: 20 March 2023
- Revised: 13 February 2023
- Received: 20 May 2022
Author Tags
Code search
explainability
knowledge
concept
Qualifiers
- research-article