ABSTRACT
Partial code usually involves non-fully-qualified type names (non-FQNs) and undeclared receiving objects. Resolving the FQNs of these non-FQN types and undeclared receiving objects (referred to as type inference) is a prerequisite for effective search and reuse of partial code. Existing dictionary-lookup methods build a symbolic knowledge base of API names and code contexts; they incur significant compilation overhead and are sensitive to unseen API names and code-context variations. In this paper, we formulate type inference as a cloze-style fill-in-blank language task. Building on source code naturalness, our approach fine-tunes a code masked language model (MLM) as a neural knowledge base of code elements with a novel "pre-train, prompt and predict" paradigm from raw source code. Our approach is lightweight and imposes minimal requirements on code compilation. Unlike existing symbolic name and context matching for type inference, our prompt-tuned code MLM packs FQN syntax and usage in its parameters and supports fuzzy neural type inference. We systematically evaluate our approach on a large corpus of source code from GitHub and Stack Overflow. Our results confirm the effectiveness of our approach design and its practicality for partial-code type inference. As the first of its kind, our neural type inference method opens the door to many innovative ways of using partial code.
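To make the cloze-style formulation concrete, the sketch below shows one plausible way a non-FQN type in a partial snippet could be rewritten as a fill-in-blank prompt for a code MLM. The prompt template and helper function are illustrative assumptions, not the paper's exact format:

```python
# Hypothetical sketch of the cloze-style ("fill-in-blank") formulation from
# the abstract: a simple (non-fully-qualified) type name in partial code is
# replaced by a masked FQN slot, so a code MLM can predict the missing
# package qualifier. The template here is an assumption for illustration.

def build_cloze_prompt(partial_code: str, simple_name: str,
                       mask_token: str = "<mask>") -> str:
    """Rewrite a simple type name as '<mask>.SimpleName' to query an MLM."""
    # e.g. "StringUtils" -> "<mask>.StringUtils": the model is asked to fill
    # in the package prefix (such as "org.apache.commons.lang3").
    return partial_code.replace(simple_name, f"{mask_token}.{simple_name}", 1)

snippet = "String s = StringUtils.capitalize(name);"
prompt = build_cloze_prompt(snippet, "StringUtils")
print(prompt)
# -> String s = <mask>.StringUtils.capitalize(name);
```

In the actual approach, a prompt like this would be fed to the fine-tuned code MLM, whose predicted tokens for the masked slot yield candidate FQN prefixes.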
Index Terms
- Prompt-tuned Code Language Model as a Neural Knowledge Base for Type Inference in Statically-Typed Partial Code