ABSTRACT
As modern software systems grow in complexity and heterogeneity, static code analysis of these systems becomes more difficult. Because static analysis tools are language-specific, several of them are typically required to fully analyze a heterogeneous system, so current tools are no longer well suited to analyzing modern systems. This paper presents an approach that removes the need for multiple language-specific tools by converting a system's source code into an intermediate representation known as a Language-Agnostic Abstract-Syntax Tree. This abstraction provides a common interface on which static analysis tools can operate, regardless of the languages a given system is written in. The methodology for creating such an abstraction is presented, along with an evaluation on two microservice system testbeds, DeathStarBench and TrainTicket, written in C++ and Java, respectively. By representing source code at a higher level of abstraction, we are better prepared to perform static code analysis of modern software systems.
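The idea of a common interface can be illustrated with a minimal sketch. It is not the paper's implementation: the `LaastNode` type, its fields, the node kinds, and the example class and method names are all hypothetical, and the trees are built by hand instead of being produced by real Java or C++ parsers. The point is only that one analysis function can run over subtrees that originate from different languages.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class LaastNode:
    """One node of a hypothetical language-agnostic AST (illustrative only)."""
    kind: str                  # language-neutral category: "system", "class", "method", "call"
    name: str = ""
    source_language: str = ""  # metadata only; the analyses below never branch on it
    children: List["LaastNode"] = field(default_factory=list)

    def walk(self) -> Iterator["LaastNode"]:
        """Yield this node and all of its descendants in pre-order."""
        yield self
        for child in self.children:
            yield from child.walk()


def count_kind(root: LaastNode, kind: str) -> int:
    """Count nodes of a given kind, independent of the source language."""
    return sum(1 for node in root.walk() if node.kind == kind)


def called_names(root: LaastNode) -> List[str]:
    """Collect the distinct names of all call nodes, sorted."""
    return sorted({node.name for node in root.walk() if node.kind == "call"})


# Hand-built stand-ins for what language-specific front ends might emit.
java_service = LaastNode("class", "OrderService", "java", [
    LaastNode("method", "createOrder", "java", [LaastNode("call", "save", "java")]),
    LaastNode("method", "cancelOrder", "java"),
])
cpp_service = LaastNode("class", "UserHandler", "cpp", [
    LaastNode("method", "Login", "cpp", [LaastNode("call", "Verify", "cpp")]),
])
system = LaastNode("system", "example", "", [java_service, cpp_service])

method_count = count_kind(system, "method")
calls = called_names(system)
```

In this sketch the source language is carried only as metadata, so an analysis such as `count_kind` needs no per-language variant; a real conversion step would have to map each language's constructs onto the shared node kinds.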
Index Terms
- On language-agnostic abstract-syntax trees: student research abstract