Collaborative Analysis on Code Structure and Semantics

  • Conference paper
Computer Supported Cooperative Work and Social Computing (ChineseCSCW 2022)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1682))

Abstract

In this paper, we propose a collaborative method that analyzes both code structure and function semantics for code comparison. First, we create the function call graph of the code and use it to obtain the structure semantics with a graph auto-encoder. Then, the function semantics are obtained from the names and definitions of the library functions and built-in classes used in the code. Finally, we integrate the structure and function semantics to collaboratively analyze the similarity of codes. We validate our method on several real code datasets, and the experimental results show that it outperforms the baselines. The ablation experiments show that the function call structure contributes the most to the performance. We also visualize the semantics of function structures to illustrate that the proposed method can extract the correlations and differences between codes.
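
The pipeline outlined in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the graph auto-encoder is replaced by a plain degree-profile comparison of the call graph, the function semantics are reduced to the set of called names that the program does not define itself, and the two scores are combined with a fixed weight. The helper names (`call_graph`, `degree_profile`, `similarity`) and the `weight` parameter are hypothetical, chosen only for illustration.

```python
import ast
from collections import Counter


def call_graph(source):
    """Return (edges, external) for a Python source string.

    edges: (caller, callee) pairs between functions defined in the source.
    external: called names not defined in the source, used here as a rough
    stand-in for library functions and built-ins (an assumption).
    Calls made at module level, outside any function, are ignored.
    """
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    defined = {f.name for f in funcs}
    edges, external = set(), set()
    for f in funcs:
        for node in ast.walk(f):
            if not isinstance(node, ast.Call):
                continue
            target = node.func
            if isinstance(target, ast.Name):
                name = target.id
            elif isinstance(target, ast.Attribute):
                name = target.attr
            else:
                continue
            if name in defined:
                edges.add((f.name, name))
            else:
                external.add(name)
    return edges, external


def degree_profile(edges):
    """Multiset of (in-degree, out-degree) pairs; independent of function names."""
    in_deg, out_deg, nodes = Counter(), Counter(), set()
    for caller, callee in edges:
        out_deg[caller] += 1
        in_deg[callee] += 1
        nodes.update((caller, callee))
    return Counter((in_deg[n], out_deg[n]) for n in nodes)


def multiset_jaccard(a, b):
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0


def set_jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 1.0


def similarity(src_a, src_b, weight=0.5):
    """Weighted mix of a structure score and a function-name score."""
    edges_a, ext_a = call_graph(src_a)
    edges_b, ext_b = call_graph(src_b)
    structure = multiset_jaccard(degree_profile(edges_a), degree_profile(edges_b))
    function = set_jaccard(ext_a, ext_b)
    return weight * structure + (1 - weight) * function


# Two programs that differ only in user-chosen identifiers.
SRC_A = "def helper(xs):\n    return sorted(xs)\n\ndef main(xs):\n    return len(helper(xs))\n"
SRC_B = "def h2(vs):\n    return sorted(vs)\n\ndef run(vs):\n    return len(h2(vs))\n"
print(similarity(SRC_A, SRC_B))
```

Because both scores ignore user-chosen identifiers, the two example programs are treated as identical even though every function and variable was renamed. A learned graph embedding, as used in the paper, would capture far richer structure than a degree profile does.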

Notes

  1. https://code.google.com/codejam/past-contests.

  2. http://noi.openjudge.cn/.

Acknowledgments

This work was supported by the Major Project of NSF Shandong Province under Grant No. ZR2018ZB0420 and the Key Research and Development Program of Shandong Province under Grant No. 2019JZZY010107.

Author information

Corresponding author

Correspondence to Yuqing Sun.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ning, X., Wu, H., Wan, L., Gong, B., Sun, Y. (2023). Collaborative Analysis on Code Structure and Semantics. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1682. Springer, Singapore. https://doi.org/10.1007/978-981-99-2385-4_6

  • DOI: https://doi.org/10.1007/978-981-99-2385-4_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-2384-7

  • Online ISBN: 978-981-99-2385-4

  • eBook Packages: Computer Science, Computer Science (R0)
