Assessing Similarity-Based Grammar-Guided Genetic Programming Approaches for Program Synthesis

  • Conference paper
Optimization and Learning (OLA 2022)

Abstract

Grammar-Guided Genetic Programming (G3P) is widely recognised as one of the most successful approaches to program synthesis, i.e., the task of automatically discovering an executable piece of code given user intent. G3P has been shown capable of evolving programs in arbitrary languages that solve several program synthesis problems based only on a set of input/output examples. Despite this success, restricting the evolutionary system to the input/output error rate when assessing the programs it derives limits its scalability to larger and more complex program synthesis problems. With the growing number and size of open software repositories and the rise of generative artificial intelligence, there is a sizeable and growing number of approaches for retrieving or generating source code from textual problem descriptions. It is therefore time, now more than ever, to introduce G3P to other means of expressing user intent (particularly textual problem descriptions). In this paper, we assess the potential for G3P to evolve programs based on their similarity to particular target codes of interest (obtained using some code retrieval or generative approach). We assess four similarity measures from various fields: text processing (FuzzyWuzzy), natural language processing (cosine similarity based on term frequency), software clone detection (CCFinder), and plagiarism detection (SIM). Our experimental evaluation on a well-known program synthesis benchmark shows that G3P successfully evolves some of the desired programs with three of the four similarity measures. However, in its default configuration, G3P is not as successful at solving program synthesis problems with similarity measures as it is with the classical input/output error rate.
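
To make the idea of a similarity-based fitness concrete, the following is a minimal Python sketch of two of the kinds of measure named in the abstract: cosine similarity over term frequencies, and a plain string-matching ratio. It is an illustration under stated assumptions, not the authors' implementation. The tokenizer, the function names, and the choice of fitness as `1 - similarity` are all hypothetical, and the standard library's `difflib.SequenceMatcher` is used as a stand-in for a fuzzy string-matching ratio rather than FuzzyWuzzy itself.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(code: str) -> list:
    # Hypothetical tokenizer (an assumption): identifiers, integer literals,
    # and any other single non-whitespace character become separate tokens.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def cosine_tf_similarity(candidate: str, target: str) -> float:
    """Cosine similarity between the term-frequency vectors of two code strings."""
    tf_a = Counter(tokenize(candidate))
    tf_b = Counter(tokenize(target))
    # Only tokens present in both strings contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty program shares nothing with the target
    return dot / (norm_a * norm_b)


def string_similarity(candidate: str, target: str) -> float:
    # Character-level matching ratio in [0, 1] from the standard library,
    # used here only as a stand-in for a fuzzy string-matching score.
    return SequenceMatcher(None, candidate, target).ratio()


def similarity_fitness(candidate: str, target: str) -> float:
    # Assumed convention: fitness is minimised, so a candidate whose token
    # frequencies match the target exactly scores (approximately) 0.
    return 1.0 - cosine_tf_similarity(candidate, target)
```

Because a term-frequency vector ignores token order, `cosine_tf_similarity("a - b", "b - a")` is approximately 1 even though the two expressions compute different values. This illustrates why textual similarity to a target code is a different signal from the input/output error rate.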

Acknowledgement

Supported, in part, by Science Foundation Ireland grant 13/RC/2094_P2.

Author information

Corresponding author

Correspondence to Ning Tao.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tao, N., Ventresque, A., Saber, T. (2022). Assessing Similarity-Based Grammar-Guided Genetic Programming Approaches for Program Synthesis. In: Dorronsoro, B., Pavone, M., Nakib, A., Talbi, EG. (eds) Optimization and Learning. OLA 2022. Communications in Computer and Information Science, vol 1684. Springer, Cham. https://doi.org/10.1007/978-3-031-22039-5_19

  • DOI: https://doi.org/10.1007/978-3-031-22039-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22038-8

  • Online ISBN: 978-3-031-22039-5

  • eBook Packages: Computer Science, Computer Science (R0)
