Abstract
The Learning Assistant Manager and Builder (LAMB) is an open-source software framework that lets educators build and deploy AI learning assistants within institutional Learning Management Systems (LMS) without coding expertise. It addresses critical challenges in educational AI by providing privacy-focused integration, controlled knowledge bases, and seamless deployment through standard protocols. This paper presents major enhancements that enable systematic quality assurance and continuous improvement of these learning assistants.
The enhanced LAMB includes mechanisms for capturing structured feedback on real-world assistant behavior and transforming that feedback into a test suite of curated prompts with expected correct or incorrect responses. When changes are made, whether prompt engineering, retrieval-augmented generation (RAG) optimization, or knowledge-base expansion, this suite enables automated validation of their impact.
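The abstract does not spell out data structures, but the idea can be sketched. In the minimal Python sketch below, every name (TestCase, run_suite, the assistant and judge callables) is a hypothetical illustration, not LAMB's actual API: a feedback item is curated into a test case recording the prompt, an educator-approved reference answer, and whether the captured behavior is an example to match or to avoid, and the whole suite is re-run after each change.

```python
# Minimal sketch of a feedback-derived regression suite.
# All names (TestCase, run_suite, assistant, judge) are hypothetical
# illustrations of the concept, not LAMB's actual API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TestCase:
    prompt: str        # question curated from real assistant interactions
    reference: str     # educator-approved correct answer
    expect_pass: bool  # False marks a known-bad behavior the assistant must avoid

def run_suite(
    assistant: Callable[[str], str],
    judge: Callable[[str, str, str], bool],
    suite: list[TestCase],
) -> list[tuple[TestCase, str, bool]]:
    """Re-run every curated prompt after a change (prompt edit, RAG tuning,
    knowledge-base update) and report which cases still behave as expected."""
    results = []
    for case in suite:
        answer = assistant(case.prompt)
        acceptable = judge(case.prompt, answer, case.reference)
        results.append((case, answer, acceptable == case.expect_pass))
    ok = sum(1 for *_, passed in results if passed)
    print(f"{ok}/{len(results)} test cases behaved as expected")
    return results
```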
A key innovation is using frontier large language models (LLMs) to evaluate responses automatically, generating detailed reports that reveal improvement areas and confirm performance gains. This systematic feedback-driven testing fosters continuous refinement while preserving quality standards.
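As a hedged illustration of this LLM-as-judge step (the model name, rubric wording, and client library below are assumptions for the sketch, not details from the paper), a frontier model can grade each answer against the curated reference and return a PASS/FAIL verdict that feeds the evaluation report:

```python
# Sketch of automated LLM-based grading; model choice and rubric wording
# are illustrative assumptions, not the paper's implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge(prompt: str, answer: str, reference: str) -> bool:
    """Ask a frontier model whether an assistant answer is acceptable
    relative to the educator-curated reference; True means PASS."""
    rubric = (
        "You grade answers produced by an educational AI assistant.\n"
        f"Question: {prompt}\n"
        f"Reference answer: {reference}\n"
        f"Assistant answer: {answer}\n"
        "Reply with PASS if the assistant answer is consistent with the "
        "reference and pedagogically sound, otherwise FAIL, followed by a "
        "one-sentence justification."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder frontier model
        messages=[{"role": "user", "content": rubric}],
        temperature=0,
    )
    verdict = response.choices[0].message.content.strip()
    return verdict.startswith("PASS")
```

Collected across the whole suite, the one-sentence justifications are the raw material for the detailed reports the abstract describes.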
Validation studies show measurable improvements in reliability and consistency. Across varied educational contexts, the framework identifies edge cases, maintains consistency across iterations, and yields actionable insights. Automated testing is especially beneficial for assistants with extensive knowledge bases and complex interaction patterns.
This work advances educational AI by providing a robust methodology for quality assurance and ongoing improvement of learning assistants. Its structured feedback and automated evaluations ensure alignment with educational goals while refining assistants over time. The enhanced LAMB framework offers a scalable and reliable solution for educators aiming to integrate AI-driven support into their LMS environments.
Ethics declarations
Funding and Acknowledgements
The authors thank graduate student Joel Corredor for his contribution to the LAMB project in the implementation of the evaluations feature. This research is partially funded by the Ministry of Science and Innovation through the AvisSA project (reference PID2020-118345RB-I00), by the Department of Research and Universities of the Catalan Government through the 2021 SGR 01412 grant for research groups, and by the University of the Basque Country/Euskal Herriko Unibertsitatea under contract GIU21/037 as part of the “Call for Grants for Research Groups at the University of the Basque Country/Euskal Herriko Unibertsitatea (2021)”.
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Alier-Forment, M., Pereira-Valera, J., Casañ-Guerrero, M.J., García-Peñalvo, F.J. (2025). Enhancing Learning Assistant Quality Through Automated Feedback Analysis and Systematic Testing in the LAMB Framework. In: Smith, B.K., Borge, M. (eds) Learning and Collaboration Technologies. HCII 2025. Lecture Notes in Computer Science, vol 15807. Springer, Cham. https://doi.org/10.1007/978-3-031-93567-1_1
DOI: https://doi.org/10.1007/978-3-031-93567-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-93566-4
Online ISBN: 978-3-031-93567-1