Abstract
The Transformer model has inspired state-of-the-art generative NLP models such as Bidirectional Encoder Representations from Transformers (BERT), the Generative Pre-trained Transformer (GPT), and their variants. Although pre-training produces effective language models, training them from scratch for every NLP task is computationally expensive. Fine-tuning therefore plays an essential role in improving performance while reducing computational cost. In this study we fine-tune a pre-trained model from the Hugging Face platform, BERT-base-uncased, a variant of BERT, on a purpose-built dataset for high school advising. The data were collected from high school and university websites as well as from educational experts, and consist of enquiries and answers that guide high school students toward their future. The model takes a context-question pair as input and outputs the start and end positions of the answer within the context. The collected dataset was converted into a JSON file, and PyTorch libraries were used to build both the training and inference pipelines. ROUGE metrics show that the model achieves good performance in answering students' questions.
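To make the inference setup concrete, the following is a minimal sketch, not the authors' exact pipeline, of extractive question answering with BERT-base-uncased using the Hugging Face Transformers and PyTorch libraries. The checkpoint path "./advising-bert" and the example question and context are hypothetical placeholders; in the study, the model would be the one fine-tuned on the advising dataset.

# Minimal sketch of extractive QA with bert-base-uncased (assumed setup).
import torch
from transformers import BertTokenizerFast, BertForQuestionAnswering

# "./advising-bert" is a hypothetical path to a fine-tuned checkpoint;
# the plain "bert-base-uncased" weights are used here for illustration.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")
model.eval()

question = "What courses should I take to prepare for computer science?"
context = ("Students planning to major in computer science are advised to "
           "complete advanced mathematics and an introductory programming "
           "course before graduation.")

# The question and context are encoded together as a single paired input.
inputs = tokenizer(question, context, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# The model predicts start and end positions of the answer span in the context.
start = torch.argmax(outputs.start_logits)
end = torch.argmax(outputs.end_logits) + 1
answer = tokenizer.decode(inputs["input_ids"][0][start:end])
print(answer)

The same paired encoding is used during fine-tuning, where the gold start and end token positions from the JSON dataset serve as labels for the span-prediction loss.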