
Efficiently Transferring Pre-trained Language Model RoBERTa Base English to Hindi Using WECHSEL



Abstract:

A crucial element of Natural Language Processing (NLP) is enabling computers to comprehend and process human language, and language models (LMs) have taken over the discipline in recent years. LMs are pre-trained models that can be customized and have significantly improved performance on a variety of challenging natural language tasks. Bidirectional Encoder Representations from Transformers (BERT), used in both English and other languages, is one of the most well-known LMs. Large pre-trained LMs require enormous computational resources to train on English text, which makes training these models in other languages difficult. This paper addresses this problem on a Hindi dataset using an approach termed WECHSEL. We apply the WECHSEL method to the RoBERTa model and assess its effectiveness: the source model's English tokenizer is replaced with a tokenizer in the target language, Hindi. The results for Hindi demonstrate that WECHSEL outperforms models of comparable size trained from scratch while requiring up to 64 times less training effort. WECHSEL RoBERTa-based Hindi was fine-tuned on a named-entity recognition (NER) task using SemEval-2022 datasets and achieved an accuracy of 73.45%, higher than the rest of the Hindi-language BERT-based models.
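The tokenizer replacement described above hinges on how the new Hindi subword embeddings are initialized: in WECHSEL, each target-language token's embedding is built as a similarity-weighted combination of source-token embeddings, with similarities taken from aligned cross-lingual static word embeddings. The snippet below is a minimal, self-contained sketch of that initialization idea using toy NumPy arrays; the function name, the precomputed `similarity` matrix, and the top-k/softmax weighting are illustrative assumptions, not the paper's exact procedure or the `wechsel` package API.

```python
import numpy as np

def init_target_embeddings(source_emb, similarity, k=2, temperature=0.1):
    """Sketch of WECHSEL-style embedding initialization.

    Each target-token embedding is a softmax-weighted average of the
    embeddings of its k most similar source tokens.

    source_emb : (n_source, dim) input embeddings of the source model
    similarity : (n_target, n_source) similarities between target and
                 source tokens in an aligned static embedding space
                 (assumed to be precomputed)
    """
    n_target = similarity.shape[0]
    target_emb = np.zeros((n_target, source_emb.shape[1]))
    for t in range(n_target):
        # indices of the k most similar source tokens for target token t
        top = np.argsort(similarity[t])[-k:]
        # softmax over their similarity scores
        w = np.exp(similarity[t, top] / temperature)
        w /= w.sum()
        # weighted average of the corresponding source embeddings
        target_emb[t] = w @ source_emb[top]
    return target_emb
```

With embeddings initialized this way, the transformer body is kept as-is and training continues in the target language, which is what allows convergence with far less compute than training from scratch.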
Date of Conference: 04-06 December 2023
Date Added to IEEE Xplore: 02 April 2024

Conference Location: Delhi, India

