Authors: Rakia Saidi (1) and Fethi Jarray (1, 2)

Affiliations: (1) LIMTIC Laboratory, UTM University, Tunis, Tunisia; (2) Higher Institute of Computer Science of Medenine, Gabes University, Medenine, Tunisia

Keyword(s): Clustering, Word Embedding, Word Sense Induction, NLP, BERT, Arabic Language.

Abstract: Word sense induction (WSI) is a fundamental task in natural language processing (NLP) that consists of discovering the sense associated with each instance of a given ambiguous target word. In this paper, we propose a two-stage approach to Arabic WSI. In the first stage, we encode the input sentence into contextual representations using a Transformer-based encoder such as BERT or DistilBERT. In the second stage, we cluster the embedded corpus obtained in the first stage using K-Means and Hierarchical Agglomerative Clustering (HAC). We evaluate the proposed method on the Arabic WSI summarization task. Experimental results show that our model achieves a new state of the art on both the Open Source Arabic Corpus (OSAC) (Saad and Ashour, 2010) and SemEval Arabic (2017).
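
The second stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the vectors below are made-up placeholders standing in for the contextual embeddings that a BERT or DistilBERT encoder would produce for five occurrences of one ambiguous word, and the small `kmeans` helper is a hypothetical, dependency-free stand-in for a library K-Means. It only shows how clustering the embeddings groups occurrences into induced senses.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups and return one cluster label per vector."""
    # Deterministic farthest-first initialisation (an assumption made here for
    # reproducibility): start from the first vector, then repeatedly add the
    # vector that lies farthest from the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors, key=lambda v: min(math.dist(v, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Placeholder 3-dimensional "embeddings" (hypothetical values, not model output).
# Each row represents one occurrence of the same ambiguous target word.
context_embeddings = [
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.10],
    [0.10, 0.90, 0.80],
    [0.00, 0.80, 0.90],
    [0.85, 0.15, 0.05],
]

# Occurrences that share a label are treated as instances of the same sense.
sense_labels = kmeans(context_embeddings, k=2)
print(sense_labels)
```

In a real pipeline the placeholder rows would be replaced by encoder outputs, and the number of clusters `k` would have to be chosen or estimated per target word; HAC could be substituted for K-Means at the same point.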

CC BY-NC-ND 4.0

Paper citation in several formats:
Saidi, R. and Jarray, F. (2023). Sentence Transformers and DistilBERT for Arabic Word Sense Induction. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-623-1; ISSN 2184-433X, SciTePress, pages 1020-1027. DOI: 10.5220/0011891700003393

@conference{icaart23,
author={Rakia Saidi and Fethi Jarray},
title={Sentence Transformers and DistilBERT for Arabic Word Sense Induction},
booktitle={Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2023},
pages={1020-1027},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011891700003393},
isbn={978-989-758-623-1},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Sentence Transformers and DistilBERT for Arabic Word Sense Induction
SN - 978-989-758-623-1
IS - 2184-433X
AU - Saidi, R.
AU - Jarray, F.
PY - 2023
SP - 1020
EP - 1027
DO - 10.5220/0011891700003393
PB - SciTePress