An Unsupervised Entity Resolution Framework for English and Arabic Datasets

Abdelkrim OUHAB, Mimoun MALKI, Djamel BERRABAH, Faouzi BOUFARES

Source Title: International Journal of Strategic Information Technology and Applications (IJSITA) 8(4)

ISSN: 1947-3095 | EISSN: 1947-3109 | EISBN13: 9781522514008 | DOI: 10.4018/IJSITA.2017100102

Cite Article

MLA

Abdelkrim OUHAB, et al. "An Unsupervised Entity Resolution Framework for English and Arabic Datasets." IJSITA, vol. 8, no. 4, 2017, pp. 16-29. http://doi.org/10.4018/IJSITA.2017100102

APA

Abdelkrim OUHAB, Mimoun MALKI, Djamel BERRABAH, & Faouzi BOUFARES. (2017). An Unsupervised Entity Resolution Framework for English and Arabic Datasets. International Journal of Strategic Information Technology and Applications (IJSITA), 8(4), 16-29. http://doi.org/10.4018/IJSITA.2017100102

Chicago

Abdelkrim OUHAB, et al. "An Unsupervised Entity Resolution Framework for English and Arabic Datasets," International Journal of Strategic Information Technology and Applications (IJSITA) 8, no. 4 (2017): 16-29. http://doi.org/10.4018/IJSITA.2017100102

Abstract

Entity resolution (ER) is an important step in data integration and in many data mining projects; its goal is to identify records that refer to the same real-world entity. Most existing ER frameworks focus on datasets in Latin-based languages and do not support the Arabic language. In this article, the authors present an unsupervised ER framework that supports both English and Arabic datasets. Rather than relying on matching rules developed by an expert or on manually labeled training examples, the proposed framework automatically generates its own training set. The generated training set is then used to train a classifier, and the learned classification model is used to perform ER. The proposed framework was implemented and tested on three Arabic datasets and four English datasets. Experimental results show that the proposed framework is competitive with supervised approaches and outperforms recently proposed unsupervised approaches in terms of F-measure.
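The three stages named in the abstract (generate a training set automatically, train a classifier, classify record pairs) can be illustrated with a minimal Python sketch. The abstract does not say how the framework builds its training set or which classifier it uses, so everything below is an assumption made for illustration: the per-field similarity from the standard library's difflib, the hypothetical thresholds high and low that select confident seed pairs, and the nearest-centroid classifier. It is not the authors' implementation.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_vector(a, b):
    """Per-field string similarity in [0, 1] for two records (tuples of strings)."""
    return [SequenceMatcher(None, x, y).ratio() for x, y in zip(a, b)]


def generate_training_set(vectors, high=0.9, low=0.3):
    """Auto-label only confidently similar (1) or dissimilar (0) pairs.

    The thresholds are hypothetical; pairs in between stay unlabeled.
    """
    train = []
    for vec in vectors:
        mean = sum(vec) / len(vec)
        if mean >= high:
            train.append((vec, 1))
        elif mean <= low:
            train.append((vec, 0))
    return train


def train_centroids(train):
    """Nearest-centroid classifier: one mean similarity vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [vec for vec, y in train if y == label]
        if not rows:
            raise ValueError(f"no automatically labeled examples for class {label}")
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((v - c) ** 2 for v, c in zip(vec, center))
    return min(centroids, key=lambda label: dist(centroids[label]))


def resolve(records):
    """Return index pairs (i, j) predicted to refer to the same entity."""
    pairs = list(combinations(range(len(records)), 2))
    vectors = [similarity_vector(records[i], records[j]) for i, j in pairs]
    centroids = train_centroids(generate_training_set(vectors))
    return [pair for pair, vec in zip(pairs, vectors) if predict(centroids, vec) == 1]


# Toy records (name, city); the data is invented for this example.
records = [
    ("John Smith", "London"),
    ("John Smith", "London"),
    ("Jon Smith", "London"),
    ("محمد علي", "القاهرة"),
    ("Maria Garcia", "Madrid"),
]
matches = resolve(records)
print(matches)
```

The sketch compares every pair of records, which is quadratic in the number of records; a practical system would typically restrict comparisons with a blocking step, and Arabic text would likely need language-specific normalization before comparison. Neither is shown here.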
