Document Alignment for Generation of English-Punjabi Comparable Corpora from Wikipedia

Vishal Goyal, Ajit Kumar, Manpreet Singh Lehal

Source Title: International Journal of E-Adoption (IJEA) 12(1)

ISSN: 1937-9633|EISSN: 1937-9641|EISBN13: 9781799805656|DOI: 10.4018/IJEA.2020010104

MLA

Goyal, Vishal, et al. "Document Alignment for Generation of English-Punjabi Comparable Corpora from Wikipedia." IJEA, vol. 12, no. 1, 2020, pp. 42-51. http://doi.org/10.4018/IJEA.2020010104

APA

Goyal, V., Kumar, A., & Lehal, M. S. (2020). Document Alignment for Generation of English-Punjabi Comparable Corpora from Wikipedia. International Journal of E-Adoption (IJEA), 12(1), 42-51. http://doi.org/10.4018/IJEA.2020010104

Chicago

Goyal, Vishal, Ajit Kumar, and Manpreet Singh Lehal. "Document Alignment for Generation of English-Punjabi Comparable Corpora from Wikipedia." International Journal of E-Adoption (IJEA) 12, no. 1 (2020): 42-51. http://doi.org/10.4018/IJEA.2020010104

Abstract

Comparable corpora serve as an alternative to parallel corpora for languages in which parallel corpora are scarce. Models trained on comparable corpora are less effective than those trained on parallel corpora; however, comparable corpora help compensate for the shortage of parallel data in machine translation. In this article, the authors explore Wikipedia as a potential source and delineate the process of aligning documents, which will subsequently be used to extract parallel data. The parallel data thus extracted will help enhance the performance of statistical machine translation.
