Domain-specific readability measures to improve information retrieval in the Persian language

Sholeh Arastoopoor (Department of Information Science and Knowledge Studies, Ferdowsi University of Mashhad, Mashhad, Iran)

The Electronic Library

ISSN: 0264-0473

Article publication date: 14 May 2018

Issue publication date: 11 June 2018

Abstract

Purpose

How readable a text is depends on the capability of the reader. Information retrieval systems therefore risk returning documents that are relevant yet unreadable or hard to read for their users. This paper examines the potential use of concept-based readability measures, alongside classic measures, for re-ranking search results in information retrieval systems, specifically in the Persian language.

Design/methodology/approach

Flesch–Dayani, a classic readability measure, was applied together with document scope (DS) and document cohesion (DC), two domain-specific measures, to score documents retrieved from Google (181 documents) and the RICeST database (215 documents) in the field of computer science and information technology (IT). The re-ranked results were then compared with rankings of the documents' readability produced by potential users.

Findings

The results show that the subcategories of the computer science and IT field differ in readability and understandability. The study also shows that a hybrid score (DSDC) can be developed from the DS and DC measures. Of the four scores applied in re-ranking the documents, the list re-ranked by the DSDC score correlates with the participants' re-ranking in both groups.
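The comparison described above, re-ranking documents by a score and checking how well the result agrees with a human ranking, might be sketched as follows. This is an illustrative sketch only: the abstract does not give the formulas for DS, DC or the DSDC hybrid, so the weighted average in `hybrid_score`, the `weight` parameter, the assumption that a higher score means a more readable document, the use of Spearman's rank correlation, and all sample values are hypothetical placeholders rather than the paper's method or data.

```python
from typing import Dict, List


def hybrid_score(ds: float, dc: float, weight: float = 0.5) -> float:
    """Combine two scores with a weighted average.

    ASSUMPTION: the real DSDC formula is not stated in the abstract;
    a weighted average is used here purely for illustration.
    """
    return weight * ds + (1.0 - weight) * dc


def rerank(scores: Dict[str, float]) -> List[str]:
    """Return document ids ordered from highest to lowest score.

    ASSUMPTION: a higher score means a more readable document.
    """
    return sorted(scores, key=lambda doc: scores[doc], reverse=True)


def spearman_rho(order_a: List[str], order_b: List[str]) -> float:
    """Spearman rank correlation between two tie-free rankings of the same items."""
    if set(order_a) != set(order_b) or len(order_a) != len(order_b):
        raise ValueError("both rankings must contain exactly the same items")
    n = len(order_a)
    if n < 2:
        raise ValueError("need at least two items to correlate")
    position_b = {doc: i for i, doc in enumerate(order_b)}
    # Sum of squared rank differences across all documents.
    d_squared = sum((i - position_b[doc]) ** 2 for i, doc in enumerate(order_a))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


# Hypothetical per-document scores (not data from the study).
ds = {"d1": 0.2, "d2": 0.8, "d3": 0.5, "d4": 0.9}
dc = {"d1": 0.4, "d2": 0.6, "d3": 0.7, "d4": 0.1}

combined = {doc: hybrid_score(ds[doc], dc[doc]) for doc in ds}
system_ranking = rerank(combined)

# Hypothetical ranking supplied by a participant.
user_ranking = ["d2", "d3", "d1", "d4"]

rho = spearman_rho(system_ranking, user_ranking)
print(system_ranking, round(rho, 3))
```

A value of `rho` close to 1 would indicate that the score-based ordering agrees closely with the participant's ordering; values near 0 or below would indicate little or inverse agreement.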

Practical implications

The findings of this study offer a new option for re-ranking search results according to their difficulty for experts and non-experts in different fields.

Originality/value

The findings and the two-mode re-ranking model proposed in this paper, with its primary focus on domain-specific readability in the Persian language, would help Web search engines and online databases refine search results further so that they retrieve useful texts for users with differing expertise.

Citation

Arastoopoor, S. (2018), "Domain-specific readability measures to improve information retrieval in the Persian language", The Electronic Library, Vol. 36 No. 3, pp. 430-444. https://doi.org/10.1108/EL-01-2017-0007

Publisher

Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited
