Abstract:
Assessing the readability of documents is a valuable task. In this paper, we apply linear regression models to the readability assessment of Chinese documents and propose LiFR (Linear model incorporating Feature Ranking), which uses feature ranking to select the most appropriate text features for building the linear model. We develop text features specialized for Chinese, comprising surface, part-of-speech, parse-tree, and entropy features. The experimental results demonstrate that both linear and log-linear regression models are reliable for readability assessment and achieve performance competitive with other machine learning methods, such as SVR (Support Vector Machine for Regression). The designed features also prove valuable, and feature ranking is essential for building useful linear functions.
Date of Conference: 12-14 September 2014
Date Added to IEEE Xplore: 27 October 2014
Electronic ISBN: 978-1-4799-4219-0