Abstract:
This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,...Show MoreMetadata
Abstract:
This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,837 essays written by Chinese language learners originating from a total of 46 different mother-tongue languages. We propose hierarchical tagging sets to manually annotate grammatical errors, resulting in 33,835 inappropriate usages. Our built corpus has been provided for the shared tasks on Chinese grammatical error diagnosis. These demonstrate the usability of our learner corpus annotation.
Date of Conference: 21-23 November 2016
Date Added to IEEE Xplore: 13 March 2017
ISBN Information: