Abstract:
The construction of a Gold Standard Corpus for Plagiarism Detection (GSCPD) is a challenging task for reproducible research in computer science, given that there is a trade-off between the time expended by the experts and the size, quality, and reliability of a GSCPD. In this scenario, this paper describes a framework to support the construction of a GSCPD in any language. Aiming for reproducibility and scalability, the framework involves a data acquisition process and a Crowd Science project that employs human processing power to identify plagiarism in pairs of textual data extracted via the data acquisition process. This paper also presents the application of this framework to the Portuguese language and the preliminary results of a feasibility study on the use of a tool that is part of the framework.
Published in: 2019 IEEE 23rd International Conference on Computer Supported Cooperative Work in Design (CSCWD)
Date of Conference: 06-08 May 2019
Date Added to IEEE Xplore: 08 August 2019