Abstract:
Speech corpora are the basis of linguistic research and natural language processing. To collect speech corpora more efficiently and to make them easier to use and share, a standardization scheme for speech corpus projects is needed. This paper aims to provide a standardization program that covers all aspects of data collection, annotation, and distribution. Specifications for constructing a speech corpus are also introduced. Finally, a telephone speech corpus, TSC973, is presented as an example to illustrate the standardization program.
Date of Conference: 01-03 November 2017
Date Added to IEEE Xplore: 14 June 2018
ISBN Information:
Electronic ISSN: 2472-7695