Abstract:
The quality of question classification is vital for a practical question-answering system. This paper proposes a transfer learning method that generates virtual data for zero-shot questions. The basic idea is to exploit the commonalities and differences between zero-annotation questions and questions with sufficient annotations in order to generate virtual training data for the zero-annotation questions, thereby alleviating data imbalance and improving the performance of the question classifier. Concretely, we first apply a template-based generator to produce basic virtual samples, then use them to train an encoder-decoder generator that produces a sufficiently large set of virtual data. Finally, the real and virtual samples are used together to train a supervised question classifier. Experiments show that the proposed method improves overall classification performance on both English and Chinese data sets. In particular, classification performance on zero-annotation questions increases significantly, from 7.46% to 59.34% for English and from 1.96% to 42.67% for Chinese, while the generated virtual data has only a minimal impact on performance on the test set of well-annotated questions.
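As a rough illustration of the first stage described above, the following Python sketch fills slot-based templates to produce labeled virtual samples and merges them with real samples. The templates, slot names, class labels, and the function name generate_virtual_samples are all hypothetical and are not taken from the paper; the encoder-decoder generator and the classifier itself are omitted.

```python
from itertools import product

# Hypothetical templates for a class that has no annotated questions.
TEMPLATES = {
    "weather": [
        "what is the weather in {city} {day}",
        "will it rain in {city} {day}",
    ],
}

# Hypothetical slot fillers; in practice these might be drawn from
# well-annotated classes that share vocabulary with the new class.
SLOTS = {
    "city": ["Beijing", "London"],
    "day": ["today", "tomorrow"],
}


def generate_virtual_samples(templates, slots):
    """Fill every template with every combination of slot values."""
    samples = []
    for label, patterns in templates.items():
        for pattern in patterns:
            # Only use the slots that actually appear in this pattern.
            names = [n for n in slots if "{" + n + "}" in pattern]
            for values in product(*(slots[n] for n in names)):
                text = pattern.format(**dict(zip(names, values)))
                samples.append((text, label))
    return samples


# Basic virtual samples for the zero-annotation class.
virtual = generate_virtual_samples(TEMPLATES, SLOTS)

# Real annotated samples (made up here) are combined with the virtual
# ones to form the training set for a supervised classifier.
real = [("how do I reset my password", "account")]
training_set = real + virtual
```

In the method as described, these basic samples would serve as training data for a second, learned generator rather than being used alone; the sketch stops before that step.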
Published in: 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS)
Date of Conference: 23-25 November 2018
Date Added to IEEE Xplore: 14 April 2019