Abstract
Unlike English text processing, Chinese text processing begins with word segmentation, and the segmentation results influence the outcomes of subsequent processing, especially for short texts. In this paper, we introduce a novel method for Chinese question answering based on short text information retrieval. The method is developed from the discernibility matrix based rules acquisition method. Using the acquired rules, the matching patterns of the training QA pairs are represented by the reduced attribute words, and the words in turn are represented by the QA patterns. The attribute words in the test QA pairs are then used to calculate the matching scores. The experimental results show that the proposed representation of QA patterns is flexible enough to handle the uncertainty caused by Chinese word segmentation, and that the proposed method performs well in terms of both MAP and MRR on the test data.
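To make the underlying notion concrete, the following minimal sketch builds a decision-relative discernibility matrix in the standard rough set sense: for every pair of objects with different decision values, it records the attributes on which the two objects differ. It is an illustration under stated assumptions, not the system described in this paper. The toy decision table, the attribute words (`capital`, `city`, `river`), the binary "word present" encoding, and the function names are all hypothetical.

```python
from itertools import combinations


def discernibility_matrix(rows, decisions):
    """Map each object pair (i, j) with different decisions to the set of
    condition attributes whose values differ between the two objects."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            matrix[(i, j)] = {a for a in rows[i] if rows[i][a] != rows[j][a]}
    return matrix


def core_attributes(matrix):
    """Attributes that occur as singleton entries cannot be dropped
    without losing discernibility; together they form the core."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Hypothetical toy table: each row is a QA pair, each attribute marks whether
# a segmented word occurs, and the decision marks whether the answer matches.
rows = [
    {"capital": 1, "city": 1, "river": 0},
    {"capital": 1, "city": 0, "river": 0},
    {"capital": 0, "city": 1, "river": 1},
]
decisions = [1, 0, 0]

matrix = discernibility_matrix(rows, decisions)
core = core_attributes(matrix)
print(matrix)
print(core)
```

In this toy table, only the word `city` separates the first two QA pairs, so it belongs to the core. A reduct would keep it and drop words that add no discernibility.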
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61273304, 61673301, 61573255) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (20130072130004).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Han, Z., Miao, D., Ren, F., Zhang, H. (2017). Discernibility Matrix and Rules Acquisition Based Chinese Question Answering System. In: Polkowski, L., et al. Rough Sets. IJCRS 2017. Lecture Notes in Computer Science, vol 10313. Springer, Cham. https://doi.org/10.1007/978-3-319-60837-2_20
DOI: https://doi.org/10.1007/978-3-319-60837-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60836-5
Online ISBN: 978-3-319-60837-2
eBook Packages: Computer Science, Computer Science (R0)