An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system

Abstract

Natural language processing (NLP) is one of the key techniques in intelligent question-answering (Q&A) systems. Although recurrent neural networks and long short-term memory (LSTM) networks perform well on well-known English Q&A datasets, they still struggle with characteristics of Chinese such as ambiguity, polysemy and the lack of inflectional morphology, which complicate NLP on large and diverse Chinese Q&A datasets. In this paper, we first analyze the limitations of applying LSTM and bidirectional LSTM (Bi-LSTM) models to noisy Chinese Q&A datasets. We then integrate attention mechanisms and multi-granularity word segmentation into Bi-LSTM and propose an attention mechanism and multi-granularity-based Bi-LSTM model (AM–Bi-LSTM), which combines an improved attention mechanism with a novel treatment of multi-granularity word segmentation to handle the complexities of Chinese Q&A data. Furthermore, the similarity between questions and answers is formulated as a quantitative computation, which helps to achieve better performance in Chinese Q&A systems. Finally, we evaluate the proposed model on a noisy Chinese Q&A dataset. The experimental results demonstrate that AM–Bi-LSTM achieves significant improvements on evaluation metrics such as accuracy and mean average precision, and that it outperforms baseline methods and other LSTM-based models.
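To make the abstract's high-level description concrete, the sketch below shows the general pattern it alludes to: encode a question and a candidate answer with a Bi-LSTM over fused word-level and character-level (multi-granularity) embeddings, pool the hidden states with an attention layer, and score the pair by similarity. This is a minimal illustration under stated assumptions, not the authors' AM–Bi-LSTM: the use of PyTorch, the module and function names (BiLSTMAttentionEncoder, qa_similarity), the dimensions, the concatenation-based fusion of granularities, the single-layer additive attention scorer and the cosine similarity are all illustrative choices; the paper's improved attention mechanism and its specific multi-granularity segmentation are not reproduced here.

```python
# Minimal sketch (assumptions: PyTorch; illustrative names, dimensions and fusion
# strategy). NOT the authors' exact AM-Bi-LSTM architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMAttentionEncoder(nn.Module):
    def __init__(self, word_vocab, char_vocab, emb_dim=128, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, emb_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, emb_dim, padding_idx=0)
        # Fuse the two granularities by concatenating aligned word- and
        # character-level embeddings before the Bi-LSTM (one possible fusion).
        self.bilstm = nn.LSTM(2 * emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)  # simple per-position attention scorer

    def forward(self, word_ids, char_ids, mask):
        # word_ids, char_ids: (batch, seq_len); mask: (batch, seq_len), 1 for real tokens
        x = torch.cat([self.word_emb(word_ids), self.char_emb(char_ids)], dim=-1)
        h, _ = self.bilstm(x)                           # (batch, seq_len, 2*hidden)
        scores = self.attn(h).squeeze(-1)               # (batch, seq_len)
        scores = scores.masked_fill(mask == 0, -1e9)    # ignore padding positions
        alpha = F.softmax(scores, dim=-1).unsqueeze(-1)
        return (alpha * h).sum(dim=1)                   # attention-pooled sentence vector


def qa_similarity(encoder, question, answer):
    """Cosine similarity between a question and a candidate answer."""
    return F.cosine_similarity(encoder(*question), encoder(*answer), dim=-1)


if __name__ == "__main__":
    enc = BiLSTMAttentionEncoder(word_vocab=5000, char_vocab=3000)
    # Toy batch: one question and one answer already mapped to id sequences.
    q = (torch.randint(1, 5000, (1, 6)), torch.randint(1, 3000, (1, 6)),
         torch.ones(1, 6, dtype=torch.long))
    a = (torch.randint(1, 5000, (1, 8)), torch.randint(1, 3000, (1, 8)),
         torch.ones(1, 8, dtype=torch.long))
    print(qa_similarity(enc, q, a))  # one similarity score in [-1, 1]
```

In practice, a model of this kind would be trained with a ranking or hinge loss so that correct answers score higher than sampled negatives; the abstract does not specify the training objective, so that step is omitted here.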

Acknowledgements

This work is partly funded by the National Natural Science Foundation of China (Grants 61672329 and 61773246) and the Major Program of the Shandong Province Natural Science Foundation (Grant ZR2018ZB0419).

Author information

Corresponding author

Correspondence to Xiao-mei Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Communicated by B. B. Gupta.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yu, X.-m., Feng, W.-z., Wang, H. et al. An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system. Soft Comput 24, 5831–5845 (2020). https://doi.org/10.1007/s00500-019-04367-8
