An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system

Abstract

Natural language processing (NLP) is one of the key techniques in intelligent question-answering (Q&A) systems. Although recurrent neural networks and long short-term memory (LSTM) networks perform well on well-known English Q&A datasets, they still struggle with characteristics of Chinese such as ambiguity, polysemy and the lack of inflectional morphology, which complicate NLP on large and diverse Chinese Q&A datasets. In this paper, we first analyze the limitations of applying LSTM and bidirectional LSTM (Bi-LSTM) models to noisy Chinese Q&A datasets. We then integrate attention mechanisms and multi-granularity word segmentation into Bi-LSTM and propose an attention mechanism and multi-granularity-based Bi-LSTM model (AM–Bi-LSTM), which combines an improved attention mechanism with a novel treatment of multi-granularity word segmentation to handle the complexities of Chinese Q&A data. Furthermore, the similarity between questions and answers is formulated as a quantitative computation, which helps to achieve better performance in Chinese Q&A systems. Finally, we evaluate the proposed model on a noisy Chinese Q&A dataset. The experimental results demonstrate that AM–Bi-LSTM achieves significant improvements on evaluation metrics such as accuracy and mean average precision, and that it outperforms baseline methods and other LSTM-based models.
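To make the abstract's high-level description concrete, the sketch below shows the general pattern it alludes to: encode a question and a candidate answer with a Bi-LSTM over fused word-level and character-level (multi-granularity) embeddings, pool the hidden states with an attention layer, and score the pair by similarity. This is a minimal illustration under stated assumptions, not the authors' AM–Bi-LSTM: the use of PyTorch, the module and function names (BiLSTMAttentionEncoder, qa_similarity), the dimensions, the concatenation-based fusion of granularities, the single-layer additive attention scorer and the cosine similarity are all illustrative choices; the paper's improved attention mechanism and its specific multi-granularity segmentation are not reproduced here.

```python
# Minimal sketch (assumptions: PyTorch; illustrative names, dimensions and fusion
# strategy). NOT the authors' exact AM-Bi-LSTM architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMAttentionEncoder(nn.Module):
    def __init__(self, word_vocab, char_vocab, emb_dim=128, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, emb_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, emb_dim, padding_idx=0)
        # Fuse the two granularities by concatenating aligned word- and
        # character-level embeddings before the Bi-LSTM (one possible fusion).
        self.bilstm = nn.LSTM(2 * emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)  # simple per-position attention scorer

    def forward(self, word_ids, char_ids, mask):
        # word_ids, char_ids: (batch, seq_len); mask: (batch, seq_len), 1 for real tokens
        x = torch.cat([self.word_emb(word_ids), self.char_emb(char_ids)], dim=-1)
        h, _ = self.bilstm(x)                           # (batch, seq_len, 2*hidden)
        scores = self.attn(h).squeeze(-1)               # (batch, seq_len)
        scores = scores.masked_fill(mask == 0, -1e9)    # ignore padding positions
        alpha = F.softmax(scores, dim=-1).unsqueeze(-1)
        return (alpha * h).sum(dim=1)                   # attention-pooled sentence vector


def qa_similarity(encoder, question, answer):
    """Cosine similarity between a question and a candidate answer."""
    return F.cosine_similarity(encoder(*question), encoder(*answer), dim=-1)


if __name__ == "__main__":
    enc = BiLSTMAttentionEncoder(word_vocab=5000, char_vocab=3000)
    # Toy batch: one question and one answer already mapped to id sequences.
    q = (torch.randint(1, 5000, (1, 6)), torch.randint(1, 3000, (1, 6)),
         torch.ones(1, 6, dtype=torch.long))
    a = (torch.randint(1, 5000, (1, 8)), torch.randint(1, 3000, (1, 8)),
         torch.ones(1, 8, dtype=torch.long))
    print(qa_similarity(enc, q, a))  # one similarity score in [-1, 1]
```

In practice, a model of this kind would be trained with a ranking or hinge loss so that correct answers score higher than sampled negatives; the abstract does not specify the training objective, so that step is omitted here.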

Acknowledgements

This work is partly funded by the National Natural Science Foundation of China (Grants 61672329 and 61773246) and the Major Program of the Shandong Province Natural Science Foundation (Grant ZR2018ZB0419).

Author information

Corresponding author

Correspondence to Xiao-mei Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Communicated by B. B. Gupta.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yu, X.-m., Feng, W.-z., Wang, H. et al. An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system. Soft Comput 24, 5831–5845 (2020). https://doi.org/10.1007/s00500-019-04367-8
