Abstract
Effectively measuring the similarity between two sentences is a challenging task in natural language processing. In this paper, we propose a sentence similarity method that combines word embeddings and syntactic structure. First, we generate the syntactic tree of each sentence, analyze the two sentences syntactically, and divide them into blocks according to their syntactic components. Second, we prune the syntactic trees, remove stop words, and restore words to their base forms (lemmatization). Third, we apply further operations such as passive flipping and negative flipping. Finally, the similarity of the sentence pair is calculated by weighting the block embeddings of the syntactic trees. Experiments show the effectiveness of this method.
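The final step, weighting block embeddings, can be illustrated with a minimal sketch. It is not the authors' implementation: the block roles (subject, predicate, object), the weight values, the toy word vectors, and all function names are assumptions made for illustration, and the abstract does not specify any of them. The sketch also assumes that parsing, pruning, stop-word removal, lemmatization, and the flipping operations have already been applied, so each sentence arrives as a mapping from block role to a list of lemmas.

```python
import math

# Toy 3-dimensional word vectors (assumption); real ones would come from a trained model.
TOY_VECTORS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.8, 0.6, 0.0],
    "chase": [0.0, 1.0, 0.0],
    "mouse": [0.0, 0.0, 1.0],
}

# Hypothetical block weights; the abstract does not state the actual values.
BLOCK_WEIGHTS = {"subject": 0.3, "predicate": 0.4, "object": 0.3}


def block_embedding(words, vectors):
    """Average the vectors of the known words in a block; None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_similarity(blocks_a, blocks_b, vectors=TOY_VECTORS, weights=BLOCK_WEIGHTS):
    """Weighted average of per-block cosine similarities.

    Blocks missing from either sentence are skipped, and the remaining
    weights are renormalized so the score stays comparable across pairs.
    """
    total = 0.0
    weight_sum = 0.0
    for role, weight in weights.items():
        emb_a = block_embedding(blocks_a.get(role, []), vectors)
        emb_b = block_embedding(blocks_b.get(role, []), vectors)
        if emb_a is None or emb_b is None:
            continue
        total += weight * cosine(emb_a, emb_b)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


if __name__ == "__main__":
    a = {"subject": ["cat"], "predicate": ["chase"], "object": ["mouse"]}
    b = {"subject": ["dog"], "predicate": ["chase"], "object": ["mouse"]}
    print(round(sentence_similarity(a, b), 4))
```

Comparing blocks role by role, rather than averaging every word in the sentence, is what lets syntactic structure influence the score: two sentences with the same words in swapped subject and object positions would no longer be treated as identical.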
Acknowledgments
This work was supported by the National Natural Science Foundation of China (61373148, 61502151), the Shandong Social Science Planning Project (17CHLJ18, 17CHLJ33, 17CHLJ30), the Natural Science Foundation of Shandong Province (ZR2014FL010), and the Shandong Province Department of Education (J15LN34).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, W., Liu, P., Yi, J., Yang, Y., Liu, W., Li, N. (2018). A Sentence Similarity Model Based on Word Embeddings and Dependency Syntax-Tree. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_12
DOI: https://doi.org/10.1007/978-3-030-04182-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04181-6
Online ISBN: 978-3-030-04182-3
eBook Packages: Computer Science, Computer Science (R0)