
A Sentence Similarity Model Based on Word Embeddings and Dependency Syntax-Tree

  • Conference paper

Neural Information Processing (ICONIP 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11303)

Abstract

Measuring the similarity between two sentences effectively is a challenging task in natural language processing. In this paper, we propose a sentence similarity method that combines word embeddings with syntactic structure. First, we analyze the two sentences by generating their dependency syntax trees and segment each sentence into blocks according to its syntactic components. Second, we prune the syntax trees, remove stop words, and lemmatize the remaining words. Next, we apply several normalizing transformations, such as passive-voice flipping and negation flipping. Finally, the similarity of a sentence pair is computed by weighting the block embeddings derived from the syntax trees. Experiments show the effectiveness of this method.
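As a rough illustration of the pipeline the abstract describes, the sketch below parses each sentence into a dependency tree, splits it into blocks (the root plus each dependent's subtree), prunes stop words and punctuation, lemmatizes, averages lemma embeddings per block, and scores the pair with a size-weighted best-match cosine over blocks. It uses spaCy and its en_core_web_md vectors as stand-ins for the paper's parser and embeddings; the block segmentation, pruning rules, and weighting scheme are illustrative assumptions, and the passive- and negation-flipping steps are omitted.

```python
# Minimal sketch of block-based sentence similarity, assuming spaCy's
# en_core_web_md model as a stand-in for the paper's parser and embeddings.
# Segmentation, pruning, and weighting choices here are assumptions.
import numpy as np
import spacy

nlp = spacy.load("en_core_web_md")  # medium English model ships with vectors

def sentence_blocks(doc):
    """One block per direct dependent of the root, plus the root itself."""
    root = next(t for t in doc if t.head is t)
    return [[root]] + [list(child.subtree) for child in root.children]

def block_vector(block):
    """Prune stop words/punctuation, lemmatize, and average lemma vectors."""
    lexemes = [nlp.vocab[t.lemma_] for t in block
               if not (t.is_stop or t.is_punct)]
    vecs = [lex.vector for lex in lexemes if lex.has_vector]
    return np.mean(vecs, axis=0) if vecs else np.zeros(nlp.vocab.vectors_length)

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return float(a @ b / (na * nb)) if na and nb else 0.0

def sentence_similarity(s1, s2):
    """Size-weighted best-match cosine similarity over syntactic blocks."""
    blocks1 = sentence_blocks(nlp(s1))
    vecs2 = [block_vector(b) for b in sentence_blocks(nlp(s2))]
    weights = [len(b) for b in blocks1]
    scores = [max(cosine(block_vector(b), v) for v in vecs2) for b in blocks1]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

print(sentence_similarity("The cat sat on the mat.",
                          "A kitten rested on the rug."))
```

Weighting blocks by token count is one plausible reading of "weighting the block embeddings"; the paper may instead weight by syntactic role.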


Notes

  1. http://mxnet.incubator.apache.org/.


Acknowledgments

This work was supported by the National Natural Science Foundation of China (61373148, 61502151), the Shandong Social Science Planning Project (17CHLJ18, 17CHLJ33, 17CHLJ30), the Natural Science Foundation of Shandong Province (ZR2014FL010), and the Shandong Province Department of Education (J15LN34).

Author information

Corresponding author

Correspondence to Peiyu Liu.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Liu, W., Liu, P., Yi, J., Yang, Y., Liu, W., Li, N. (2018). A Sentence Similarity Model Based on Word Embeddings and Dependency Syntax-Tree. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_12


  • DOI: https://doi.org/10.1007/978-3-030-04182-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04181-6

  • Online ISBN: 978-3-030-04182-3

  • eBook Packages: Computer Science, Computer Science (R0)
