Abstract
This work aims to capture the notion of similarity between phrases. The algorithm follows a dynamic programming approach that integrates the edit distance between parse trees with single-term similarity. It stresses the underlying grammatical structure, which guides the computation of semantic similarity between words. The proposal yields a more accurate notion of semantic proximity at sentence level without increasing the complexity of the pattern-matching algorithm on which it is based.
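To make the idea concrete, the sketch below combines a tree edit distance with a word-level similarity score. It is not the authors' algorithm: it uses a simplified top-down variant in which the roots of the two trees are always matched and their child sequences are aligned by dynamic programming, with whole subtrees inserted or deleted at a cost equal to their size. The `Node` class, the `TOY_SIMILARITY` table, and all function names are hypothetical; the table stands in for whatever lexical similarity measure is actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A parse-tree node: a grammatical category or, at the leaves, a word."""
    label: str
    children: tuple = ()


# Hypothetical similarity scores in [0, 1]; a real system would derive
# these from a lexical resource rather than a hand-written table.
TOY_SIMILARITY = {
    frozenset({"boy", "child"}): 0.8,
    frozenset({"runs", "walks"}): 0.6,
}


def word_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical labels, a table score if known, else 0.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset({a, b}), 0.0)


def relabel_cost(a: Node, b: Node) -> float:
    """Relabeling is cheap when the two labels are semantically close."""
    return 1.0 - word_similarity(a.label, b.label)


def size(t: Node) -> int:
    """Number of nodes in the subtree; used as its insert/delete cost."""
    return 1 + sum(size(c) for c in t.children)


def tree_distance(a: Node, b: Node) -> float:
    """Top-down tree edit distance with similarity-weighted relabeling."""
    xs, ys = a.children, b.children
    m, n = len(xs), len(ys)
    # d[i][j]: cost of aligning the first i children of a with the
    # first j children of b.
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(xs[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(ys[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(xs[i - 1]),   # delete a subtree
                d[i][j - 1] + size(ys[j - 1]),   # insert a subtree
                d[i - 1][j - 1] + tree_distance(xs[i - 1], ys[j - 1]),
            )
    return relabel_cost(a, b) + d[m][n]


def sentence(subject: str, verb: str) -> Node:
    """Build a toy parse tree S(NP(subject), VP(verb))."""
    return Node("S", (Node("NP", (Node(subject),)), Node("VP", (Node(verb),))))


if __name__ == "__main__":
    print(tree_distance(sentence("boy", "runs"), sentence("child", "walks")))
    print(tree_distance(sentence("boy", "runs"), sentence("car", "walks")))
```

Because the relabeling cost is weighted by word similarity, two sentences with the same structure and near-synonymous words come out closer than two sentences with the same structure and unrelated words, which is the behaviour the abstract describes at a high level.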
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vilares, M., Ribadas, F.J., Vilares, J. (2004). Phrase Similarity through the Edit Distance. In: Galindo, F., Takizawa, M., Traunmüller, R. (eds) Database and Expert Systems Applications. DEXA 2004. Lecture Notes in Computer Science, vol 3180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30075-5_30
DOI: https://doi.org/10.1007/978-3-540-30075-5_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22936-0
Online ISBN: 978-3-540-30075-5
eBook Packages: Springer Book Archive