Abstract
The exploitation of syntactic structures and semantic background knowledge has long been an appealing subject in text retrieval and information management. The usefulness of this kind of information has been shown most prominently in highly specialized tasks, such as classification in Question Answering (QA) scenarios. So far, however, syntactic and semantic information have each been used only in isolation. In this paper, we propose a principled approach for exploiting both types of information jointly. We introduce a new type of kernel, the Semantic Syntactic Tree Kernel (SSTK), which incorporates linguistic structures, e.g., syntactic dependencies, and semantic background knowledge, e.g., term similarity based on WordNet, to automatically learn question categories in QA. We demonstrate the effectiveness of this approach in a series of experiments on a well-known Question Classification dataset.
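To make the idea of combining tree structure with term similarity more concrete, the following is a minimal sketch, not the SSTK definition from the paper. It assumes a subset-tree kernel in the style of Collins and Duffy in which two pre-terminal nodes with the same tag match softly, weighted by a term similarity score, instead of requiring identical words. The tree encoding (nested tuples), the function names, the decay parameter `lam`, and the `TOY_SIMILARITY` table are all hypothetical; the table stands in for a WordNet-based measure and its values are invented for illustration.

```python
from itertools import product

# Hypothetical stand-in for a WordNet-based term similarity.
# The scores are invented for illustration only.
TOY_SIMILARITY = {
    frozenset(["city", "town"]): 0.8,
    frozenset(["river", "lake"]): 0.6,
}


def term_similarity(w1, w2):
    """Return 1.0 for identical words, a table score otherwise (default 0.0)."""
    if w1 == w2:
        return 1.0
    return TOY_SIMILARITY.get(frozenset([w1, w2]), 0.0)


def exact_match(w1, w2):
    """Strict matching: the purely syntactic special case."""
    return 1.0 if w1 == w2 else 0.0


def is_preterminal(node):
    """A pre-terminal is encoded as (tag, word), e.g. ("NN", "city")."""
    return len(node) == 2 and isinstance(node[1], str)


def nodes(tree):
    """Yield every non-leaf node of a tree encoded as (label, child, ...)."""
    yield tree
    for child in tree[1:]:
        if not isinstance(child, str):
            yield from nodes(child)


def production(node):
    """Label of an internal node followed by the labels of its children.

    Assumes the children of a non-pre-terminal node are themselves tuples.
    """
    return (node[0],) + tuple(child[0] for child in node[1:])


def delta(n1, n2, lam, sim):
    """Weighted count of matching fragments rooted at n1 and n2."""
    if is_preterminal(n1) and is_preterminal(n2):
        if n1[0] != n2[0]:
            return 0.0
        # Soft match: same tag, words compared by the similarity function.
        return lam * sim(n1[1], n2[1])
    if is_preterminal(n1) or is_preterminal(n2):
        return 0.0
    if production(n1) != production(n2):
        return 0.0
    result = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        result *= 1.0 + delta(c1, c2, lam, sim)
    return result


def tree_kernel(t1, t2, lam=0.4, sim=term_similarity):
    """Sum delta over all pairs of non-leaf nodes from the two trees."""
    return sum(delta(a, b, lam, sim) for a, b in product(nodes(t1), nodes(t2)))


if __name__ == "__main__":
    q1 = ("NP", ("DT", "the"), ("NN", "city"))
    q2 = ("NP", ("DT", "the"), ("NN", "town"))
    print("soft match: ", tree_kernel(q1, q2))
    print("exact match:", tree_kernel(q1, q2, sim=exact_match))
```

Passing `exact_match` as `sim` gives a purely syntactic kernel under the same assumptions, so the two calls in the usage example show how much of the score comes from the similarity table alone. In practice the toy table would be replaced by a real lexical similarity measure, and the resulting kernel values would be supplied to a kernel-based learner such as an SVM.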
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Bloehdorn, S., Moschitti, A. (2007). Combined Syntactic and Semantic Kernels for Text Classification. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_29
DOI: https://doi.org/10.1007/978-3-540-71496-5_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71494-1
Online ISBN: 978-3-540-71496-5
eBook Packages: Computer Science, Computer Science (R0)