On the Impact of Linguistic Information in Kernel-Based Deep Architectures

Croce, Danilo; Filice, Simone; Basili, Roberto

doi:10.1007/978-3-319-70169-1_27


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10640)

Included in the following conference series:

Conference of the Italian Association for Artificial Intelligence


Abstract

Kernel methods enable the direct use of structured representations of textual data in language learning and inference tasks. Deep neural networks, on the other hand, are effective in learning non-linear decision functions. Recent work has demonstrated that expressive kernels and deep neural networks can be combined in a Kernel-based Deep Architecture (KDA), a common framework that allows structured information to be modeled explicitly within a neural network. This combination achieves state-of-the-art accuracy in different semantic inference tasks. This paper investigates the impact of linguistic information on the performance reachable by a KDA by studying the benefits that different kernels bring to inference quality. We believe that the expressiveness of data representations will play a key role in the widespread adoption of neural networks in AI problem solving. We experimentally evaluated the adoption of different kernels (each characterized by growing expressive power) in a Question Classification task. Results suggest the importance of rich kernel functions in optimizing the accuracy of a KDA.
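As a rough illustration of how a kernel can feed a neural network, the sketch below builds a Nyström embedding of the kind the notes refer to as the Nyström layer and the $H_{Ny}$ matrix. It is a sketch under stated assumptions, not the authors' implementation: an RBF kernel over plain vectors stands in for the linguistic kernels studied in the paper, and the names `rbf_kernel`, `nystrom_projection`, and `embed`, the landmark count, and the `gamma` value are all hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, gamma=0.5):
    """Stand-in kernel on vectors (assumption); the paper uses kernels over linguistic structures."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def nystrom_projection(landmarks, kernel, eps=1e-10):
    """Build a projection matrix U S^{-1/2} from the Gram matrix of the landmarks."""
    gram = kernel(landmarks, landmarks)
    eigvals, eigvecs = np.linalg.eigh(gram)
    keep = eigvals > eps  # drop numerically null directions
    return eigvecs[:, keep] / np.sqrt(eigvals[keep])


def embed(x, landmarks, h_ny, kernel):
    """Map examples to dense vectors: kernel values against the landmarks, then project."""
    return kernel(x, landmarks) @ h_ny


# Toy data (assumption): 50 random 4-dimensional examples, the first 10 used as landmarks.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))
landmarks = data[:10]

h_ny = nystrom_projection(landmarks, rbf_kernel)
z = embed(data, landmarks, h_ny, rbf_kernel)

# Dot products of the embeddings approximate the kernel; on the landmarks they reproduce it.
approx = z @ z.T
exact = rbf_kernel(data, data)
```

The rows of `z` could then be passed to ordinary trainable feed-forward layers, with the projection held fixed, as note 2 describes.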


Notes

1. In our preliminary experiments, adjustments to the $H_{Ny}$ matrix were tested, but they were not beneficial.
2. The input layer and the Nyström layer are not modified during the learning process, and they are not regularized.
3. http://www.kelp-ml.org/.
4. https://www.tensorflow.org/.


Author information

Authors and Affiliations

Department of Enterprise Engineering, University of Roma, Tor Vergata, Rome, Italy
Danilo Croce, Simone Filice & Roberto Basili

Corresponding author

Correspondence to Danilo Croce.

Editor information

Editors and Affiliations

University of Bari, Bari, Italy
Floriana Esposito
University of Rome Tor Vergata, Rome, Italy
Roberto Basili
University of Bari, Bari, Italy
Stefano Ferilli
University of Bari, Bari, Italy
Francesca A. Lisi


About this paper

Cite this paper

Croce, D., Filice, S., Basili, R. (2017). On the Impact of Linguistic Information in Kernel-Based Deep Architectures. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science, vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_27


DOI: https://doi.org/10.1007/978-3-319-70169-1_27
Published: 07 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70168-4
Online ISBN: 978-3-319-70169-1
eBook Packages: Computer Science, Computer Science (R0)
