On the Impact of Linguistic Information in Kernel-Based Deep Architectures

  • Conference paper
  • AI*IA 2017 Advances in Artificial Intelligence (AI*IA 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10640)

Abstract

Kernel methods enable the direct use of structured representations of textual data in language learning and inference tasks. Deep neural networks, on the other hand, are effective at learning non-linear decision functions. Recent work has shown that expressive kernels and deep neural networks can be combined in a Kernel-based Deep Architecture (KDA), a common framework that allows structured information to be explicitly modeled within a neural network; this combination achieves state-of-the-art accuracy on several semantic inference tasks. This paper investigates the impact of linguistic information on the performance achievable by a KDA by studying the benefits that different kernels bring to inference quality. We believe that the expressiveness of data representations will play a key role in the widespread adoption of neural networks for AI problem solving. We experimentally evaluated different kernels, each of growing expressive power, on a Question Classification task. Results suggest the importance of rich kernel functions in optimizing the accuracy of a KDA.
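As a concrete reading of how a KDA consumes kernel information, the sketch below shows a plain Nyström projection: kernel evaluations against a set of landmark examples are mapped to dense vectors that can then be fed to a standard feed-forward network (cf. the \(H_{Ny}\) matrix and the Nyström layer mentioned in the Notes). This is a minimal illustrative sketch, not the authors' implementation; the function name and the eps threshold are assumptions.

    import numpy as np

    # Illustrative sketch (not the authors' code): the Nystrom method
    # approximates the kernel space spanned by m landmark examples.
    #   K_mm : (m, m) kernel matrix among the landmarks
    #   K_nm : (n, m) kernel evaluations between n examples and the landmarks
    def nystrom_embed(K_mm, K_nm, eps=1e-12):
        # Eigendecompose the (symmetric, PSD up to noise) landmark kernel matrix
        eigvals, eigvecs = np.linalg.eigh(K_mm)
        keep = eigvals > eps                    # drop numerically null directions
        # Projection H = U * diag(1/sqrt(lambda)); embeddings = K_nm @ H
        H = eigvecs[:, keep] / np.sqrt(eigvals[keep])
        return K_nm @ H                         # (n, rank) dense inputs for the network

Which kernel produces K_mm and K_nm is exactly the degree of freedom whose impact on accuracy is studied in the paper.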

Notes

  1. In our preliminary experiments, adjustments to the \(H_{Ny}\) matrix were tested, but they were not beneficial.

  2. The input layer and the Nyström layer are not modified during the learning process, and they are not regularized (an illustrative sketch follows these notes).

  3. http://www.kelp-ml.org/.

  4. https://www.tensorflow.org/.
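Read together, notes 2 and 4 suggest one straightforward realization in TensorFlow/Keras: store the precomputed Nyström projection in a frozen, non-trainable layer so that only the subsequent layers are trained and regularized. The sketch below is an assumption-laden illustration; layer sizes, names, and activations are not taken from the paper.

    import tensorflow as tf

    # Illustrative sketch (assumed architecture, not the authors' code):
    # the precomputed Nystrom projection sits in a frozen Dense layer, so only
    # the later hidden/output layers are updated and regularized during training.
    def build_kda_classifier(projection_matrix, num_classes, hidden=256):
        m, rank = projection_matrix.shape       # m landmarks -> rank-dim embedding
        nystrom = tf.keras.layers.Dense(rank, use_bias=False,
                                        trainable=False, name="nystrom")
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(m,)),         # input: kernel evaluations k(x, landmarks)
            nystrom,
            tf.keras.layers.Dense(hidden, activation="tanh"),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ])
        nystrom.set_weights([projection_matrix])  # fix the Nystrom layer weights
        model.compile(optimizer="adam", loss="categorical_crossentropy",
                      metrics=["accuracy"])
        return model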

Author information

Correspondence to Danilo Croce.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Croce, D., Filice, S., Basili, R. (2017). On the Impact of Linguistic Information in Kernel-Based Deep Architectures. In: Esposito, F., Basili, R., Ferilli, S., Lisi, F. (eds) AI*IA 2017 Advances in Artificial Intelligence. AI*IA 2017. Lecture Notes in Computer Science, vol 10640. Springer, Cham. https://doi.org/10.1007/978-3-319-70169-1_27

  • DOI: https://doi.org/10.1007/978-3-319-70169-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70168-4

  • Online ISBN: 978-3-319-70169-1

  • eBook Packages: Computer Science, Computer Science (R0)
