Locality kernels for sequential data and their applications to parse ranking

Applied Intelligence

An Erratum to this article was published on 06 November 2009

Abstract

We propose a framework for constructing kernels that take advantage of local correlations in sequential data. Kernels designed with this framework measure parse similarities locally, within a small window constructed around each matching feature. We further propose incorporating positional information inside the window and consider different ways of doing so. We applied the kernels together with the regularized least-squares (RLS) algorithm to the task of dependency parse ranking, using a dataset of parses obtained from a manually annotated biomedical corpus of 1100 sentences. Our experiments show that RLS with kernels incorporating positional information performs better than RLS with the baseline kernel functions, and that this performance gain is statistically significant.
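
The abstract does not give the kernel definitions, so the following Python sketch is only one possible reading of the idea, not the formulation used in the article. It assumes that a parse is represented as a plain token sequence, that a "matching feature" is a pair of equal tokens, and that positional information enters as a Gaussian weight on the offset inside the window. The function names, the parameters radius, sigma and lam, and the toy data are all hypothetical.

```python
import numpy as np


def locality_kernel(s, t, radius=1, sigma=1.0):
    """Toy locality kernel between two token sequences (illustrative only).

    For every pair of positions (i, j) with s[i] == t[j], the windows of
    the given radius around i and j are compared. Tokens that agree at
    the same offset add a weight that decays with the offset; the
    Gaussian decay is an assumption made for this sketch.
    """
    total = 0.0
    for i in range(len(s)):
        for j in range(len(t)):
            if s[i] != t[j]:
                continue  # a window is built only around matching features
            for off in range(-radius, radius + 1):
                p, q = i + off, j + off
                if 0 <= p < len(s) and 0 <= q < len(t) and s[p] == t[q]:
                    total += float(np.exp(-(off ** 2) / (2.0 * sigma ** 2)))
    return total


def gram_matrix(seqs, **kw):
    """Symmetric kernel matrix over a list of sequences."""
    n = len(seqs)
    K = np.empty((n, n))
    for a in range(n):
        for b in range(a, n):
            K[a, b] = K[b, a] = locality_kernel(seqs[a], seqs[b], **kw)
    return K


def rls_fit(K, y, lam=1.0):
    """Kernel RLS coefficients: solve (K + lam * I) a = y."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)


def rls_predict(train, coeffs, x, **kw):
    """Predicted score f(x) = sum_i a_i * k(x, x_i)."""
    return sum(c * locality_kernel(x, z, **kw) for c, z in zip(coeffs, train))


# Made-up toy data: three "parses" as label sequences with invented scores.
parses = [["S", "O", "D"], ["S", "D", "O"], ["O", "S", "D"]]
scores = np.array([1.0, 0.5, 0.2])

K = gram_matrix(parses, radius=1)
coeffs = rls_fit(K, scores, lam=0.1)

# Rank the parses by predicted score, highest first.
ranking = sorted(
    range(len(parses)),
    key=lambda i: -rls_predict(parses, coeffs, parses[i], radius=1),
)
```

With positive offset weights, this toy kernel is an inner product over counts of (token, token, offset) triples, so the resulting kernel matrix is positive semidefinite and the RLS linear system is well posed for any lam > 0.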


Author information

Corresponding author

Correspondence to Evgeni Tsivtsivadze.

Additional information

An erratum to this article can be found at http://dx.doi.org/10.1007/s10489-009-0198-3

About this article

Cite this article

Tsivtsivadze, E., Pahikkala, T., Boberg, J. et al. Locality kernels for sequential data and their applications to parse ranking. Appl Intell 31, 81–88 (2009). https://doi.org/10.1007/s10489-008-0114-2

  • DOI: https://doi.org/10.1007/s10489-008-0114-2
