Treebank Annotation for Formal Semantics Research

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI-isAI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7856)

Abstract

This paper motivates and describes treebank annotation for Japanese and English following a scheme adapted from the Annotation manual for the Penn Historical Corpora and the PCEEC (Santorini 2010). The purpose of this annotation is to create a syntactic base from which meaning representations can be built automatically on a corpus-linguistics scale (thousands of examples). The paper highlights three advantages of the adopted annotation scheme. Most notably, marking clause-level functional information is essential for deterministically building meaning representations beyond the level of predicate-argument structure. In addition, an internal syntax in which phrasal categories are fundamentally similar is of great assistance. Finally, the paper demonstrates that scope information is simple to add when bracketed syntactic structure is inherently flat.
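To make the point about clause-level functional marking concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a flat, Penn-Historical-style bracketing in which clause daughters carry functional tags such as `-SBJ` and `-OB1`. The example sentence and the helper names `parse_bracketed`, `words`, and `clause_roles` are hypothetical. The sketch shows only that, once functions are marked on the immediate daughters of a flat clause, grammatical roles can be read off deterministically without inspecting tree configuration.

```python
import re

def parse_bracketed(text):
    """Parse one bracketed tree into (label, children) tuples; leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return node()

def words(tree):
    """Collect the terminal words under a node, left to right."""
    _, children = tree
    out = []
    for child in children:
        out.extend(words(child) if isinstance(child, tuple) else [child])
    return out

def clause_roles(clause):
    """Read roles off the immediate daughters of a flat clause (illustrative only)."""
    roles = {}
    for daughter in clause[1]:
        if not isinstance(daughter, tuple):
            continue
        category, _, function = daughter[0].partition("-")
        if function:                       # e.g. NP-SBJ gives role "SBJ"
            roles[function] = " ".join(words(daughter))
        elif category.startswith("VB"):    # assumption: VB* tags mark the main verb
            roles["PRED"] = " ".join(words(daughter))
    return roles

# Hypothetical example in the style of the scheme, not taken from the paper.
example = "(IP-MAT (NP-SBJ (D the) (N dog)) (VBD chased) (NP-OB1 (D a) (N cat)))"
roles = clause_roles(parse_bracketed(example))
print(roles)
```

Because the subject, verb, and object are sisters under one clause node, a single pass over the daughters suffices. A more deeply nested bracketing without functional tags would require configuration-based rules to recover the same roles.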

This research has been supported by the JST PRESTO program (Synthesis of Knowledge for Information Oriented Society). We wish to thank attendees of LENLS9 for comments received that prompted improvements of the paper.

References

  • Bies, A., Ferguson, M., Katz, K., MacIntyre, R.: Bracketing guidelines for Treebank II style Penn Treebank project. Tech. Rep. MS-CIS-95-06, LINC LAB 281, University of Pennsylvania Computer and Information Science Department (1995)

  • Bies, A., Maamouri, M.: Penn Arabic Treebank Guidelines. Tech. rep., Linguistic Data Consortium, University of Pennsylvania. DRAFT (2003)

  • Blackburn, P., Bos, J.: Computational semantics. Theoria 13, 27–45 (2003)

  • Bos, J., Clark, S., Steedman, M., Curran, J.R., Hockenmaier, J.: Wide-coverage semantic representations from a CCG parser. In: Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland (2004)

  • Butler, A.: The Semantics of Grammatical Dependencies. Current Research in the Semantics/Pragmatics Interface, vol. 23. Emerald, Bingley (2010)

  • Butler, A., Yoshimoto, K.: Banking meaning representations from treebanks. Linguistic Issues in Language Technology - LiLT 7(1), 1–22 (2012)

  • Butler, A., Zhou, Z., Yoshimoto, K.: Problems for successful bunsetsu based parsing and some solutions. In: Proceedings of the Eighteenth Annual Meeting of the Association of Natural Language Processing, pp. 951–954. The Association of Natural Language Processing (2012)

  • Cahill, A., McCarthy, M., van Genabith, J., Way, A.: Automatic annotation of the Penn Treebank with LFG F-structure information. In: LREC 2002 Workshop on Linguistic Knowledge Acquisition and Representation—Bootstrapping Annotated Language Data, Las Palmas, Spain, pp. 8–15 (2002)

  • Davidson, D.: The logical form of action sentences. In: Rescher, N. (ed.) The Logic of Decision and Action. University of Pittsburgh Press, Pittsburgh (1967); Reprinted in: Davidson, D.: Essays on Actions and Events, pp. 105–122. Clarendon Press, Oxford (1980)

  • Dekker, P.: Dynamic Semantics. Studies in Linguistics and Philosophy, vol. 91. Springer, Dordrecht (2012)

  • Han, C.-H., Han, N.-R., Ko, E.-S.: Bracketing guidelines for Penn Korean TreeBank. Tech. Rep. IRCS Report 01-10, Institute for Research in Cognitive Science, University of Pennsylvania (2001)

  • Hashimoto, S.: Essentials of Japanese Grammar (Kokugoho Yousetsu). Iwanami (1934) (in Japanese)

  • Kawahara, D., Sasano, R., Kurohashi, S., Hashida, K.: Specification for annotating case, ellipsis and coreference. Kyoto Text Corpus Version 4.0 (2005) (in Japanese)

  • King, T.H., Crouch, R., Riezler, S., Dalrymple, M., Kaplan, R.M.: The PARC 700 Dependency Bank. In: Proceedings of the 4th International Workshop on Linguistically Interpreted Corpora, held at the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003), Budapest (2003)

  • Kurohashi, S., Nagao, M.: Building a Japanese parsed corpus – while improving the parsing system. In: Abeillé, A. (ed.) Treebanks: Building and Using Parsed Corpora, ch. 14, pp. 249–260. Kluwer Academic Publishers, Dordrecht (2003)

  • Miyao, Y., Ninomiya, T., Tsujii, J.: Corpus-oriented grammar development for acquiring a Head-driven Phrase Structure Grammar from the Penn Treebank. In: Su, K.-Y., Tsujii, J., Lee, J.-H., Kwong, O.Y. (eds.) IJCNLP 2004. LNCS (LNAI), vol. 3248, pp. 684–693. Springer, Heidelberg (2005)

  • Palmer, M., Gildea, D., Kingsbury, P.: The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics 31(1), 71–106 (2005)

  • Santorini, B.: Annotation manual for the Penn Historical Corpora and the PCEEC (Release 2). Tech. rep., Department of Computer and Information Science, University of Pennsylvania, Philadelphia (2010), http://www.ling.upenn.edu/histcorpora/annotation

  • Vermeulen, C.F.M.: Variables as stacks: A case study in dynamic model theory. Journal of Logic, Language and Information 9, 143–167 (2000)

  • Xia, F., Palmer, M., Joshi, A.: A uniform method of grammar extraction and its applications. In: Proceedings of the 2000 Conference on Empirical Methods in Natural Language Processing, Hong Kong, pp. 53–62 (2000)

  • Xue, N., Xia, F.: The bracketing guidelines for the Penn Chinese Treebank (3.0). Tech. Rep. 00-08, Institute for Research in Cognitive Science, University of Pennsylvania (2000)

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Butler, A., Otomo, R., Zhou, Z., Yoshimoto, K. (2013). Treebank Annotation for Formal Semantics Research. In: Motomura, Y., Butler, A., Bekki, D. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2012. Lecture Notes in Computer Science, vol 7856. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39931-2_3

  • DOI: https://doi.org/10.1007/978-3-642-39931-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39930-5

  • Online ISBN: 978-3-642-39931-2

  • eBook Packages: Computer Science, Computer Science (R0)
