Syntactic Analysis of Long Sentences Based on S-Clauses

  • Conference paper
Natural Language Processing – IJCNLP 2004 (IJCNLP 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)


Abstract

In dependency parsing of long sentences that have fewer subjects than predicates, it is difficult to recognize which predicate governs which subject. To handle this syntactic ambiguity between subjects and predicates, we define an “S(ubject)-clause” as a group of words containing several predicates and their common subject, and propose an automatic S-clause segmentation method that uses semantic features as well as morpheme features. We also propose a new dependency tree that reflects S-clauses, in which trace information indicates the omitted subject of each predicate. The S-clause information proved very effective in analyzing long sentences: parsing performance improved by 4.5%, and the precision in determining the governors of subjects in dependency parsing improved by 32%.
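The idea of an S-clause can be illustrated with a small sketch. It is not the authors' implementation: the `Token` type, the coarse role labels, the function names, and the English example sentence are all hypothetical, and the S-clause boundaries are supplied by hand rather than predicted from semantic and morpheme features. The sketch only shows how, once a sentence has been segmented, every predicate inside an S-clause can be linked to that clause's common subject.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    index: int
    form: str
    role: str  # hypothetical coarse label: "SUBJ", "PRED", or "OTHER"


def split_s_clauses(tokens: List[Token], clause_ends: List[int]) -> List[List[Token]]:
    """Group tokens into S-clauses; clause_ends lists the index of each clause's last token."""
    ends = set(clause_ends)
    clauses: List[List[Token]] = []
    current: List[Token] = []
    for tok in tokens:
        current.append(tok)
        if tok.index in ends:
            clauses.append(current)
            current = []
    if current:  # trailing tokens with no explicit boundary form a final clause
        clauses.append(current)
    return clauses


def subject_of_each_predicate(clauses: List[List[Token]]) -> Dict[int, Optional[int]]:
    """Map each predicate index to the index of the subject shared within its S-clause."""
    links: Dict[int, Optional[int]] = {}
    for clause in clauses:
        subjects = [t.index for t in clause if t.role == "SUBJ"]
        subject = subjects[0] if subjects else None  # None if the clause has no overt subject
        for t in clause:
            if t.role == "PRED":
                links[t.index] = subject
    return links


# Toy sentence: "Mary came sat John left" (three predicates, two subjects).
tokens = [
    Token(0, "Mary", "SUBJ"),
    Token(1, "came", "PRED"),
    Token(2, "sat", "PRED"),
    Token(3, "John", "SUBJ"),
    Token(4, "left", "PRED"),
]
clauses = split_s_clauses(tokens, clause_ends=[2, 4])  # hand-picked boundaries (assumption)
links = subject_of_each_predicate(clauses)
print(links)
```

In this toy case the two predicates in the first S-clause share the subject at index 0, which is the kind of information a trace in the dependency tree would record for a predicate whose subject is omitted.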




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, MY., Lee, JH. (2005). Syntactic Analysis of Long Sentences Based on S-Clauses. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_55

  • DOI: https://doi.org/10.1007/978-3-540-30211-7_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24475-2

  • Online ISBN: 978-3-540-30211-7

  • eBook Packages: Computer Science, Computer Science (R0)
