Syntactic Analysis of Long Sentences Based on S-Clauses

Kim, Mi-Young; Lee, Jong-Hyeok

doi:10.1007/978-3-540-30211-7_55

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Included in the following conference series:

International Conference on Natural Language Processing

Abstract

In dependency parsing of long sentences with fewer subjects than predicates, it is difficult to recognize which predicate governs which subject. To handle such syntactic ambiguity between subjects and predicates, an “S(ubject)-clause” is defined as a group of words containing several predicates and their common subject, and then an automatic S-clause segmentation method is proposed using semantic features as well as morpheme features. We also propose a new dependency tree to reflect S-clauses. Trace information is used to indicate the omitted subject of each predicate. The S-clause information turned out to be very effective in analyzing long sentences, with an improved parsing performance of 4.5%. The precision in determining the governors of subjects in dependency parsing was improved by 32%.
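
The S-clause idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes the sentence has already been segmented into S-clauses, and the `Word` class, the role labels, and the `link_subjects` function are hypothetical names introduced only for illustration. Treating the first predicate in a clause as the one with the overt subject, and the later ones as carrying a trace, is likewise an assumption made here. English words stand in for the example sentence.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    form: str
    role: str  # hypothetical labels: "SUBJ", "PRED", or "OTHER"


def link_subjects(
    s_clauses: List[List[Word]],
) -> List[Tuple[str, Optional[str], bool]]:
    """Link every predicate to the common subject of its S-clause.

    Returns (predicate, subject, is_trace) triples. is_trace is True when
    the subject is shared with an earlier predicate in the same S-clause,
    i.e. the subject is omitted for this predicate (assumption: the first
    predicate in the clause is the one with the overt subject).
    """
    links = []
    for clause in s_clauses:
        # An S-clause holds at most one subject, shared by its predicates.
        subject = next((w for w in clause if w.role == "SUBJ"), None)
        seen_overt = False
        for w in clause:
            if w.role != "PRED":
                continue
            if subject is None:
                # No subject inside this clause: leave the link unresolved.
                links.append((w.form, None, False))
                continue
            links.append((w.form, subject.form, seen_overt))
            seen_overt = True
    return links


# Two S-clauses: "I ate lunch (and) went home" / "(my) friend stayed".
sentence = [
    [Word("I", "SUBJ"), Word("ate", "PRED"), Word("lunch", "OTHER"),
     Word("went", "PRED"), Word("home", "OTHER")],
    [Word("friend", "SUBJ"), Word("stayed", "PRED")],
]
links = link_subjects(sentence)
for predicate, subject, is_trace in links:
    print(predicate, subject, "trace" if is_trace else "overt")
```

Because each S-clause contains a single subject, the question of which predicate governs which subject reduces to finding the clause boundaries, which is the segmentation step the abstract describes.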

Author information

Authors and Affiliations

Div. of Electrical and Computer Engineering, Pohang University of Science and Technology (POSTECH) and Advanced Information Technology Research Center (AITrc), Republic of Korea
Mi-Young Kim & Jong-Hyeok Lee

Editor information

Editors and Affiliations

Behavior Design Corporation, IV Science-Based Industrial Park Hsinchu, 2F, No.5, Industry E. Rd, Taiwan
Keh-Yih Su
University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, JST CREST, Honcho 4-1-8, Kawaguchi-shi, 332-0012, Saitama
Jun’ichi Tsujii
Pohang University of Science and Technology (POSTECH), AITrc, Republic of Korea
Jong-Hyeok Lee
Language Information Sciences Research Centre, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Oi Yee Kwong

About this paper

Cite this paper

Kim, MY., Lee, JH. (2005). Syntactic Analysis of Long Sentences Based on S-Clauses. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_55

DOI: https://doi.org/10.1007/978-3-540-30211-7_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24475-2
Online ISBN: 978-3-540-30211-7
eBook Packages: Computer Science, Computer Science (R0)
