Learning to Rank for Coordination Detection

Wang, Xun; Li, Rumeng; Shindo, Hiroyuki; Sudoh, Katsuhito; Nagata, Masaaki

doi:10.1007/978-3-319-77113-7_12

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10761)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

Coordinations are phrases of the form “A and/but/or/... B”. Detecting them remains a major problem because their components can be complex. Existing work normally classifies training data into two categories, correct and incorrect. This often causes data imbalance, which damages the performance of the models used. To remedy this, we formulate coordination detection as a ranking problem, which fully exploits the differences between training instances. We develop a novel model based on the long short-term memory network. Experiments on the Penn Treebank and Genia verify the effectiveness of the proposed model.
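The contrast between binary classification and ranking can be illustrated with a small sketch. The abstract does not state the training objective, so the margin-based pairwise hinge loss below is an assumption chosen for illustration, not the authors' formulation; the function names, the margin value, and the candidate scores are all hypothetical. The idea shown is that every candidate span for a coordinator is compared against the annotated one, instead of being labeled correct or incorrect in isolation.

```python
from typing import List


def pairwise_hinge_loss(scores: List[float], gold_index: int,
                        margin: float = 1.0) -> float:
    """Sum of hinge penalties over all non-gold candidates.

    A candidate contributes to the loss only when its score is not at
    least `margin` below the score of the gold candidate.
    (Assumed objective; the abstract does not specify one.)
    """
    gold = scores[gold_index]
    return sum(
        max(0.0, margin - gold + s)
        for i, s in enumerate(scores)
        if i != gold_index
    )


def best_candidate(scores: List[float]) -> int:
    """Index of the highest-scoring candidate, i.e. the predicted span."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical model scores for four candidate spans of one coordinator;
# candidate 2 is taken to be the annotated (gold) span.
scores = [0.2, 1.5, 2.0, -0.3]
loss = pairwise_hinge_loss(scores, gold_index=2)
predicted = best_candidate(scores)
```

Under this assumed objective, only candidate 1 is penalized, because it is the only one scoring within the margin of the gold span. Each coordinator contributes one ranked list rather than one positive and many negatives, which is one way a ranking formulation can sidestep the class imbalance described above.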


Notes

1. Other neural network models can also be employed to learn representations. Here we choose this one for its effectiveness and simplicity.
2. “&”, which is usually regarded as a special form of “and”, is excluded because it only appears in proper nouns and forms simple coordinations that are easy to identify.
3. http://chainer.org/.
4. In Genia, the proportion is even higher.

Author information

Authors and Affiliations

NTT Communication Science Laboratories, Kyoto, Japan
Xun Wang, Katsuhito Sudoh & Masaaki Nagata
Nara Institute of Science and Technology, Nara, Japan
Rumeng Li & Hiroyuki Shindo

Corresponding author

Correspondence to Xun Wang.

Editor information

Editors and Affiliations

CIC, Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Wang, X., Li, R., Shindo, H., Sudoh, K., Nagata, M. (2018). Learning to Rank for Coordination Detection. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_12

DOI: https://doi.org/10.1007/978-3-319-77113-7_12
Published: 10 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77112-0
Online ISBN: 978-3-319-77113-7
eBook Packages: Computer Science, Computer Science (R0)
