Learning to Rank for Coordination Detection

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10761)

Abstract

Coordinations are phrases of the form “A and/but/or/... B”. Detecting them remains a major problem because of the complexity of their components. Existing work typically classifies training instances into two categories: correct and incorrect. This often causes data imbalance, which inevitably degrades the performance of the models used. To remedy this, we formulate the detection of coordinations as a ranking problem, which fully exploits the differences between training instances. We develop a novel model based on the long short-term memory (LSTM) network. Experiments on the Penn Treebank and Genia verify the effectiveness of the proposed model.
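The contrast between binary classification and ranking can be made concrete with a small sketch. The abstract does not give the paper's loss function, so the pairwise hinge loss below is an assumption chosen for illustration, and the function names, the margin value, and the candidate scores are all hypothetical. The idea shown is only that candidates for one coordination are compared against each other rather than labelled independently, so a single correct candidate among many incorrect ones does not produce a skewed label distribution.

```python
from typing import List


def pairwise_hinge_loss(gold_score: float,
                        other_scores: List[float],
                        margin: float = 1.0) -> float:
    """Sum of hinge penalties over incorrect candidates.

    A candidate contributes a penalty when it is not scored at least
    `margin` below the gold (correct) candidate. This is an illustrative
    ranking loss, not necessarily the one used in the paper.
    """
    return sum(max(0.0, margin - (gold_score - s)) for s in other_scores)


def best_candidate(scores: List[float]) -> int:
    """Index of the highest-scoring candidate (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical model scores for four candidate spans of one coordination.
# Index 0 is assumed to be the correct span.
scores = [2.0, 1.5, -0.5, 0.0]

loss = pairwise_hinge_loss(scores[0], scores[1:])
predicted = best_candidate(scores)
```

Here only the second candidate falls inside the margin of the gold score, so it is the only one that contributes to the loss. At prediction time the highest-scoring candidate is selected.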

Notes

  1. Other neural network models can also be employed to learn representations. Here we choose this one for its effectiveness and simplicity.

  2. “&”, which is usually regarded as a special form of “and”, is excluded because it appears only in proper nouns and forms simple coordinations that are easy to identify.

  3. http://chainer.org/.

  4. In Genia, the proportion is even higher.

Author information

Corresponding author

Correspondence to Xun Wang.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, X., Li, R., Shindo, H., Sudoh, K., Nagata, M. (2018). Learning to Rank for Coordination Detection. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_12

  • DOI: https://doi.org/10.1007/978-3-319-77113-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77112-0

  • Online ISBN: 978-3-319-77113-7

  • eBook Packages: Computer Science, Computer Science (R0)
