Abstract
Off-the-shelf coreference tools can be a useful ingredient for downstream applications in machine translation, information extraction, or sentiment recognition. In this chapter, we present the properties that matter most when integrating a coreference system into a larger context. We then describe the BART system and the dCoref system that is part of Stanford’s CoreNLP suite, as well as IMSCoref and HOTCoref as examples of state-of-the-art systems that are based purely on machine learning. We close the chapter by outlining a checklist-based approach to choosing, integrating, and adapting a coreference system for a putative new application context.
Notes
1. In the context of this discussion, "standardized" means being described well enough that it is possible to write interoperable programs that handle edge cases in the same way, and that it is possible to get stakeholders to agree on one particular interpretation of that description. Formal endorsement by a government or standards body is not relevant to the descriptions in this chapter, although such endorsement does make it more feasible for institutional users to buy or commission such components.
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Versley, Y., Björkelund, A. (2016). Off-the-Shelf Tools. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_8
DOI: https://doi.org/10.1007/978-3-662-47909-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-47908-7
Online ISBN: 978-3-662-47909-4
eBook Packages: Computer Science (R0)