Integer Linear Programming for Coreference Resolution

  • Chapter
Anaphora Resolution

Abstract

In this chapter, we introduce Integer Linear Programming (ILP) and review some of its best-performing applications to coreference resolution in the literature. We develop some intuitions for how to pose ILPs based on learned models and for how expert knowledge can be encoded as constraints that the learned models must then respect. We describe some of the difficulties encountered during both the development and the deployment of an ILP, as well as how to deal with them. Finally, we see how ILP can create an environment in which independently learned models share knowledge for their mutual benefit.

Most of the top results on coreference resolution over the last few years were achieved using an ILP formulation, and we provide a snapshot of these results. Both conceptually and from an engineering perspective, the ILP formulation is very simple, and it gives system designers a lot of flexibility in incorporating knowledge. Indeed, this is where we believe future research should focus.
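As a rough illustration of what "posing an ILP based on learned models" can look like, the sketch below uses one common pairwise formulation: a binary variable per mention pair, an objective that sums pairwise scores over the selected links, and transitivity constraints that force the links to form consistent clusters. This is not taken from the chapter itself. The function names, the score values, and the choice to solve by exhaustive enumeration rather than with an ILP solver are all assumptions made to keep the example self-contained; enumeration is only feasible for a handful of mentions.

```python
from itertools import combinations, product


def is_transitive(num_mentions, y):
    """Check the linear transitivity constraints y_ab + y_bc - y_ac <= 1."""
    for i, j, k in combinations(range(num_mentions), 3):
        a, b, c = y[(i, j)], y[(j, k)], y[(i, k)]
        # Three constraints per triple: any two links imply the third.
        if a + b - c > 1 or a + c - b > 1 or b + c - a > 1:
            return False
    return True


def solve_coref_ilp(num_mentions, scores):
    """Solve a tiny pairwise coreference ILP by brute force.

    scores[(i, j)] with i < j is a hypothetical learned pairwise score:
    positive values favour linking mentions i and j, negative values
    favour keeping them apart.
    """
    pairs = list(combinations(range(num_mentions), 2))
    best_value, best_links = float("-inf"), None
    # Enumerate every 0/1 assignment to the pair variables (2^|pairs| cases).
    for assignment in product((0, 1), repeat=len(pairs)):
        y = dict(zip(pairs, assignment))
        if not is_transitive(num_mentions, y):
            continue  # infeasible: violates an expert-knowledge constraint
        value = sum(scores[p] * y[p] for p in pairs)
        if value > best_value:
            best_value, best_links = value, y
    return best_value, best_links


# Hypothetical scores: the pairwise model likes (0,1) and (1,2) but
# strongly dislikes (0,2), so independent decisions would be inconsistent.
example_scores = {(0, 1): 2.0, (1, 2): 1.5, (0, 2): -4.0}
value, links = solve_coref_ilp(3, example_scores)
print(value, links)
```

In this toy instance, thresholding each pair independently would link (0, 1) and (1, 2) while separating (0, 2), which is not a valid clustering. The constrained optimum keeps only the strongest link. A practical system would hand the same objective and constraints to an ILP solver instead of enumerating assignments.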

Notes

  1. On the other hand, there may be good linguistic motivation to penalize clusters in exactly this way, since most documents have many more small clusters than large ones.

Author information

Corresponding author

Correspondence to Nick Rizzolo.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rizzolo, N., Roth, D. (2016). Integer Linear Programming for Coreference Resolution. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_11

  • DOI: https://doi.org/10.1007/978-3-662-47909-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-47908-7

  • Online ISBN: 978-3-662-47909-4

  • eBook Packages: Computer Science (R0)
