Abstract
In this chapter, we introduce Integer Linear Programming (ILP) and review some of its best-performing applications to coreference resolution in the literature. We develop intuitions for how to pose ILPs based on learned models and for how expert knowledge can be encoded as constraints that the learned models must then respect. We describe some of the difficulties encountered during both the development and the deployment of an ILP, as well as how to deal with them. Finally, we see how ILP can create an environment in which independently learned models share knowledge for their mutual benefit.
Most of the top results on coreference resolution over the last few years were achieved using an ILP formulation, and we provide a snapshot of these results. Both conceptually and from an engineering perspective, the ILP formulation is very simple, and it gives system designers considerable flexibility in incorporating knowledge. Indeed, this is where we believe future research should focus.
Notes
1. On the other hand, there may be good linguistic motivation to penalize clusters in exactly this way, since most documents have many more small clusters than large ones.
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Rizzolo, N., Roth, D. (2016). Integer Linear Programming for Coreference Resolution. In: Poesio, M., Stuckardt, R., Versley, Y. (eds) Anaphora Resolution. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47909-4_11
DOI: https://doi.org/10.1007/978-3-662-47909-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-47908-7
Online ISBN: 978-3-662-47909-4
eBook Packages: Computer Science, Computer Science (R0)