Applications of Reinforcement Learning to Structured Prediction

Maes, Francis; Denoyer, Ludovic; Gallinari, Patrick

doi:10.1007/978-3-540-89722-4_16

Francis Maes³,
Ludovic Denoyer³ &
Patrick Gallinari³

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5323))

Included in the following conference series:

European Workshop on Reinforcement Learning

1098 Accesses

Abstract

Supervised learning is about learning functions given a set of input and corresponding output examples. A recent trend in this field is to consider structured outputs such as sequences, trees or graphs. When predicting such structured data, learning models have to select solutions within very large discrete spaces. The combinatorial nature of this problem has recently led to learning models integrating a search component.

In this paper, we show that Structured Prediction (SP) can be seen as a sequential decision problem. We introduce SP-MDP: a Markov Decision Process based formulation of Structured Prediction. Learning the optimal policy in SP-MDP is shown to be equivalent as solving the SP problem. This allows us to apply classical Reinforcement Learning (RL) algorithms to SP. We present experiments on two tasks. The first, sequence labeling, has been extensively studied and allows us to compare the RL approach with traditional SP methods. The second, tree transformation, is a challenging SP task with numerous large-scale real-world applications. We show successful results with general RL algorithms on this task on which traditional SP models fail.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Collins, M.: Discriminative training methods for hidden markov models: Theory and experiments with perceptron algorithms. In: EMNLP (2002)
Google Scholar
Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: ICML 2001 (2001)
Google Scholar
Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.: Support vector machine learning for interdependent and structured output spaces. In: ICML 2004 (2004)
Google Scholar
Taskar, B., Guestrin, C., Koller, D.: Max-margin markov networks. In: NIPS (2003)
Google Scholar
Collins, M., Roark, B.: Incremental parsing with the perceptron algorithm. In: ACL 2004, Barcelona, Spain, pp. 111–118 (July, 2004)
Google Scholar
Daumé III, H., Marcu, D.: Learning as search optimization: Approximate large margin methods for structured prediction. In: ICML, Bonn, Germany (2005)
Google Scholar
Daumé III, H., Langford, J., Marcu, D.: Search-based structured prediction (2006)
Google Scholar
Sutton, R., Barto, A.: Reinforcement learning: an introduction. MIT Press, Cambridge (1998)
Google Scholar
Berger, A., Della Pietra, S., Della Pietra, V.: A maximum entropy approach to natural language processing. In: Computational Linguistics (1996)
Google Scholar
Maes, F., Denoyer, L., Gallinari, P.: Sequence labelling with reinforcement learning and ranking algorithms. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds.) ECML 2007. LNCS, vol. 4701, pp. 648–657. Springer, Heidelberg (2007)
Chapter Google Scholar
Ramshaw, L., Marcus, M.: Text chunking using transformation-based learning. In: Yarovsky, D., Church, K. (eds.) Proceedings of the Third Workshop on Very Large Corpora, Somerset, New Jersey, ACL, pp. 82–94 (1995)
Google Scholar
Kassel, R.H.: A comparison of approaches to on-line handwritten character recognition. Ph.D thesis, Cambridge, MA, USA (1995)
Google Scholar
Maes, F., Denoyer, L., Gallinari, P.: Xml structure mapping application to the pascal/inex 2006 xml document mining track. In: INEX, Dagstuhl, Germany (2007)
Google Scholar
Maes, F., Denoyer, L., Gallinari, P.: Apprentissage de conversions de documents semi-structures a partir d’exemples. In: CORIA, Tregastel, France (2008)
Google Scholar
Denoyer, L., Gallinari, P.: Report on the xml mining track at inex 2005 and inex 2006. SIGIR Forum, 79–90 (2007)
Google Scholar
Denoyer, L., Gallinari, P.: The wikipedia xml corpus. SIGIR Forum (2006)
Google Scholar
Chidlovskii, B., Fuselier, J.: A probabilistic learning method for xml annotation of documents. In: IJCAI (2005)
Google Scholar

Download references

Author information

Authors and Affiliations

LIP6 - University Pierre et Marie Curie, 104 avenue du President Kennedy, Paris, France
Francis Maes, Ludovic Denoyer & Patrick Gallinari

Authors

Francis Maes
View author publications
You can also search for this author in PubMed Google Scholar
Ludovic Denoyer
View author publications
You can also search for this author in PubMed Google Scholar
Patrick Gallinari
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

INRIA Lille-Nord Europe, 59650, Villeneuve d’Ascq, France
Sertan Girgin
INRIA, LIFL, CNRS, Université de Lille, Villeneuve d’Ascq, France
Manuel Loth , Rémi Munos , Philippe Preux & Daniil Ryabko , , &

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Maes, F., Denoyer, L., Gallinari, P. (2008). Applications of Reinforcement Learning to Structured Prediction. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds) Recent Advances in Reinforcement Learning. EWRL 2008. Lecture Notes in Computer Science(), vol 5323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89722-4_16

Download citation

DOI: https://doi.org/10.1007/978-3-540-89722-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89721-7
Online ISBN: 978-3-540-89722-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics