Learning Conditional Random Fields from Unaligned Data for Natural Language Understanding

Zhou, Deyu; He, Yulan

doi:10.1007/978-3-642-20161-5_28

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6611)

Abstract

In this paper, we propose a learning approach for training conditional random fields from unaligned data for natural language understanding, where the input to model learning is a set of sentences paired with predicate formulae (or abstract semantic annotations), without word-level annotations. The learning approach resembles the expectation maximization algorithm. It has two advantages: first, only abstract annotations are needed rather than full word-level annotations; second, the proposed learning framework can be easily extended to train other discriminative models, such as support vector machines, from abstract annotations. The proposed approach has been tested on the DARPA Communicator data. Experimental results show that it outperforms the hidden vector state (HVS) model, a modified hidden Markov model that is also trained on abstract annotations. The proposed method has also been compared with two other approaches: the hybrid framework (HF), which combines the HVS model with the support vector hidden Markov model, and discriminative training of the HVS model (DT). The proposed approach gives relative error reductions of 18.7% and 8.3% in F-measure when compared with HF and DT, respectively.
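
The abstract reports its gains as relative error reduction in F-measure but does not spell out the formula. A minimal sketch of one common reading follows, assuming that "error" means 1 - F and that the reduction is measured relative to the baseline's error. The function name and the F-measure values are hypothetical illustrations, not figures from the paper.

```python
def relative_error_reduction(f_baseline: float, f_proposed: float) -> float:
    """Relative reduction in error, taking error as 1 - F-measure.

    Assumption: this is one common definition; the abstract does not
    state the exact formula the authors used.
    """
    if not 0.0 <= f_baseline < 1.0:
        raise ValueError("baseline F-measure must be in [0, 1)")
    if not 0.0 <= f_proposed <= 1.0:
        raise ValueError("proposed F-measure must be in [0, 1]")
    err_baseline = 1.0 - f_baseline
    err_proposed = 1.0 - f_proposed
    return (err_baseline - err_proposed) / err_baseline


# Hypothetical F-measures chosen for illustration only.
reduction = relative_error_reduction(0.90, 0.92)
print(f"{reduction:.1%}")
```

Under this reading, a small absolute gain in F-measure can correspond to a large relative error reduction when the baseline is already strong, which is why the two reported percentages cannot be converted to absolute F-measure differences without the baseline scores.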

Author information

Authors and Affiliations

School of Computer Science and Engineering, Southeast University, China
Deyu Zhou
Knowledge Media Institute, Open University, Milton Keynes, MK7 6AA, UK
Yulan He

Editor information

Editors and Affiliations

Information School, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK
Paul Clough
CLARITY: Centre for Sensor Web Technologies, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland
Colum Foley, Cathal Gurrin & Hyowon Lee
Centre for Next Generation Localisation, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland
Gareth J. F. Jones
TNO Human Factors, Brassersplein 2, 2612 CT, Delft, The Netherlands
Wessel Kraaij
Yahoo! Research, 177 Diagonal, 08018, Barcelona, Spain
Vanessa Murdock

About this paper

Cite this paper

Zhou, D., He, Y. (2011). Learning Conditional Random Fields from Unaligned Data for Natural Language Understanding. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_28

DOI: https://doi.org/10.1007/978-3-642-20161-5_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5
eBook Packages: Computer Science; Computer Science (R0)
