Abstract
The handling of ties between equiprobable derivations during Viterbi training is often glossed over in research papers: whether ties are broken randomly as they occur, broken on an ad hoc basis decided by the algorithm or implementation, or resolved by enumerating all equiprobable derivations and distributing the counts uniformly among them is left to the reader's imagination. The first option hurts rarely occurring rules, which risk being randomly eliminated; the second suffers from algorithmic biases; and the last is correct but potentially very inefficient. We show that it is possible to Viterbi train correctly without enumerating all equiprobable best derivations. The method is analogous to expectation maximization, provided that the automatic differentiation view is taken rather than the reverse value/outside probability view, as the latter calculates the wrong quantity for reestimation under the Viterbi semiring. To make automatic differentiation work, we devise an unbiased subderivative for the \(\max\) function.
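The key idea of an unbiased subderivative for \(\max\) can be sketched as follows: when several inputs tie for the maximum, the backward pass splits the unit of count mass uniformly among all tied entries instead of routing it to one arbitrary winner. This is a minimal illustrative sketch, not the paper's implementation; the function name is hypothetical.

```python
def viterbi_max_with_subgradient(probs):
    """Return the max of `probs` together with a subgradient vector
    that distributes mass 1 uniformly over all tied maximal entries,
    rather than assigning it to a single arbitrarily chosen winner."""
    best = max(probs)
    # Indices of all derivations that achieve the maximum probability.
    tied = [i for i, p in enumerate(probs) if p == best]
    share = 1.0 / len(tied)
    grad = [share if i in tied else 0.0 for i in range(len(probs))]
    return best, grad

# Two derivations tie at 0.3: each receives half of the count mass.
value, grad = viterbi_max_with_subgradient([0.3, 0.1, 0.3])
```

Under this subderivative, rules appearing only in some of the tied best derivations still receive their fair share of fractional counts during reestimation, instead of being eliminated by an unlucky random tie-break.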
Acknowledgements
This material is based upon work supported in part by the Defense Advanced Research Projects Agency (DARPA) under LORELEI contract HR0011-15-C-0114, BOLT contracts HR0011-12-C-0014 and HR0011-12-C-0016, and GALE contracts HR0011-06-C-0022 and HR0011-06-C-0023; by the European Union under the Horizon 2020 grant agreement 645452 (QT21) and FP7 grant agreement 287658; and by the Hong Kong Research Grants Council (RGC) research grants GRF16210714, GRF16214315, GRF620811 and GRF621008. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA, the EU, or RGC. The authors would also like to thank the anonymous reviewers for valuable feedback.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Saers, M., Wu, D. (2018). Handling Ties Correctly and Efficiently in Viterbi Training Using the Viterbi Semiring. In: Klein, S., Martín-Vide, C., Shapira, D. (eds) Language and Automata Theory and Applications. LATA 2018. Lecture Notes in Computer Science, vol 10792. Springer, Cham. https://doi.org/10.1007/978-3-319-77313-1_22
Print ISBN: 978-3-319-77312-4
Online ISBN: 978-3-319-77313-1