Extreme Learning Machine for Huge Hypotheses Re-ranking in Statistical Machine Translation

Published in Cognitive Computation.

Abstract

In statistical machine translation (SMT), a potentially unbounded number of translation hypotheses can be decoded from a source sentence, and re-ranking is applied to select the best translation among them, making it an essential component of effective and efficient SMT. A novel re-ranking method called Scaled Sorted Classification Re-ranking (SSCR), based on extreme learning machine (ELM) classification and minimum error rate training (MERT), is proposed. SSCR comprises four steps: (1) the input features are normalized to the range [0, 1]; (2) an ELM classification model is constructed for hypothesis ranking; (3) each translation hypothesis is ranked with the ELM model; and (4) the highest-ranked subset of hypotheses is selected, from which the hypothesis with the best MERT-predicted (system) score is returned as the final translation. Compared with the baseline score (lower bound), SSCR with ELM classification raises translation quality by up to 6.7% on the IWSLT 2014 Chinese-to-English corpus. Compared with state-of-the-art rank boosting, SSCR achieves a relative BLEU improvement of 7.8% on the larger WMT 2015 English-to-French corpus. Moreover, training the proposed method is about 160 times faster than traditional regression-based re-ranking.
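The four steps in the abstract can be sketched in code. The following is a minimal, hypothetical illustration: the toy features, labels, hidden-layer size, subset size, and "MERT weights" are all stand-ins invented for the example, and the ELM here is the basic form (random hidden layer, least-squares output weights), not the authors' exact system.

```python
import numpy as np

rng = np.random.default_rng(0)

def min_max_normalize(X):
    """Step 1: scale each feature column to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span

class ELMClassifier:
    """Step 2: basic ELM -- random hidden layer, analytic output weights."""
    def __init__(self, n_hidden=50):
        self.n_hidden = n_hidden

    def fit(self, X, y):
        d = X.shape[1]
        self.W = rng.standard_normal((d, self.n_hidden))
        self.b = rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)      # hidden-layer outputs
        self.beta = np.linalg.pinv(H) @ y     # least-squares output weights
        return self

    def score(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta

# Toy data: 20 hypotheses with 4 features each; label 1 marks a "good" one.
X = rng.random((20, 4))
y = (X.sum(axis=1) > 2.0).astype(float)
Xn = min_max_normalize(X)

elm = ELMClassifier().fit(Xn, y)
scores = elm.score(Xn)                        # Step 3: rank every hypothesis

# Step 4: keep the highest-ranked subset, then return the hypothesis with
# the best system (MERT-style) score inside it.  The linear system score
# and its weights below are hypothetical placeholders.
top = np.argsort(scores)[::-1][:5]
mert_weights = np.array([0.4, 0.3, 0.2, 0.1])
system_scores = X[top] @ mert_weights
best = top[np.argmax(system_scores)]
print("selected hypothesis index:", int(best))
```

The design point this sketch mirrors is why training is fast: the ELM output weights are obtained in closed form via a pseudoinverse rather than by iterative gradient descent, which is the source of the large training-time advantage the abstract reports.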



Author information

Corresponding author

Correspondence to Chi Man Vong.

Ethics declarations

Funding

The work is financially supported by funding from the University of Macau (project numbers MYRG2014-00083-FST and MYRG2016-00134) and from FDCT Macau (project number 050/2015/A).

Conflict of Interests

Yan Liu, Chi Man Vong, and Pak Kin Wong declare that they have no conflict of interest.

Ethical Approval

Informed consent was not required, as no humans or animals were involved.

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.



Cite this article

Liu, Y., Vong, C.M. & Wong, P.K. Extreme Learning Machine for Huge Hypotheses Re-ranking in Statistical Machine Translation. Cogn Comput 9, 285–294 (2017). https://doi.org/10.1007/s12559-017-9452-x
