Optimized Uyghur Segmentation for Statistical Machine Translation

Mi, Chenggang; Yang, Yating; Dong, Rui; Zhou, Xi; Wang, Lei; Li, Xiao; Jiang, Tonghai; Osman, Turghun

doi:10.1007/978-3-319-19581-0_36

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9103)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

In this paper, we propose an optimized method for segmenting Uyghur words. We treat the optimization as a classification problem, with features extracted from a Uyghur-Chinese bilingual corpus. Experimental results show that our method significantly improves the performance of Uyghur-Chinese machine translation.

Acknowledgements

This work is supported by the National High Technology Research and Development Program of China (No. 2013AA01A607), the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA06030400), the West Light Foundation of the Chinese Academy of Sciences (No. XBBS201216), and the Key Project of the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KGZD-EW-501).

Author information

Authors and Affiliations

Xinjiang Technical Institute of Physics and Chemistry of Chinese Academy of Sciences, Urumqi, Xinjiang, 830011, China
Chenggang Mi, Yating Yang, Rui Dong, Xi Zhou, Lei Wang, Xiao Li, Tonghai Jiang & Turghun Osman
University of Chinese Academy of Sciences, Beijing, 100049, China
Chenggang Mi & Rui Dong

Corresponding author

Correspondence to Chenggang Mi.

Editor information

Editors and Affiliations

Technische Universität Darmstadt, Darmstadt, Germany
Chris Biemann
Universität Passau, Passau, Germany
Siegfried Handschuh
Universität Passau, Passau, Germany
André Freitas
University of Salford, Salford, United Kingdom
Farid Meziane
Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais

Cite this paper

Mi, C. et al. (2015). Optimized Uyghur Segmentation for Statistical Machine Translation. In: Biemann, C., Handschuh, S., Freitas, A., Meziane, F., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2015. Lecture Notes in Computer Science, vol 9103. Springer, Cham. https://doi.org/10.1007/978-3-319-19581-0_36

DOI: https://doi.org/10.1007/978-3-319-19581-0_36
Published: 04 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19580-3
Online ISBN: 978-3-319-19581-0
eBook Packages: Computer Science, Computer Science (R0)
