Resolving Named Entity Unknown Word in Chinese-Vietnamese Machine Translation

  • Conference paper
Knowledge and Systems Engineering

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 245)

Abstract

The vocabulary of a natural language is an open set, so no lexicon can collect all of its words. Unknown words (UKWs) are therefore unavoidable in statistical machine translation (SMT), and named entities are the most common kind of UKW. In this paper, we present a new approach, based on the meaning relationship between Chinese and Vietnamese, for re-translating named-entity UKWs. Experimental results show that applying this approach to Chinese-Vietnamese SMT significantly improves translation performance.

Author information

Corresponding author

Correspondence to Phuoc Tran.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Tran, P., Dinh, D., Tran, L. (2014). Resolving Named Entity Unknown Word in Chinese-Vietnamese Machine Translation. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 245. Springer, Cham. https://doi.org/10.1007/978-3-319-02821-7_25

  • DOI: https://doi.org/10.1007/978-3-319-02821-7_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02820-0

  • Online ISBN: 978-3-319-02821-7

  • eBook Packages: Engineering, Engineering (R0)
