
Chinese-English OOV Term Translation with Web Mining, Multiple Feature Fusion and Supervised Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8801)

Abstract

This paper focuses on Web-based Chinese-English Out-of-Vocabulary (OOV) term translation, with emphasis on translation selection based on multiple feature fusion and on candidate ranking with a Ranking Support Vector Machine (Ranking SVM). Experiments on different data sources, using the SIGHAN2005 corpus for the Chinese Named Entity Recognition (NER) task together with a set of selected new terms, show consistent results. When the model is combined with Chinese-English Cross-Language Information Retrieval (CLIR) on TREC data sets, clear performance improvements are obtained for both query translation and CLIR.
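As an illustration of the ranking step named in the abstract, the sketch below shows how a linear Ranking SVM can learn fusion weights over several candidate features from pairwise preferences. It is a minimal sketch under stated assumptions, not the authors' implementation: the three features (co-occurrence, proximity, transliteration similarity), the toy values, and the function names are all hypothetical, and plain subgradient descent on the pairwise hinge loss stands in for a dedicated Ranking SVM solver.

```python
import random

def pairwise_differences(groups):
    """Build difference vectors x_i - x_j for every pair within one source
    term where candidate i is labelled more relevant than candidate j."""
    pairs = []
    for candidates in groups:
        for xi, yi in candidates:
            for xj, yj in candidates:
                if yi > yj:
                    pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs

def train_ranking_svm(groups, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Linear Ranking SVM trained by subgradient descent on the pairwise
    hinge loss max(0, 1 - w.(x_i - x_j)) plus an L2 penalty."""
    rng = random.Random(seed)
    pairs = pairwise_differences(groups)
    dim = len(pairs[0])
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            violated = margin < 1.0
            for k in range(dim):
                grad = reg * w[k] - (d[k] if violated else 0.0)
                w[k] -= lr * grad
    return w

def rank_candidates(w, candidates):
    """Sort (name, feature_vector) candidates by descending fused score."""
    def score(x):
        return sum(wk * xk for wk, xk in zip(w, x))
    return sorted(candidates, key=lambda c: score(c[1]), reverse=True)

# Hypothetical toy data: one group per OOV term; each candidate is
# ([co-occurrence, proximity, transliteration similarity], relevance).
train_groups = [
    [([0.9, 0.8, 0.7], 1), ([0.4, 0.3, 0.2], 0), ([0.2, 0.5, 0.1], 0)],
    [([0.7, 0.9, 0.8], 1), ([0.6, 0.2, 0.3], 0), ([0.1, 0.1, 0.4], 0)],
]

w = train_ranking_svm(train_groups)
ranked = rank_candidates(w, [
    ("candidate_b", [0.3, 0.2, 0.1]),
    ("candidate_a", [0.8, 0.7, 0.9]),
])
print([name for name, _ in ranked])
```

The learned weight vector plays the role of the feature-fusion weights: each candidate translation receives a single score, and candidates are compared only within the same source term, which is what distinguishes ranking from plain classification.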

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhao, Y., Zhu, Q., Jin, C., Zhang, Y., Huang, X., Zhang, T. (2014). Chinese-English OOV Term Translation with Web Mining, Multiple Feature Fusion and Supervised Learning. In: Sun, M., Liu, Y., Zhao, J. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD CCL 2014. Lecture Notes in Computer Science, vol 8801. Springer, Cham. https://doi.org/10.1007/978-3-319-12277-9_21

  • DOI: https://doi.org/10.1007/978-3-319-12277-9_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12276-2

  • Online ISBN: 978-3-319-12277-9

  • eBook Packages: Computer Science, Computer Science (R0)
