Automatic Extraction of Thai-English Term Translations and Synonyms from Medical Web using Iterative Candidate Generation with Association Measures

  • Conference paper
New Frontiers in Applied Data Mining (PAKDD 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5669)

Abstract

Electronic technical documents available on the Internet are a rich source for the automatic extraction of term translations and synonyms. This paper presents an association-based approach that extracts possible translations and synonyms by iterative candidate generation using a search engine. Plausible candidate pairs are selected by calculating their co-occurrence statistics. In our experiment on extracting Thai-English medical term pairs, four alternative association measures, namely confidence, support, lift and conviction, are investigated and their performance is compared by ten-fold cross-validation. The experimental results show that lift achieves the best performance: 73.1% F-measure with 67% precision and 84.2% recall on translation pair extraction, 68.7% F-measure with 71.5% precision and 67.7% recall on Thai synonym extraction, and 72.8% F-measure with 72.0% precision and 75.1% recall on English synonym extraction. The precision of our approach in Thai-English translation, Thai synonym and English synonym extraction is 4 times, 3.5 times and 5.5 times higher than the baseline precision, respectively.
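The four association measures named in the abstract have standard definitions in association-rule mining, and a minimal sketch of them may help make the comparison concrete. The abstract does not say how the paper derives its co-occurrence counts from search engine results, so the `Counts` fields, the function names and the example numbers below are all assumptions for illustration, not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical document counts for a candidate term pair (x, y)."""
    n_x: int      # documents containing term x
    n_y: int      # documents containing term y
    n_xy: int     # documents containing both x and y
    n_total: int  # size of the document collection (assumed known)


def support(c: Counts) -> float:
    # Fraction of all documents in which x and y co-occur.
    return c.n_xy / c.n_total


def confidence(c: Counts) -> float:
    # Estimated P(y | x): how often y appears when x appears.
    return c.n_xy / c.n_x


def lift(c: Counts) -> float:
    # Confidence relative to the base rate of y; 1.0 means independence.
    return confidence(c) / (c.n_y / c.n_total)


def conviction(c: Counts) -> float:
    # (1 - P(y)) / (1 - confidence); infinite when x always implies y.
    conf = confidence(c)
    if conf >= 1.0:
        return math.inf
    return (1.0 - c.n_y / c.n_total) / (1.0 - conf)


# Made-up counts, purely for illustration.
example = Counts(n_x=100, n_y=200, n_xy=80, n_total=1000)
scores = {
    "support": support(example),
    "confidence": confidence(example),
    "lift": lift(example),
    "conviction": conviction(example),
}
```

With counts like these, a candidate pair would be ranked by whichever measure is under evaluation; how the cut-off between accepted and rejected pairs is chosen is not stated in the abstract.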



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Viriyayudhakorn, K., Theeramunkong, T., Nattee, C., Supnithi, T., Okumura, M. (2010). Automatic Extraction of Thai-English Term Translations and Synonyms from Medical Web using Iterative Candidate Generation with Association Measures. In: Theeramunkong, T., et al. New Frontiers in Applied Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14640-4_11

  • DOI: https://doi.org/10.1007/978-3-642-14640-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14639-8

  • Online ISBN: 978-3-642-14640-4

  • eBook Packages: Computer Science, Computer Science (R0)
