Extracting Semantic Frames from Thai Medical-Symptom Phrases with Unknown Boundaries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5367)

Abstract

Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai free-text information entries. A supervised rule learning algorithm is employed for automatic construction of information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model for predicting rule application across a symptom-phrase boundary, the other uses extraction distances observed during rule learning for resolving conflicts arising from overlapping-frame extractions. In our experimental study, we focus our attention on two basic types of symptom phrasal descriptions: one is concerned with abnormal characteristics of some observable entities and the other with human-body locations at which symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
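
To make the sliding-window and conflict-resolution ideas in the abstract more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a rule is a function that inspects one fixed-size window of tokens and returns at most one extraction, and that each extraction carries a numeric distance where smaller is preferred. The names `Extraction`, `apply_rules`, `resolve_overlaps` and `toy_location_rule`, the English placeholder tokens, the distance values, and the greedy "keep the smaller distance" policy are all hypothetical; the abstract states only that extraction distances observed during rule learning are used to resolve overlapping-frame conflicts.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Extraction:
    start: int       # index of the first covered token (inclusive)
    end: int         # index after the last covered token (exclusive)
    frame: dict      # slot name -> filler text
    distance: float  # hypothetical score; smaller is preferred


# A rule sees one window plus its offset in the full token sequence.
Rule = Callable[[Sequence[str], int], Optional[Extraction]]

# Toy vocabulary, assumed purely for illustration.
BODY_PARTS = {"arm", "leg", "chest"}


def toy_location_rule(chunk: Sequence[str], offset: int) -> Optional[Extraction]:
    """Toy rule matching '<symptom> at <location>' inside one window."""
    if len(chunk) >= 3 and chunk[1] == "at":
        # Assumed scoring: known body parts get the better (smaller) distance.
        distance = 0.0 if chunk[2] in BODY_PARTS else 1.0
        frame = {"symptom": chunk[0], "location": chunk[2]}
        return Extraction(offset, offset + 3, frame, distance)
    return None


def apply_rules(tokens: Sequence[str], rules: List[Rule], window: int) -> List[Extraction]:
    """Slide a window over the tokens and apply every rule at every offset."""
    found: List[Extraction] = []
    for offset in range(max(len(tokens) - window + 1, 1)):
        chunk = tokens[offset:offset + window]
        for rule in rules:
            hit = rule(chunk, offset)
            if hit is not None:
                found.append(hit)
    return found


def resolve_overlaps(extractions: List[Extraction]) -> List[Extraction]:
    """Greedily keep the smallest-distance extractions that do not overlap."""
    kept: List[Extraction] = []
    for cand in sorted(extractions, key=lambda e: (e.distance, e.start)):
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda e: e.start)
```

With the tokens `["rash", "at", "arm", "at", "night"]` and a window of 3, the toy rule fires at offsets 0 and 2. The two extractions share the token `arm`, so `resolve_overlaps` keeps only the one with the smaller distance. The boundary classifier mentioned in the abstract is not modelled here.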

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Intarapaiboon, P., Nantajeewarawat, E., Theeramunkong, T. (2008). Extracting Semantic Frames from Thai Medical-Symptom Phrases with Unknown Boundaries. In: Domingue, J., Anutariya, C. (eds) The Semantic Web. ASWC 2008. Lecture Notes in Computer Science, vol 5367. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89704-0_27

  • DOI: https://doi.org/10.1007/978-3-540-89704-0_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89703-3

  • Online ISBN: 978-3-540-89704-0

  • eBook Packages: Computer Science, Computer Science (R0)
