Abstract
We present a framework for extracting multi-slot frames that describe chemical reactions from Thai free text in which the boundaries of target phrases are unknown. The framework combines sliding-window rule application with extraction filtering. A supervised rule learning algorithm automatically constructs pattern-based extraction rules from hand-tagged training phrases. A filtering method removes incorrect extraction results using features observed in the text portions that appear between adjacent slot fillers in the source documents. Extracted reaction frames are represented as concept expressions in description logics and serve as metadata for document indexing. A document knowledge base supporting semantics-based information retrieval is then constructed by integrating this document metadata with domain-specific ontologies.
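The two stages named in the abstract, sliding-window rule application followed by filtering on the text between adjacent slot fillers, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: English tokens stand in for Thai text, a simple lexicon lookup (`apply_rule`) stands in for the learned pattern-based rules, and the two filter features (`max_gap`, `stop_tokens`) are hypothetical.

```python
from typing import Dict, List, Optional, Set

Frame = Dict[str, int]  # slot name -> token index of its filler


def apply_rule(window: List[str], slots: List[str], lexicon: Set[str]) -> Optional[Frame]:
    """Toy rule (assumption): the first len(slots) lexicon tokens fill the slots in order."""
    hits = [i for i, tok in enumerate(window) if tok in lexicon]
    if len(hits) < len(slots):
        return None
    return dict(zip(slots, hits))


def sliding_window_extract(tokens: List[str], slots: List[str],
                           lexicon: Set[str], window_size: int) -> List[Frame]:
    """Apply the rule to every window, since target-phrase boundaries are unknown."""
    frames, seen = [], set()
    last_start = max(0, len(tokens) - window_size)
    for start in range(last_start + 1):
        local = apply_rule(tokens[start:start + window_size], slots, lexicon)
        if local is None:
            continue
        frame = {slot: start + i for slot, i in local.items()}
        key = tuple(frame[s] for s in slots)
        if key not in seen:  # overlapping windows often yield the same frame
            seen.add(key)
            frames.append(frame)
    return frames


def passes_filter(frame: Frame, tokens: List[str], slots: List[str],
                  stop_tokens: Set[str], max_gap: int) -> bool:
    """Hypothetical filter: inspect the text between adjacent slot fillers."""
    positions = [frame[s] for s in slots]
    for left, right in zip(positions, positions[1:]):
        gap = tokens[left + 1:right]
        if len(gap) > max_gap or any(tok in stop_tokens for tok in gap):
            return False
    return True


def extract_reactions(text: str) -> List[Dict[str, str]]:
    """Run both stages on whitespace-tokenized text and return slot fillers."""
    tokens = text.split()
    slots = ["reactant", "reagent", "product"]
    lexicon = {"sodium", "chlorine", "salt", "water", "sugar"}  # illustrative only
    candidates = sliding_window_extract(tokens, slots, lexicon, window_size=8)
    kept = [f for f in candidates
            if passes_filter(f, tokens, slots, stop_tokens={"."}, max_gap=3)]
    return [{slot: tokens[i] for slot, i in f.items()} for f in kept]


if __name__ == "__main__":
    print(extract_reactions(
        "sodium reacts with chlorine to give salt . water dissolves sugar"))
```

In this toy input the sliding window proposes two candidate frames. The second one spans the sentence boundary, so the filter rejects it because a stop token appears between two of its fillers.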
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Intarapaiboon, P., Nantajeewarawat, E., Theeramunkong, T. (2010). Extracting Chemical Reactions from Thai Text for Semantics-Based Information Retrieval. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds) Intelligent Information and Database Systems. ACIIDS 2010. Lecture Notes in Computer Science, vol 5990. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12145-6_28
DOI: https://doi.org/10.1007/978-3-642-12145-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12144-9
Online ISBN: 978-3-642-12145-6
eBook Packages: Computer Science, Computer Science (R0)