Abstract
One of the challenges in natural language processing (NLP) is the semantic processing of documents. Such processing is usually tailored to a specific domain, and bioinformatics is a promising area of application. This work focuses on the rational drug design (RDD) process and aims to help identify new target proteins (receptors) and drug candidate compounds (ligands) in scientific documents. Our approach treats these structures as named entities (NE) in the text and recognizes them by analyzing their context. Using an annotated corpus from the RDD domain, we present models generated by association rule mining that indicate which context terms signal the presence of a receptor or ligand in a sentence.
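The idea of mining rules of the form "context terms imply entity class" can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the thresholds, and the names `SENTENCES` and `mine_rules` are all assumptions made for illustration, and candidate term sets are enumerated by brute force rather than with a dedicated frequent-itemset algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus (not from the paper): each entry pairs the context
# terms of a sentence with the entity class annotated in that sentence.
SENTENCES = [
    ({"inhibitor", "binds", "enzyme"}, "ligand"),
    ({"inhibitor", "compound", "docking"}, "ligand"),
    ({"enzyme", "active", "site"}, "receptor"),
    ({"enzyme", "structure", "site"}, "receptor"),
    ({"compound", "binds", "site"}, "ligand"),
]


def mine_rules(sentences, min_support=0.4, min_confidence=0.8, max_terms=2):
    """Return rules (terms, label, support, confidence) meeting both thresholds.

    support    = fraction of sentences containing the terms AND the label
    confidence = fraction of sentences containing the terms that carry the label
    """
    n = len(sentences)
    antecedent_counts = Counter()  # how often each term set occurs
    rule_counts = Counter()        # how often each term set occurs with a label

    for terms, label in sentences:
        for size in range(1, max_terms + 1):
            # Sorting gives each term set one canonical tuple form.
            for combo in combinations(sorted(terms), size):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))

    # Strongest rules first: by confidence, then support, then term order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0]))
    return rules


if __name__ == "__main__":
    for terms, label, support, confidence in mine_rules(SENTENCES):
        print(f"{set(terms)} -> {label} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

On this toy data, a term such as "site" does not yield a rule for "receptor" because it also appears in a ligand sentence, which pushes its confidence below the threshold. Raising or lowering `min_support` and `min_confidence` trades the number of rules against their reliability.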
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Winck, A.T., Machado, K.S., Ruiz, D.D., Strube de Lima, V.L. (2010). Association Rules to Identify Receptor and Ligand Structures through Named Entities Recognition. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science(), vol 6098. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13033-5_13
DOI: https://doi.org/10.1007/978-3-642-13033-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13032-8
Online ISBN: 978-3-642-13033-5
eBook Packages: Computer Science (R0)