Abstract
Named Entity Recognition and Classification (NERC) is one of the most fundamental tasks in biomedical information extraction. Gene mention detection is the extraction of named entities (NEs) that denote genes and gene products in text. Several approaches have emerged, but most state-of-the-art work suggests that an individual NERC system cannot cover all entity representations with an arbitrary set of features and therefore cannot achieve the best performance. In this paper, we propose a voted approach for gene mention detection. We use a support vector machine (SVM) as the underlying classifier and build different models of it, depending on the various representations of the feature set. The most important criterion for these features is that they are identified and selected largely without using any domain knowledge. Evaluation on the benchmark dataset of GENTAG yields state-of-the-art performance, with overall recall, precision, and F-measure values of 94.95%, 94.32%, and 94.63%, respectively.
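The voting idea can be illustrated with a minimal sketch. The abstract does not specify the voting scheme, the tag set, or how ties are resolved, so everything below is an assumption: plain majority voting over per-token labels, BIO-style tags (B-GENE, I-GENE, O), ties going to the earliest model in the list, and hypothetical hard-coded model outputs in place of real SVM predictions. The helper `f_measure` is the standard harmonic mean of precision and recall, and it reproduces the reported F-measure from the reported precision and recall.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token label sequences from several models by majority vote.

    predictions: list of label sequences, one per model, all the same length.
    Ties go to the label proposed by the earliest model in the list
    (an arbitrary choice made for this sketch, not taken from the paper).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same tokens")
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # The first label, in model order, that reaches the top count wins.
        voted.append(next(label for label in labels if counts[label] == best))
    return voted


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical outputs of three SVM models trained on different feature sets.
model_a = ["B-GENE", "I-GENE", "O", "O"]
model_b = ["B-GENE", "O", "O", "O"]
model_c = ["B-GENE", "I-GENE", "O", "B-GENE"]

combined = majority_vote([model_a, model_b, model_c])
print(combined)  # ['B-GENE', 'I-GENE', 'O', 'O']

# Precision and recall as reported in the abstract, in percent.
print(round(f_measure(94.32, 94.95), 2))  # 94.63
```

A weighted vote, for example one that weights each model by its per-class F-measure on held-out data, would be a natural variant; the sketch uses unweighted voting only because it is the simplest instance.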
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saha, S., Ekbal, A., Saha, S. (2011). A Supervised Approach for Gene Mention Detection. In: Panigrahi, B.K., Suganthan, P.N., Das, S., Satapathy, S.C. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2011. Lecture Notes in Computer Science, vol 7076. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27172-4_52
DOI: https://doi.org/10.1007/978-3-642-27172-4_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27171-7
Online ISBN: 978-3-642-27172-4
eBook Packages: Computer Science (R0)