Abstract
In this paper we propose feature selection and ensemble learning algorithms, based on modified differential evolution (MDE), for a biochemical entity recognizer. Identifying and classifying chemical entities is more complex and challenging than other related tasks. Among chemical entities, we focus on IUPAC and IUPAC-related entities. The algorithm performs feature selection within the framework of a robust machine learning algorithm, namely the Conditional Random Field (CRF). Features are identified and implemented mostly without using any domain-specific knowledge or resources. We modify traditional differential evolution (DE) to perform two tasks: determining a relevant set of features, and determining proper voting weights for constructing an ensemble. The feature selection technique produces a set of potential solutions in the final population. We train multiple CRF models using these feature combinations. To further improve performance, the outputs of these classifiers are combined using a classifier ensemble technique based on modified DE. Our experiments with the benchmark datasets yield recall, precision and F-measure values of 82.34%, 88.26% and 85.20%, respectively.
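The two tasks named in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the authors modify DE, so the code below uses the classic DE/rand/1/bin scheme of Storn and Price instead, not the paper's MDE. Everything in it is an assumption made for illustration: the function names, the 0.5 threshold that turns a real-valued vector into a feature mask, the toy fitness function (which stands in for training a CRF and measuring its F-measure), and the label names in the voting example.

```python
import random


def differential_evolution(fitness, dim, pop_size=20, generations=50,
                           f=0.5, cr=0.9, seed=0):
    """Classic DE/rand/1/bin, maximising `fitness` over [0, 1]^dim.

    This is the standard algorithm, not the modified variant of the paper.
    """
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(v) for v in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(1.0, max(0.0, v))  # clip back into [0, 1]
                else:
                    v = pop[i][j]
                trial.append(v)
            s = fitness(trial)
            if s >= scores[i]:  # greedy one-to-one replacement
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], pop


def to_mask(vector, threshold=0.5):
    """Assumed encoding: component >= threshold means 'feature selected'."""
    return [x >= threshold for x in vector]


# Hypothetical stand-in for "train a CRF on the selected features and
# return its F-measure": reward three 'useful' features, penalise the rest.
USEFUL = {0, 2, 5}


def toy_fitness(vector):
    mask = to_mask(vector)
    hits = sum(1 for j in USEFUL if mask[j])
    extras = sum(1 for j, on in enumerate(mask) if on and j not in USEFUL)
    return hits - 0.25 * extras


def weighted_vote(labels, weights):
    """Combine one label per classifier using per-classifier weights."""
    totals = {}
    for label, weight in zip(labels, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


best_vec, best_score, final_pop = differential_evolution(toy_fitness, dim=8)
best_mask = to_mask(best_vec)

# Hypothetical labels: two weak classifiers say B-IUPAC, one strong says O.
decision = weighted_vote(["B-IUPAC", "O", "B-IUPAC"], [0.2, 0.7, 0.4])
```

In this sketch, each vector in `final_pop` corresponds to one candidate feature subset, mirroring the abstract's statement that the final population supplies the feature combinations for the individual CRF models. The weights passed to `weighted_vote` could themselves be searched with the same `differential_evolution` routine by using ensemble accuracy on held-out data as the fitness.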
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sikdar, U.K., Ekbal, A., Saha, S. (2014). Modified Differential Evolution for Biochemical Name Recognizer. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_18
DOI: https://doi.org/10.1007/978-3-642-54906-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9
eBook Packages: Computer Science (R0)