Abstract
Microarray cancer gene expression datasets are high-dimensional, which makes efficient computational analysis difficult. In this study, we address the problem of simultaneous gene selection and robust classification of cancerous samples by presenting two hybrid algorithms, Discrete Firefly based Support Vector Machines (DFA-SVM) and DFA-Random Forests (DFA-RF), both of which use weighted gene ranking as a heuristic. The performance of the algorithms is tested on two cancer gene expression datasets retrieved from the Kent Ridge Biomedical Dataset Repository. Our results show that both DFA-SVM and DFA-RF can help extract more informative genes, which in turn aids in building high-performance prediction models.
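The abstract does not give the algorithmic details, so the following is only a minimal sketch of what a discrete firefly wrapper for gene selection might look like, not the authors' implementation. Several pieces are assumptions made for illustration: a signal-to-noise style score stands in for the weighted gene ranking, a leave-one-out nearest-centroid classifier stands in for the SVM or Random Forest, the move rule (copy bits from a brighter firefly with probability beta, then flip bits with probability alpha) is one common binary variant, and the data are synthetic. All function and parameter names (`gene_weights`, `loo_accuracy`, `discrete_firefly_select`, `make_toy_data`, `gamma`, `alpha`) are hypothetical.

```python
import math
import random


def gene_weights(X, y):
    """Signal-to-noise style score per gene, scaled to [0, 1] (assumed heuristic)."""
    weights = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        sd_a = math.sqrt(sum((v - mean_a) ** 2 for v in a) / len(a))
        sd_b = math.sqrt(sum((v - mean_b) ** 2 for v in b) / len(b))
        weights.append(abs(mean_a - mean_b) / (sd_a + sd_b + 1e-9))
    top = max(weights) or 1.0
    return [w / top for w in weights]


def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the selected
    genes. This is a stand-in for the SVM / Random Forest used in the paper."""
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = {}
        for label in (0, 1):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
            dists[label] = sum((X[i][g] - c) ** 2 for g, c in zip(genes, centroid))
        predicted = 0 if dists[0] <= dists[1] else 1
        correct += predicted == y[i]
    return correct / len(X)


def discrete_firefly_select(X, y, n_fireflies=6, n_iter=10,
                            gamma=1.0, alpha=0.05, seed=0):
    """Binary firefly search over gene subsets; brightness is wrapper accuracy."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    w = gene_weights(X, y)
    # Heuristic initialisation: highly ranked genes are more likely to be on.
    swarm = [[1 if rng.random() < 0.1 + 0.8 * w[g] else 0 for g in range(n_genes)]
             for _ in range(n_fireflies)]
    light = [loo_accuracy(X, y, m) for m in swarm]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # firefly i only moves towards brighter fireflies
                hamming = sum(a != b for a, b in zip(swarm[i], swarm[j]))
                # Attractiveness decays with normalised Hamming distance.
                beta = math.exp(-gamma * (hamming / n_genes) ** 2)
                candidate = []
                for g in range(n_genes):
                    bit = swarm[j][g] if rng.random() < beta else swarm[i][g]
                    if rng.random() < alpha:
                        bit = 1 - bit  # random bit flip keeps some exploration
                    candidate.append(bit)
                swarm[i], light[i] = candidate, loo_accuracy(X, y, candidate)
    best = max(range(n_fireflies), key=lambda k: light[k])
    return swarm[best], light[best]


def make_toy_data(n_samples=20, n_genes=30, n_informative=3, seed=1):
    """Synthetic two-class data; only the first n_informative genes differ."""
    rng = random.Random(seed)
    X, y = [], []
    for s in range(n_samples):
        label = s % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = discrete_firefly_select(X, y)
    print("selected genes:", [g for g, bit in enumerate(mask) if bit])
    print("leave-one-out accuracy:", acc)
```

In this sketch the brightest firefly never moves, so the best accuracy found so far is never lost between iterations. Swapping `loo_accuracy` for a cross-validated SVM or Random Forest score would bring it closer to the hybrids named in the abstract, though the actual DFA-SVM and DFA-RF update rules may differ.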
Acknowledgments
VKJ gratefully acknowledges the Department of Science and Technology (DST), New Delhi, India for financial support.
Copyright information
© 2013 Springer India
About this paper
Cite this paper
Srivastava, A., Chakrabarti, S., Das, S., Ghosh, S., Jayaraman, V.K. (2013). Hybrid Firefly Based Simultaneous Gene Selection and Cancer Classification Using Support Vector Machines and Random Forests. In: Bansal, J., Singh, P., Deep, K., Pant, M., Nagar, A. (eds) Proceedings of Seventh International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA 2012). Advances in Intelligent Systems and Computing, vol 201. Springer, India. https://doi.org/10.1007/978-81-322-1038-2_41
DOI: https://doi.org/10.1007/978-81-322-1038-2_41
Publisher Name: Springer, India
Print ISBN: 978-81-322-1037-5
Online ISBN: 978-81-322-1038-2
eBook Packages: Engineering, Engineering (R0)