Hybrid Firefly Based Simultaneous Gene Selection and Cancer Classification Using Support Vector Machines and Random Forests

  • Conference paper
Proceedings of Seventh International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA 2012)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 201)

Abstract

Microarray cancer gene expression datasets are high dimensional, which makes efficient computational analysis difficult. In this study, we address the problem of simultaneous gene selection and robust classification of cancerous samples by presenting two hybrid algorithms, namely Discrete Firefly based Support Vector Machines (DFA-SVM) and DFA-Random Forests (DFA-RF), with weighted gene ranking as heuristics. The performance of the algorithms is then tested on two cancer gene expression datasets retrieved from the Kent Ridge Biomedical Dataset Repository. Our results show that both DFA-SVM and DFA-RF can help extract more informative genes, which aids in building high-performance prediction models.
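The abstract does not give the algorithmic details, so the following is only a minimal sketch of how a discrete (binary) firefly search over gene subsets might look, not the authors' DFA-SVM or DFA-RF procedure. Each firefly is a bit mask over genes; a dimmer firefly copies differing bits from a brighter one with a probability that decays with Hamming distance, plus a small random bit flip. The function name `binary_firefly_select`, all parameter names and defaults, and the `toy_fitness` stand-in are assumptions made for illustration. In a wrapper setting, the fitness would instead be something like the cross-validated accuracy of an SVM or random forest trained on the selected genes, and the weighted gene ranking heuristic mentioned above is omitted here.

```python
import math
import random


def binary_firefly_select(n_genes, fitness, n_fireflies=10, n_iters=30,
                          beta0=1.0, gamma=1.0, alpha=0.05, seed=0):
    """Search for a gene bit mask that maximises `fitness` (illustrative only)."""
    rng = random.Random(seed)
    # Each firefly is a list of booleans: True means the gene is selected.
    swarm = [[rng.random() < 0.5 for _ in range(n_genes)]
             for _ in range(n_fireflies)]
    light = [fitness(m) for m in swarm]
    best_i = max(range(n_fireflies), key=light.__getitem__)
    best_mask, best_fit = list(swarm[best_i]), light[best_i]

    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # only move towards strictly brighter fireflies
                # Normalised Hamming distance between the two masks.
                r = sum(a != b for a, b in zip(swarm[i], swarm[j])) / n_genes
                beta = beta0 * math.exp(-gamma * r * r)  # attractiveness
                moved = []
                for a, b in zip(swarm[i], swarm[j]):
                    # Copy a differing bit from the brighter firefly w.p. beta.
                    bit = b if (a != b and rng.random() < beta) else a
                    # Random perturbation: flip the bit w.p. alpha.
                    if rng.random() < alpha:
                        bit = not bit
                    moved.append(bit)
                swarm[i] = moved
                light[i] = fitness(moved)
                if light[i] > best_fit:
                    best_mask, best_fit = list(moved), light[i]
    return best_mask, best_fit


# Hypothetical stand-in for a classifier score: rewards three "informative"
# genes and penalises subset size. Not taken from the paper.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for g in INFORMATIVE if mask[g])
    return hits - 0.1 * sum(mask)


mask, score = binary_firefly_select(20, toy_fitness)
selected = [g for g, keep in enumerate(mask) if keep]
print(selected, score)
```

Keeping the fitness as a plain callable means the same search loop could wrap either classifier; only the scoring function would change.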

Acknowledgments

VKJ gratefully acknowledges the Department of Science and Technology (DST), New Delhi, India for financial support.

Author information

Corresponding author

Correspondence to V. K. Jayaraman.

Copyright information

© 2013 Springer India

About this paper

Cite this paper

Srivastava, A., Chakrabarti, S., Das, S., Ghosh, S., Jayaraman, V.K. (2013). Hybrid Firefly Based Simultaneous Gene Selection and Cancer Classification Using Support Vector Machines and Random Forests. In: Bansal, J., Singh, P., Deep, K., Pant, M., Nagar, A. (eds) Proceedings of Seventh International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA 2012). Advances in Intelligent Systems and Computing, vol 201. Springer, India. https://doi.org/10.1007/978-81-322-1038-2_41

  • DOI: https://doi.org/10.1007/978-81-322-1038-2_41

  • Publisher Name: Springer, India

  • Print ISBN: 978-81-322-1037-5

  • Online ISBN: 978-81-322-1038-2

  • eBook Packages: Engineering, Engineering (R0)
