Incorporating EBO-HSIC with SVM for Gene Selection Associated with Cervical Cancer Classification

  • Transactional Processing Systems
Journal of Medical Systems

Abstract

Microarray technology lets biologists measure the expression levels of thousands of genes simultaneously. Cervical cancer classification from gene expression data typically relies on conventional supervised learning methods, in which only labeled data can be used for training. Earlier approaches struggled to select appropriate features and to produce accurate classifications, which significantly reduced overall classification performance. To overcome these problems, this research presents an Enhanced Bat Optimization algorithm with the Hilbert-Schmidt Independence Criterion (EBO-HSIC), combined with a Support Vector Machine (SVM), for identifying cancer-relevant genes in microarray gene expression datasets. The proposed system consists of four phases: instance normalization, module detection, gene selection, and classification. Normalization is performed with the Fuzzy C-Means (FCM) algorithm to eliminate inappropriate features from the gene dataset. For feature selection, the EBO algorithm produces more appropriate features through improved objective function values. To determine a subset of the most informative genes with a fast and scalable bat algorithm, the method measures the dependence between Differentially Expressed Genes (DEGs) and gene significance; this selection step is based on the HSIC and partly inspired by EBO. The selected gene features are then classified with an SVM classifier. Experimental results indicate that the proposed EBO with SVM algorithm achieves higher accuracy, recall, precision, and F-measure, and lower time complexity, than the previous techniques on the given gene expression datasets.
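
The abstract does not state the scoring formula, so the following is only a minimal sketch of how an HSIC-based dependence score between a gene and the class labels might be computed. It assumes the standard biased empirical estimator, HSIC = tr(KHLH) / (n - 1)^2, with a Gaussian kernel on expression values and a delta kernel on labels. It scores genes one at a time and leaves out the bat-optimization search, FCM normalization, and SVM stages entirely. The function names (`hsic`, `rank_genes`), the `sigma` parameter, and the toy data are hypothetical and not taken from the article.

```python
import math


def gaussian_kernel(values, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over scalar expression values."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]


def delta_kernel(labels):
    """Gram matrix that is 1 when two samples share a class label, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def center(gram):
    """Double-center a Gram matrix: H K H with H = I - (1/n) * ones."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(values, labels, sigma=1.0):
    """Biased empirical HSIC between one gene's values and the labels."""
    n = len(values)
    centered_k = center(gaussian_kernel(values, sigma))
    label_gram = delta_kernel(labels)
    # tr(K H L H) equals tr((H K H) L) by the cyclic property of the trace.
    trace = sum(centered_k[i][j] * label_gram[j][i]
                for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


def rank_genes(expression, labels, sigma=1.0):
    """Return gene names sorted from most to least label-dependent."""
    scores = {gene: hsic(vals, labels, sigma)
              for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented for illustration): six samples, two genes.
toy_expression = {
    "gene_a": [0.1, 0.2, 0.15, 2.1, 2.0, 2.2],   # tracks the class labels
    "gene_b": [1.0, 2.0, 1.1, 1.9, 1.05, 2.05],  # alternates within classes
}
toy_labels = ["normal", "normal", "normal", "tumour", "tumour", "tumour"]
ranking = rank_genes(toy_expression, toy_labels)
```

In this toy setup `gene_a` ranks above `gene_b` because its expression values split cleanly along the class labels. A wrapper-style search such as the one the abstract describes would score gene subsets rather than single genes.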

Author information

Corresponding author

Correspondence to S. Geeitha.

Additional information

This article is part of the Topical Collection on Transactional Processing Systems

About this article

Cite this article

Geeitha, S., Thangamani, M. Incorporating EBO-HSIC with SVM for Gene Selection Associated with Cervical Cancer Classification. J Med Syst 42, 225 (2018). https://doi.org/10.1007/s10916-018-1092-5

  • DOI: https://doi.org/10.1007/s10916-018-1092-5
