A Betweenness Centrality Guided Clustering Algorithm and Its Applications to Cancer Diagnosis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10682)

Abstract

Clustering has become an important data analysis technique for cancer discovery, and numerous clustering approaches have been proposed in recent years. However, handling high-dimensional cancer gene expression datasets remains an open challenge for clustering algorithms. In this paper, we present an improved graph-based clustering algorithm that applies the edge betweenness criterion to a spanning subgraph. We carry out an empirical analysis on artificial datasets and five cancer gene expression datasets. The results show that the proposed algorithm can effectively discover cancerous tissues and that it outperforms two recent graph-based clustering algorithms in terms of both cluster quality and modularity index.
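
As a rough illustration of the idea in the abstract, the sketch below clusters points by repeatedly removing the edge with the highest betweenness from a spanning subgraph. The abstract does not say how the paper builds its subgraph or when it stops removing edges, so the k-nearest-neighbour graph, the Girvan-Newman style removal loop, the fixed target number of clusters, and all function names here are assumptions made for illustration, not the authors' method.

```python
from collections import deque
from math import dist


def knn_graph(points, k):
    """Assumed spanning subgraph: undirected k-nearest-neighbour graph."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: dist(points[i], points[j]))[:k]
        for j in nearest:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph.

    Every unordered node pair is visited from both endpoints, so scores are
    twice the usual undirected value. That does not change the ranking.
    """
    score = {}
    for s in adj:
        # Forward pass: BFS from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        depth = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in depth:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: push path dependencies back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = (min(v, w), max(v, w))
                score[edge] = score.get(edge, 0.0) + c
                delta[v] += c
    return score


def components(adj):
    """Connected components, each returned as a sorted list of node ids."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = [], [start]
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(sorted(comp))
    return comps


def betweenness_clusters(points, k, n_clusters):
    """Remove highest-betweenness edges until n_clusters components remain."""
    adj = knn_graph(points, k)
    while len(components(adj)) < n_clusters:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy data: two groups of three points on a line, linked by one bridge edge.
points = [(0, 0), (1, 0), (2.5, 0), (4.5, 0), (6, 0), (7, 0)]
print(betweenness_clusters(points, k=2, n_clusters=2))
# prints [[0, 1, 2], [3, 4, 5]]
```

In this toy case the 2-nearest-neighbour graph consists of two triangles joined by the single edge (2, 3). Every shortest path between the groups crosses that edge, so it has the highest betweenness and is removed first. Real gene expression data would need a distance measure and a stopping rule chosen for the dataset.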

Author information

Corresponding author

Correspondence to R. Jothi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jothi, R. (2017). A Betweenness Centrality Guided Clustering Algorithm and Its Applications to Cancer Diagnosis. In: Ghosh, A., Pal, R., Prasath, R. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2017. Lecture Notes in Computer Science, vol 10682. Springer, Cham. https://doi.org/10.1007/978-3-319-71928-3_4

  • DOI: https://doi.org/10.1007/978-3-319-71928-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-71927-6

  • Online ISBN: 978-3-319-71928-3

  • eBook Packages: Computer Science, Computer Science (R0)
