Efficient Bisecting k-Medoids and Its Application in Gene Expression Analysis

  • Conference paper
Image Analysis and Recognition (ICIAR 2008)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5112)

Abstract

The medoid-based clustering algorithm Partitioning Around Medoids (PAM) is more robust to noisy data and outliers than the centroid-based k-means. However, PAM can fail to recognize relatively small clusters even in situations where good partitions around medoids clearly exist. PAM also needs O(k(n-k)^2) operations to cluster a given dataset, which is computationally prohibitive for large n and k. In this paper, we propose a new bisecting k-medoids algorithm that groups co-expressed genes together with better clustering quality and time performance. The proposed algorithm is evaluated on three gene expression datasets that contain noise components. It takes less computation time than PAM while achieving comparable performance.
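
The abstract does not spell out the steps of the proposed algorithm, so the following is only a minimal Python sketch of a generic bisecting k-medoids scheme, not the authors' method. Several choices here are assumptions made for illustration: the cluster chosen for splitting is the one with the largest within-cluster cost, each bisection is seeded with the cluster medoid and the point farthest from it, and the two medoids are refined by simple alternating assignment and medoid update rather than by PAM's swap search. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Dist = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def medoid_of(points: List[Point], dist: Dist) -> Point:
    """Return the member with the smallest total distance to all members."""
    return min(points, key=lambda c: sum(dist(c, p) for p in points))


def cluster_cost(points: List[Point], dist: Dist) -> float:
    """Sum of distances from the cluster medoid to every member."""
    m = medoid_of(points, dist)
    return sum(dist(m, p) for p in points)


def bisect_cluster(points: List[Point], dist: Dist,
                   max_iter: int = 20) -> Tuple[List[Point], List[Point]]:
    """Split one cluster in two (assumed seeding and refinement, see lead-in)."""
    m1 = medoid_of(points, dist)                   # seed 1: cluster medoid
    m2 = max(points, key=lambda p: dist(m1, p))    # seed 2: farthest point
    left: List[Point] = list(points)
    right: List[Point] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearer medoid (ties go left).
        left = [p for p in points if dist(p, m1) <= dist(p, m2)]
        right = [p for p in points if dist(p, m1) > dist(p, m2)]
        if not left or not right:
            break
        # Update step: recompute the medoid of each half.
        new1, new2 = medoid_of(left, dist), medoid_of(right, dist)
        if (new1, new2) == (m1, m2):
            break
        m1, m2 = new1, new2
    return left, right


def bisecting_k_medoids(points: Sequence[Point], k: int,
                        dist: Dist = euclidean) -> List[List[Point]]:
    """Repeatedly bisect the costliest cluster until k clusters exist."""
    clusters: List[List[Point]] = [list(points)]
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) >= 2]
        if not splittable:
            break
        # Assumption: split the cluster with the largest within-cluster cost.
        target = max(splittable, key=lambda c: cluster_cost(c, dist))
        left, right = bisect_cluster(target, dist)
        if not left or not right:
            break  # e.g. all points identical; no further split possible
        clusters.remove(target)
        clusters.extend([left, right])
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
            (20, 0), (20, 1), (21, 0)]
    for cluster in bisecting_k_medoids(data, 3):
        print(cluster)
```

Because each bisection only ever works on one cluster at a time, the medoid searches run over progressively smaller subsets, which is the intuition behind expecting a lower cost than a full PAM swap search over all n points. The `medoid_of` helper used here is quadratic in the cluster size, so this sketch illustrates the control flow rather than an optimized implementation.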

Editor information

Aurélio Campilho, Mohamed Kamel

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kashef, R., Kamel, M.S. (2008). Efficient Bisecting k-Medoids and Its Application in Gene Expression Analysis. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2008. Lecture Notes in Computer Science, vol 5112. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69812-8_42

  • DOI: https://doi.org/10.1007/978-3-540-69812-8_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69811-1

  • Online ISBN: 978-3-540-69812-8

  • eBook Packages: Computer Science, Computer Science (R0)
