Constrained Clustering for Gene Expression Data Mining

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

Constrained clustering algorithms can incorporate domain-dependent constraints into the clustering process to achieve better results. However, most existing constrained clustering algorithms are k-means-like methods, which can deal only with distance-based similarity measures. In this paper, we propose a constrained hierarchical clustering method, called Correlational-Constrained Complete Link (C-CCL), for gene expression analysis; it takes gene-pair constraints into account while using correlation coefficients as the similarity measure. We compared C-CCL with a correlational version of the COP-k-Means method (C-CKM) on a real yeast dataset, using two validation measures. The results show that C-CCL substantially outperforms C-CKM in clustering quality.
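The general idea in the abstract, complete-link agglomerative clustering driven by correlation and restricted by gene-pair constraints, can be illustrated with a minimal Python sketch. This is not the authors' C-CCL algorithm, whose details are not given on this page. It assumes that constraints take the common must-link / cannot-link form, that Pearson correlation is the similarity measure, and that merging stops at a requested number of clusters `k`. The function names `pearson` and `constrained_complete_link` and the toy profiles are hypothetical, and the sketch does not check whether the must-link and cannot-link sets contradict each other.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # Treat a flat profile as uncorrelated with everything (assumption).
    return cov / (sx * sy) if sx and sy else 0.0


def constrained_complete_link(profiles, k, must_link=(), cannot_link=()):
    """Illustrative constrained complete-link clustering (assumed scheme)."""
    # Start from singleton clusters, then merge must-link pairs up front.
    clusters = [{i} for i in range(len(profiles))]
    for a, b in must_link:
        ca = next(c for c in clusters if a in c)
        cb = next(c for c in clusters if b in c)
        if ca is not cb:
            clusters.remove(cb)
            ca |= cb
    forbidden = {frozenset(pair) for pair in cannot_link}

    def violates(c1, c2):
        # A merge is disallowed if it would join any cannot-link pair.
        return any(frozenset((i, j)) in forbidden for i in c1 for j in c2)

    def similarity(c1, c2):
        # Complete link on a similarity: use the LOWEST pairwise correlation.
        return min(pearson(profiles[i], profiles[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        candidates = [
            (similarity(c1, c2), i, j)
            for (i, c1), (j, c2) in combinations(enumerate(clusters), 2)
            if not violates(c1, c2)
        ]
        if not candidates:
            break  # no constraint-respecting merge remains
        _, i, j = max(candidates)  # most similar admissible pair
        clusters[i] |= clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Toy expression profiles (made up for illustration): two rising, two falling.
profiles = [[1, 2, 3, 4], [1, 2, 3, 5], [4, 3, 2, 1], [5, 3, 2, 1]]
print(constrained_complete_link(profiles, k=2))
print(constrained_complete_link(profiles, k=2, cannot_link=[(0, 1)]))
```

Because complete link scores a pair of clusters by their least similar members, a merge is accepted only when every gene in one cluster correlates reasonably well with every gene in the other, which is why the sketch takes the minimum correlation rather than the maximum.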

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tseng, V.S., Chen, LC., Kao, CP. (2008). Constrained Clustering for Gene Expression Data Mining. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_73

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_73

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science, Computer Science (R0)
