Semi Supervised Spectral Clustering for Regulatory Module Discovery

Mishra, Alok; Gillies, Duncan

doi:10.1007/978-3-540-69828-9_19

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5109)

Abstract

We propose a novel semi-supervised clustering method for the task of gene regulatory module discovery. The technique uses DNA-binding data as prior knowledge to guide the spectral clustering of microarray experiments. The microarray data from a set of repeat experiments are converted to an affinity (similarity) matrix using a Gaussian function. We investigated two methods for determining the optimal Gaussian variance: the first was based on a statistical measure of cluster coherence, and the second on optimising the number of constraints satisfied in the clustering process. The constraints, which were derived from DNA-binding data, were used to adjust the affinity matrix to include known gene-gene interactions. Clusters were found using a spectral clustering algorithm and validated using a biological significance score, defined as the proportion of gene pairs in the resulting clusters that share a common transcription factor. Our results indicate that the technique can successfully leverage the information available in the DNA-binding data. To the best of our knowledge, this is a novel formulation for gene module discovery.
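
The following is a minimal sketch of three steps named in the abstract: building a Gaussian affinity matrix, adjusting it with constraints, and computing the biological significance score. It is not the authors' implementation. Several details are assumptions, since the abstract does not specify them: the Gaussian is taken as exp(-d² / (2σ²)) on Euclidean distance with a zero diagonal, a constraint is applied by setting the affinity of a constrained pair to 1, and the toy profiles, transcription factor names (`TF_A`, `TF_B`) and cluster labels are hypothetical. The spectral clustering step itself is omitted, so the labels are assigned by hand.

```python
import math
from itertools import combinations


def gaussian_affinity(profiles, sigma):
    """Pairwise Gaussian similarity between expression profiles.

    Assumed form: exp(-squared_distance / (2 * sigma^2)), zero diagonal.
    """
    n = len(profiles)
    affinity = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
        affinity[i][j] = affinity[j][i] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return affinity


def apply_constraints(affinity, must_link):
    """Return a copy in which constrained gene pairs get maximal affinity.

    Setting the entry to 1.0 is an assumption; the abstract only says the
    matrix is adjusted to include known gene-gene interactions.
    """
    adjusted = [row[:] for row in affinity]
    for i, j in must_link:
        adjusted[i][j] = adjusted[j][i] = 1.0
    return adjusted


def significance_score(labels, tf_sets):
    """Proportion of same-cluster gene pairs sharing a transcription factor."""
    same_cluster = [
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    ]
    if not same_cluster:
        return 0.0
    sharing = sum(1 for i, j in same_cluster if tf_sets[i] & tf_sets[j])
    return sharing / len(same_cluster)


# Hypothetical toy data: four genes, two expression measurements each.
profiles = [[1.0, 2.0], [1.1, 2.1], [5.0, 0.5], [5.2, 0.4]]
# Hypothetical binding data: the transcription factors bound to each gene.
tf_sets = [{"TF_A"}, {"TF_A"}, {"TF_B"}, {"TF_A", "TF_B"}]

# Constraints: gene pairs that share at least one transcription factor.
must_link = [
    (i, j) for i, j in combinations(range(len(tf_sets)), 2)
    if tf_sets[i] & tf_sets[j]
]

affinity = gaussian_affinity(profiles, sigma=1.0)
adjusted = apply_constraints(affinity, must_link)

# Hand-assigned labels stand in for the output of spectral clustering.
labels = [0, 0, 1, 1]
score = significance_score(labels, tf_sets)
print(score)
```

In a full pipeline, `adjusted` would be passed to a spectral clustering routine, and the score could be compared across candidate values of `sigma`, in the spirit of the variance-selection methods the abstract mentions.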

Author information

Authors and Affiliations

Imperial College, London, SW7 2AZ, UK
Alok Mishra & Duncan Gillies

Editor information

Amos Bairoch, Sarah Cohen-Boulakia, Christine Froidevaux

About this paper

Cite this paper

Mishra, A., Gillies, D. (2008). Semi Supervised Spectral Clustering for Regulatory Module Discovery. In: Bairoch, A., Cohen-Boulakia, S., Froidevaux, C. (eds) Data Integration in the Life Sciences. DILS 2008. Lecture Notes in Computer Science, vol 5109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69828-9_19

DOI: https://doi.org/10.1007/978-3-540-69828-9_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69827-2
Online ISBN: 978-3-540-69828-9
eBook Packages: Computer Science, Computer Science (R0)
