ABSTRACT
This paper studies the efficient mining of negative correlations that pace in collaboration. A collaborating negative correlation is a negative correlation between two sets of variables rather than, as is traditional, between a pair of variables. It signifies a synchronized rise or fall in the values of all variables within one set whenever all variables in the other set jointly follow the opposite trend. Mining such correlations has exponential time complexity. The high efficiency of our algorithm is attributed to two factors: (i) the transformation of the original data into a bipartite graph database, and (ii) the mining of transpose closures from a wide transactional database. Taking a yeast gene expression data set as one example among many real-life domains, we evaluate the biological relevance of collaborating negative correlations using Pearson's correlation coefficient and P-value.
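The definition above can be illustrated with a small sketch. It is not the mining algorithm of the paper (it involves no bipartite graph and no transpose closures); it only checks, for two given sets of variables, at which time steps every variable in one set rises while every variable in the other falls, and computes Pearson's correlation coefficient for one cross-set pair. The function names, the variable names `g1` to `g4`, and the toy values are all hypothetical and chosen for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def trends(series):
    """+1 for a rise, -1 for a fall, 0 for no change between consecutive points."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]

def opposite_trend_steps(data, set_a, set_b):
    """Steps at which all variables in set_a move one way and all in set_b the other."""
    t = {name: trends(data[name]) for name in set_a + set_b}
    n_steps = len(next(iter(t.values())))
    steps = []
    for i in range(n_steps):
        a_dirs = {t[v][i] for v in set_a}
        b_dirs = {t[v][i] for v in set_b}
        # Each set must agree internally on a single direction.
        if len(a_dirs) == 1 and len(b_dirs) == 1:
            (da,), (db,) = a_dirs, b_dirs
            # The two sets must move in strictly opposite directions.
            if da != 0 and db == -da:
                steps.append(i)
    return steps

# Hypothetical toy data: g1 and g2 rise then fall; g3 and g4 fall then rise.
data = {
    "g1": [1.0, 2.0, 3.0, 2.0, 1.0],
    "g2": [0.5, 1.5, 2.5, 2.0, 0.5],
    "g3": [3.0, 2.0, 1.0, 2.0, 3.0],
    "g4": [2.5, 2.0, 0.5, 1.0, 2.0],
}
steps = opposite_trend_steps(data, ["g1", "g2"], ["g3", "g4"])
r = pearson(data["g1"], data["g3"])
```

In this toy data the two sets move in opposite directions at every step, and the pair `g1`, `g3` is perfectly negatively correlated. Enumerating all candidate pairs of sets in this brute-force way grows exponentially with the number of variables, which is the cost the abstract refers to.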
Index Terms
- Negative correlations in collaboration: concepts and algorithms