ABSTRACT
Bioinformatics is the science of managing, mining, and interpreting information from biological sequences and structures. DNA microarrays, also known as gene chips, are an effective tool for monitoring and profiling gene expression patterns because they measure the expression levels of thousands of genes simultaneously. Clustering is a popular technique for finding groups of genes with similar functions in microarray data. In this paper, clustering, a data mining technique, is applied to microarray data to group genes with similar functions based on the Gene Ontology (GO). The Gene Ontology provides external validation for the clusters by determining whether the genes in a cluster belong to a specific Biological Process, Cellular Component, or Molecular Function. A functionally meaningful cluster contains many genes that are annotated to a specific GO term. The aim is to show that each of these new cluster sets reveals biological associations that were not apparent from clustering the original gene expression data.
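The external-validation idea described above can be illustrated with a minimal sketch. It assumes clusters have already been computed and that each gene maps to a set of GO term identifiers. The function names (`score_clusters`, `hypergeom_pvalue`), the toy gene names, and the use of a hypergeometric test are assumptions made for illustration; the abstract does not state which statistic the paper uses.

```python
from collections import Counter
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a cluster of
    size n, when K of the N genes in the data set carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def score_clusters(clusters, annotations):
    """For each cluster, report its best-covered GO term, the fraction of the
    cluster annotated to it, and a hypergeometric p-value (assumed statistic)."""
    all_genes = [g for genes in clusters.values() for g in genes]
    total = len(all_genes)
    # How many genes in the whole data set are annotated to each term
    background = Counter(t for g in all_genes for t in annotations.get(g, ()))
    results = {}
    for cid, genes in clusters.items():
        counts = Counter(t for g in genes for t in annotations.get(g, ()))
        if not counts:
            results[cid] = None  # no annotated genes in this cluster
            continue
        # Most frequent term; ties are broken by term identifier
        term, k = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        results[cid] = {
            "term": term,
            "coverage": k / len(genes),
            "p_value": hypergeom_pvalue(k, len(genes), background[term], total),
        }
    return results


# Hypothetical toy data: gene names and cluster assignments are made up.
clusters = {"c1": ["g1", "g2", "g3"], "c2": ["g4", "g5", "g6"]}
annotations = {
    "g1": {"GO:0007049"},
    "g2": {"GO:0007049"},
    "g3": {"GO:0007049", "GO:0005634"},
    "g4": {"GO:0005634"},
    "g5": {"GO:0006412"},
    "g6": set(),
}

for cid, info in score_clusters(clusters, annotations).items():
    print(cid, info)
```

In this toy data, every gene in cluster `c1` shares one term, so the cluster would count as functionally meaningful in the sense of the abstract, while `c2` has no dominant term. A real analysis would read annotations from a GO annotation file and correct the p-values for multiple testing.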
Index Terms
- Finding microarray genes using GO ontology