Efficient mining of multilevel gene association rules from microarray and gene ontology

Tseng, Vincent S.; Yu, Hsieh-Hui; Yang, Shih-Chiang

doi:10.1007/s10796-009-9156-1

Published: 03 March 2009

Volume 11, pages 433–447, (2009)
Information Systems Frontiers

Vincent S. Tseng¹,², Hsieh-Hui Yu¹ & Shih-Chiang Yang¹

Abstract

Recent studies have shown that association rules can reveal interactions between genes that traditional analysis methods such as clustering might miss. However, existing studies consider only association rules among individual genes. In this paper, we propose a new data mining method named MAGO for discovering multilevel gene association rules from gene microarray data and the concept hierarchy of Gene Ontology (GO). The proposed method efficiently finds relations between GO terms by analyzing gene expressions together with the GO hierarchy. For example, with the biological process ontology in GO, rules such as Process A (up) → Process B (up) can be discovered, indicating that the genes involved in Process B are likely to be up-regulated whenever those involved in Process A are up-regulated. We also propose a constrained mining method named CMAGO for discovering multilevel gene expression rules under user-specified constraints. Empirical evaluation shows that the proposed methods achieve excellent performance in discovering hidden multilevel gene association rules.
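To make the form of a rule such as Process A (up) → Process B (up) concrete, the following is a minimal sketch of how the support and confidence of a term-level rule could be computed once gene-level expression states are rolled up a concept hierarchy. It is not the MAGO algorithm, whose details are not given in the abstract. The gene names, term names, hierarchy, expression states, and the roll-up convention (a term counts as "up" in a sample if any gene annotated to it or to one of its descendants is up) are all hypothetical assumptions chosen for illustration.

```python
# Illustrative sketch only; all data and the roll-up convention are assumptions.

# Hypothetical discretized microarray data: one dict per sample, gene -> state.
samples = [
    {"g1": "up", "g2": "up", "g3": "up"},
    {"g1": "up", "g2": "down", "g3": "up"},
    {"g1": "down", "g2": "down", "g3": "down"},
    {"g1": "up", "g2": "up", "g3": "down"},
]

# Hypothetical annotations: gene -> terms it is directly annotated with.
gene_to_terms = {"g1": {"A1"}, "g2": {"A2"}, "g3": {"B"}}

# Hypothetical concept hierarchy: child term -> parent term.
parent = {"A1": "A", "A2": "A"}


def ancestors_and_self(term):
    """Return the term followed by all of its ancestors in the hierarchy."""
    chain = [term]
    while term in parent:
        term = parent[term]
        chain.append(term)
    return chain


def to_transaction(sample):
    """Roll gene-level states up the hierarchy into (term, state) items.

    Assumed convention: a term receives a state if any gene annotated to it
    or to one of its descendants has that state in the sample.
    """
    items = set()
    for gene, state in sample.items():
        for term in gene_to_terms[gene]:
            for t in ancestors_and_self(term):
                items.add((t, state))
    return items


transactions = [to_transaction(s) for s in samples]


def support(itemset):
    """Fraction of samples whose transaction contains every item."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs):
    """Estimated probability of rhs given lhs over the samples."""
    return support(lhs | rhs) / support(lhs)


lhs = {("A", "up")}
rhs = {("B", "up")}
print("support:", support(lhs | rhs))
print("confidence:", confidence(lhs, rhs))
```

Because items exist at every level of the hierarchy, the same two functions can score a rule between parent terms (A → B) or between a child term and another term (A1 → B). Under this particular convention a term can be both "up" and "down" in one sample; a stricter roll-up, such as a majority vote over annotated genes, would be an equally plausible assumption.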

Acknowledgement

This research was supported by the Landmark Project of National Cheng Kung University, Taiwan and National Science Council, Taiwan, R.O.C. under grant no. NSC 96-2221-E-006-143-MY3.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, ROC
Vincent S. Tseng, Hsieh-Hui Yu & Shih-Chiang Yang
Institute of Medical Informatics, National Cheng Kung University, Taiwan, ROC
Vincent S. Tseng

Corresponding author

Correspondence to Vincent S. Tseng.

About this article

Cite this article

Tseng, V.S., Yu, HH. & Yang, SC. Efficient mining of multilevel gene association rules from microarray and gene ontology. Inf Syst Front 11, 433–447 (2009). https://doi.org/10.1007/s10796-009-9156-1

Issue Date: September 2009
