Two novel interestingness measures for gene association rule mining

Wang, Meihua; Wu, Shumin; Cai, Ruichu

doi:10.1007/s00521-012-1005-3

Original Article
Published: 22 July 2012

Volume 23, pages 835–841, (2013)
Neural Computing and Applications

Meihua Wang¹,
Shumin Wu¹ &
Ruichu Cai²,³

Abstract

Recent research has shown that association rules are useful in gene expression data analysis. Interestingness measures play an important role in association rule mining on gene expression data, which are characterized by small sample size, high dimensionality, and noise. This work introduces two interestingness measures that exploit prior knowledge contained in open biological databases: Max-Pathway-Distance (MaxPD), which uses the relatedness of genes in Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and Max-Chromosomal-Distance (MaxCD), which uses the distance between genes on the chromosome. Properties of the proposed measures are also explored so that interesting rules can be mined efficiently. Experimental results on four real-life gene expression datasets show the effectiveness of MaxPD and MaxCD in terms of both classification accuracy and biological interpretability.
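
The abstract names the two measures but does not give their formal definitions. The following Python sketch is therefore only an illustration under stated assumptions, not the authors' implementation: it assumes that MaxPD is the largest pairwise shortest-path length between a rule's genes in an undirected, unweighted pathway graph, that MaxCD is the largest pairwise base-pair distance between a rule's genes, and that genes on different chromosomes are infinitely far apart. The function names, the toy graph, and the toy coordinates are all hypothetical.

```python
from itertools import combinations
from math import inf


def all_pairs_shortest_paths(nodes, edges):
    """Floyd-Warshall on an undirected, unweighted graph (assumed pathway model)."""
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def max_pathway_distance(genes, dist):
    """Assumed MaxPD: largest pairwise pathway distance among the rule's genes."""
    return max((dist[a][b] for a, b in combinations(genes, 2)), default=0)


def max_chromosomal_distance(genes, position):
    """Assumed MaxCD: largest pairwise base-pair distance among the rule's genes.

    position maps gene -> (chromosome, coordinate); genes on different
    chromosomes are treated as infinitely distant (an assumption).
    """
    worst = 0
    for a, b in combinations(genes, 2):
        (chrom_a, pos_a), (chrom_b, pos_b) = position[a], position[b]
        worst = max(worst, abs(pos_a - pos_b) if chrom_a == chrom_b else inf)
    return worst


# Hypothetical toy pathway: a chain g1 - g2 - g3 - g4.
nodes = ["g1", "g2", "g3", "g4"]
edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
dist = all_pairs_shortest_paths(nodes, edges)

# Hypothetical toy coordinates.
position = {"g1": ("chr1", 100), "g2": ("chr1", 450), "g3": ("chr2", 50)}

print(max_pathway_distance(["g1", "g2", "g3"], dist))
print(max_chromosomal_distance(["g1", "g2"], position))
```

Under these assumed definitions, a maximum over gene pairs can never decrease when a gene is added to a rule, so a threshold of the form "distance at most d" could be used to prune candidate rules early. Whether this is the property the paper exploits is not stated in the abstract.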

Acknowledgments

This work was financially supported by the Natural Science Foundation of China (61100148), the Natural Science Foundation of Guangdong Province (S2011040004804), the Key Technology Research and Development Programs of Guangdong Province (2010B080701070), the Opening Project of the State Key Laboratory for Novel Software Technology (KFKT2011B19), and the Foundation for Distinguished Young Talents in Higher Education of Guangdong, China (LYM11060).

Author information

Authors and Affiliations

College of Informatics, South China Agricultural University, Guangzhou, People’s Republic of China
Meihua Wang & Shumin Wu
Faculty of Computer Science, Guangdong University of Technology, Guangzhou, People’s Republic of China
Ruichu Cai
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, People’s Republic of China
Ruichu Cai

Corresponding author

Correspondence to Shumin Wu.

About this article

Cite this article

Wang, M., Wu, S. & Cai, R. Two novel interestingness measures for gene association rule mining. Neural Comput & Applic 23, 835–841 (2013). https://doi.org/10.1007/s00521-012-1005-3

Received: 15 January 2012
Accepted: 06 June 2012
Published: 22 July 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s00521-012-1005-3
