An Incremental Updating Based Fast Phenotype Structure Learning Algorithm

Cheng, Hao; Zhao, Yu-Hai; Yin, Ying; Zhang, Li-Jun

doi:10.1007/978-3-319-09330-7_12

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8590)

Abstract

Unsupervised phenotype structure learning is important in microarray data analysis. The goal is to (1) find groups of samples corresponding to different phenotypes (e.g., diseased or normal), and (2) find a subset of genes that distinguishes the groups. Because microarray data contain a large number of genes and substantial noise, existing methods are often limited in both efficiency and effectiveness. In this paper, we develop an incremental updating based phenotype structure learning algorithm, namely FPLA. Starting from a randomly selected initial state, the algorithm iteratively tries three possible adjustments, i.e., gene addition, gene deletion and sample move, to improve the quality of the current result. Accordingly, four incremental updating based optimization strategies are devised to eliminate redundant computations in each iteration. Further, by using a harmonic quality function, the algorithm improves the accuracy of the result by penalizing the “outlier” effect. Experiments conducted on several real microarray datasets show that FPLA outperforms two representative competing algorithms in both effectiveness and efficiency.
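The iterative scheme described in the abstract (random initial state, then repeated gene addition, gene deletion and sample move) can be illustrated with a minimal sketch. This is not the authors' FPLA implementation: the abstract does not define the harmonic quality function or the four incremental updating strategies, so the `quality` function below is a hypothetical stand-in (a harmonic mean of per-gene separation scores for two groups), the acceptance rule is plain greedy improvement, and all names (`quality`, `local_search`, `iters`, `seed`) are assumptions made for illustration.

```python
import random
from statistics import mean, pvariance


def quality(data, genes, labels, eps=1e-9):
    """Hypothetical stand-in score, NOT the paper's quality function.

    data[s][g] is the expression of gene g in sample s; labels[s] is 0 or 1.
    Each selected gene gets a separation score (squared difference of group
    means over summed within-group variance); the result is the harmonic mean
    of those scores, so a single poorly separating gene pulls the total down.
    """
    groups = [[row for row, lab in zip(data, labels) if lab == g] for g in (0, 1)]
    if not genes or not groups[0] or not groups[1]:
        return 0.0
    inv_sum = 0.0
    for g in genes:
        a = [row[g] for row in groups[0]]
        b = [row[g] for row in groups[1]]
        score = (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + eps)
        inv_sum += 1.0 / (score + eps)
    return len(genes) / inv_sum


def local_search(data, iters=200, seed=0):
    """Greedy search over (gene subset, sample partition) using three moves."""
    rng = random.Random(seed)
    n_samples, n_genes = len(data), len(data[0])
    # Randomly selected initial state: a gene subset and a two-group labeling.
    genes = set(rng.sample(range(n_genes), max(1, n_genes // 2)))
    labels = [rng.randrange(2) for _ in range(n_samples)]
    best = quality(data, genes, labels)
    for _ in range(iters):
        new_genes, new_labels = set(genes), list(labels)
        move = rng.choice(("add_gene", "delete_gene", "move_sample"))
        if move == "add_gene":
            candidates = [g for g in range(n_genes) if g not in genes]
            if not candidates:
                continue
            new_genes.add(rng.choice(candidates))
        elif move == "delete_gene":
            if len(genes) <= 1:
                continue
            new_genes.remove(rng.choice(sorted(genes)))
        else:  # move one sample to the other group
            s = rng.randrange(n_samples)
            new_labels[s] = 1 - new_labels[s]
        q = quality(data, new_genes, new_labels)
        if q > best:  # keep the adjustment only if it improves the score
            genes, labels, best = new_genes, new_labels, q
    return genes, labels, best
```

Note that this sketch recomputes the score from scratch after every candidate move. That repeated work is the kind of redundancy the abstract says the incremental updating strategies are designed to remove; how they do so is not described in the abstract and is not reproduced here.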

Author information

Authors and Affiliations

College of Sciences, Northeastern University, China
Hao Cheng
College of Information Science and Engineering, Northeastern University, China
Yu-Hai Zhao, Ying Yin & Li-Jun Zhang

Editor information

Editors and Affiliations

School of Electronics and Information Engineering, Tongji University, 4800 Caoan Road, 201804, Shanghai, China
De-Shuang Huang
School of Computer Science and Engineering, Inha University, Incheon, South Korea
Kyungsook Han
Department of Biotechnology, Indian Institute of Technology Madras, 600 036, Chennai, Tamilnadu, India
Michael Gromiha

Cite this paper

Cheng, H., Zhao, YH., Yin, Y., Zhang, LJ. (2014). An Incremental Updating Based Fast Phenotype Structure Learning Algorithm. In: Huang, DS., Han, K., Gromiha, M. (eds) Intelligent Computing in Bioinformatics. ICIC 2014. Lecture Notes in Computer Science, vol 8590. Springer, Cham. https://doi.org/10.1007/978-3-319-09330-7_12

DOI: https://doi.org/10.1007/978-3-319-09330-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09329-1
Online ISBN: 978-3-319-09330-7
eBook Packages: Computer Science, Computer Science (R0)
