Analyzing Gene Expression Data with Predictive Clustering Trees

Slavkov, Ivica; Džeroski, Sašo

doi:10.1007/978-1-4419-7738-0_16

Abstract

In this work we investigate the application of predictive clustering trees (PCTs) for analysing gene expression data. PCTs provide a flexible approach for both predictive and descriptive analysis, both often used on gene expression data. To begin with, we use gene expression data for building predictive models for associated clinical data, where we compare single-target with multi-target models. Related to this, random forests of PCTs (single and multi-target) are used to assess the importance of individual genes w.r.t. the clinical parameters. For a more descriptive analysis, we perform a so-called constrained clustering of expression data. Also, we extend the descriptive analysis to take into account a temporal component, by using PCTs for finding descriptions of short time series of gene expression data.
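
The contrast between single-target and multi-target models mentioned above can be illustrated with a minimal sketch of how a tree might pick one split. It assumes the variance-reduction heuristic commonly described for PCTs, with the multi-target variance taken as the plain sum of per-target variances. Per-target normalization is omitted, and the chapter's own implementation is not shown here. The function names `total_variance` and `best_split` and the toy data are hypothetical.

```python
from statistics import pvariance


def total_variance(targets):
    """Sum of per-target population variances over a set of rows.

    Assumption: targets are numeric and unnormalized.
    """
    if not targets:
        return 0.0
    columns = zip(*targets)  # one tuple of values per target variable
    return sum(pvariance(col) for col in columns)


def best_split(expression, targets):
    """Return (gene_index, threshold, variance_reduction) for the best binary split."""
    n = len(expression)
    parent = total_variance(targets)
    best = (None, None, 0.0)
    for gene in range(len(expression[0])):
        values = sorted({row[gene] for row in expression})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [t for row, t in zip(expression, targets) if row[gene] <= threshold]
            right = [t for row, t in zip(expression, targets) if row[gene] > threshold]
            weighted = (len(left) * total_variance(left)
                        + len(right) * total_variance(right)) / n
            reduction = parent - weighted
            if reduction > best[2]:
                best = (gene, threshold, reduction)
    return best


# Toy data (hypothetical): 4 samples, 2 genes, 2 clinical targets.
expression = [[1.0, 5.0], [2.0, 1.0], [8.0, 6.0], [9.0, 2.0]]
targets = [(1.0, 10.0), (1.0, 10.0), (3.0, 20.0), (3.0, 20.0)]

multi = best_split(expression, targets)                        # both targets at once
single = best_split(expression, [(t[0],) for t in targets])    # first target only
print(multi, single)
```

In this toy case both variants select the same gene, but the multi-target score aggregates evidence from every clinical variable. That is the property a multi-target model or a multi-target feature ranking would rely on.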

Author information

Authors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, 1000, Ljubljana, Slovenia
Ivica Slavkov & Sašo Džeroski

Corresponding author

Correspondence to Ivica Slavkov.

Editor information

Editors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, Ljubljana, 1000, Slovenia
Sašo Džeroski
Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, Antwerpen, B-2020, Belgium
Bart Goethals
Dept. of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, Ljubljana, SI-1000, Slovenia
Panče Panov

About this chapter

Cite this chapter

Slavkov, I., Džeroski, S. (2010). Analyzing Gene Expression Data with Predictive Clustering Trees. In: Džeroski, S., Goethals, B., Panov, P. (eds) Inductive Databases and Constraint-Based Data Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7738-0_16

DOI: https://doi.org/10.1007/978-1-4419-7738-0_16
Published: 18 November 2010
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-7737-3
Online ISBN: 978-1-4419-7738-0
eBook Packages: Computer Science, Computer Science (R0)
