Explaining Subgroups through Ontologies

  • Conference paper
PRICAI 2012: Trends in Artificial Intelligence (PRICAI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7458)

Abstract

Subgroup discovery (SD) methods can be used to find interesting subsets of objects of a given class. Subgroup descriptions (rules) are themselves good explanations of the subgroups. Domain ontologies provide additional descriptions of the data and can offer alternative explanations of the discovered rules; such explanations in terms of higher-level ontology concepts have the potential to provide new insights into the domain of investigation. We show that this additional explanatory power can be ensured by using recently developed semantic SD methods. We present a new approach to explaining subgroups through ontologies and demonstrate its utility on a gene expression profiling use case in which groups of patients, identified through SD in terms of gene expression, are further explained through concepts from the Gene Ontology and KEGG orthology.
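
The two-stage idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the patient data, the gene names (g1 to g4), the concept names and the gene-to-concept annotations are all invented for illustration, and weighted relative accuracy (WRAcc) is assumed here as the rule quality measure because it is a common choice in subgroup discovery. Stage 1 picks the single-gene rule that best describes the target class; stage 2 treats membership in that subgroup as the new target and looks for the higher-level concept that best describes it.

```python
from typing import Dict, List, Set, Tuple


def wracc(covered: List[bool], positive: List[bool]) -> float:
    """Weighted relative accuracy: coverage * (precision - class prior)."""
    n = len(covered)
    n_cov = sum(covered)
    if n == 0 or n_cov == 0:
        return 0.0
    n_pos = sum(positive)
    n_cov_pos = sum(c and p for c, p in zip(covered, positive))
    return (n_cov / n) * (n_cov_pos / n_cov - n_pos / n)


# Hypothetical toy data: (set of highly expressed genes, belongs to target class).
patients: List[Tuple[Set[str], bool]] = [
    ({"g1", "g2"}, True),
    ({"g1", "g3"}, True),
    ({"g1"}, True),
    ({"g2"}, True),
    ({"g3"}, False),
    ({"g4"}, False),
    ({"g2", "g4"}, False),
    (set(), False),
]
labels = [is_target for _, is_target in patients]

# Stage 1: score every single-gene rule "gene is highly expressed -> target class".
genes = sorted(set().union(*(expressed for expressed, _ in patients)))
gene_scores: Dict[str, float] = {
    g: wracc([g in expressed for expressed, _ in patients], labels) for g in genes
}
best_gene = max(gene_scores, key=gene_scores.get)
best_score = gene_scores[best_gene]

# Stage 2: the subgroup found in stage 1 becomes the new target to explain.
in_subgroup = [best_gene in expressed for expressed, _ in patients]

# Hypothetical ontology annotations: concept -> genes annotated with it.
annotations: Dict[str, Set[str]] = {
    "concept_A": {"g1", "g2"},
    "concept_B": {"g3", "g4"},
}

# A patient is covered by a concept if any gene annotated with it is expressed.
concept_scores: Dict[str, float] = {
    concept: wracc(
        [bool(expressed & concept_genes) for expressed, _ in patients],
        in_subgroup,
    )
    for concept, concept_genes in annotations.items()
}
best_concept = max(concept_scores, key=concept_scores.get)

print(f"best rule: {best_gene} (WRAcc={best_score:.4f})")
print(f"best explaining concept: {best_concept}")
```

In this sketch the concept-level description is more general than the gene-level rule, which is the sense in which an ontology concept can serve as an alternative explanation of a discovered subgroup. A real analysis would use actual Gene Ontology or KEGG annotations and a dedicated semantic SD method rather than this exhaustive single-condition search.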

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vavpetič, A., Podpečan, V., Meganck, S., Lavrač, N. (2012). Explaining Subgroups through Ontologies. In: Anthony, P., Ishizuka, M., Lukose, D. (eds) PRICAI 2012: Trends in Artificial Intelligence. PRICAI 2012. Lecture Notes in Computer Science, vol 7458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32695-0_55

  • DOI: https://doi.org/10.1007/978-3-642-32695-0_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32694-3

  • Online ISBN: 978-3-642-32695-0

  • eBook Packages: Computer Science, Computer Science (R0)
