
Dynamic Rule-Based Similarity Model for DNA Microarray Data

  • Chapter
Transactions on Rough Sets XV

Part of the book series: Lecture Notes in Computer Science (TRS, volume 7255)

Abstract

Rules-based Similarity (RBS) is a framework in which concepts from rough set theory are used for learning a similarity relation from data. This paper presents an extension of RBS, the Dynamic Rules-based Similarity model (DRBS), designed to improve the quality of the learned relation for high-dimensional data. RBS uses the notion of a reduct to construct new features that can be interpreted as important aspects of similarity in the classification context. Given such features, Tversky’s feature contrast similarity model can be used to design an accurate and psychologically plausible similarity relation for a given domain of objects. DRBS incorporates a broader array of aspects of similarity into the model by constructing many heterogeneous sets of features from multiple decision reducts. To ensure diversity, the reducts are computed on random subsets of objects and attributes. This approach is particularly well suited to the “few-objects-many-attributes” problem, as in the mining of DNA microarray data. The induced similarity relation and the resulting similarity function can be used to classify previously unseen objects accurately in a case-based fashion. Experiments, whose results are also presented in the paper, show that the proposed model can successfully compete with other state-of-the-art algorithms such as Random Forest or SVM.
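As a rough illustration of two ideas mentioned in the abstract, the sketch below shows Tversky's general contrast model over sets of binary features, a helper that draws a random subset of objects and attributes, and a nearest-case classifier built on the similarity score. It is not the similarity function defined in the chapter: the function names, the weights `theta`, `alpha`, `beta`, the use of set cardinality as the salience function, and the toy feature names are all assumptions made for illustration.

```python
import random


def tversky_contrast(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky's contrast model: theta*f(A & B) - alpha*f(A - B) - beta*f(B - A).

    Assumption: the salience function f is plain set cardinality.
    """
    a, b = set(a), set(b)
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)


def random_subtable(n_objects, n_attributes, obj_frac, attr_frac, rng):
    """Draw random row and column indices of a decision table.

    Hypothetical helper: a reduct would then be computed on each such
    subtable (reduct computation itself is not shown here).
    """
    n_rows = max(1, int(obj_frac * n_objects))
    n_cols = max(1, int(attr_frac * n_attributes))
    rows = sorted(rng.sample(range(n_objects), n_rows))
    cols = sorted(rng.sample(range(n_attributes), n_cols))
    return rows, cols


def classify_nearest_case(query_features, cases):
    """Return the label of the stored case most similar to the query.

    `cases` is a list of (feature_set, label) pairs.
    """
    best = max(cases, key=lambda case: tversky_contrast(query_features, case[0]))
    return best[1]


if __name__ == "__main__":
    cases = [({"f1", "f2"}, "tumor"), ({"f3", "f4"}, "normal")]
    print(classify_nearest_case({"f1", "f2", "f5"}, cases))
    print(random_subtable(10, 100, 0.5, 0.1, random.Random(0)))
```

Setting `alpha` different from `beta` makes the score asymmetric, which is the property that distinguishes Tversky's model from distance-based similarity measures.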




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Janusz, A. (2012). Dynamic Rule-Based Similarity Model for DNA Microarray Data. In: Peters, J.F., Skowron, A. (eds) Transactions on Rough Sets XV. Lecture Notes in Computer Science, vol 7255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31903-7_1


  • DOI: https://doi.org/10.1007/978-3-642-31903-7_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31902-0

  • Online ISBN: 978-3-642-31903-7

  • eBook Packages: Computer Science (R0)
