Dynamic Rule-Based Similarity Model for DNA Microarray Data

Janusz, Andrzej

doi:10.1007/978-3-642-31903-7_1

Part of the book series: Lecture Notes in Computer Science (TRS, volume 7255)

Abstract

Rules-based Similarity (RBS) is a framework in which concepts from rough set theory are used to learn a similarity relation from data. This paper presents an extension of RBS, the Dynamic Rules-based Similarity model (DRBS), which is designed to improve the quality of the learned relation for high-dimensional data. Rules-based Similarity uses the notion of a reduct to construct new features that can be interpreted as important aspects of similarity in the classification context. Once such features are defined, the idea behind Tversky’s feature contrast similarity model can be used to design an accurate and psychologically plausible similarity relation for a given domain of objects. DRBS incorporates a broader array of aspects of similarity into the model by constructing many heterogeneous sets of features from multiple decision reducts. To ensure diversity, the reducts are computed on random subsets of objects and attributes. This approach is particularly well suited to the “few-objects-many-attributes” problem, as encountered when mining DNA microarray data. The induced similarity relation and the resulting similarity function can be used to classify previously unseen objects accurately in a case-based fashion. Experiments reported in the paper show that the proposed model can successfully compete with other state-of-the-art algorithms such as Random Forest or SVM.
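As a rough illustration of the ideas named in the abstract, the sketch below scores two objects with Tversky's contrast model and classifies a query by its most similar stored case. It is not the author's implementation. Several things are assumptions made only for this example: the salience function is plain set cardinality, the weights `theta`, `alpha` and `beta` are arbitrary, scores from several feature sets are combined by a simple average, and the feature names are hand-written placeholders rather than features derived from reducts.

```python
from typing import FrozenSet, List, Sequence, Tuple

FeatureSet = FrozenSet[str]


def tversky_contrast(a: FeatureSet, b: FeatureSet,
                     theta: float = 1.0, alpha: float = 0.5,
                     beta: float = 0.5) -> float:
    """Tversky's contrast model: reward common features, penalise distinctive ones.

    Assumption: the salience function f is set cardinality.
    """
    common = len(a & b)   # features shared by both objects
    only_a = len(a - b)   # features of a that b lacks
    only_b = len(b - a)   # features of b that a lacks
    return theta * common - alpha * only_a - beta * only_b


def ensemble_similarity(views_a: Sequence[FeatureSet],
                        views_b: Sequence[FeatureSet]) -> float:
    """Average the contrast score over several feature sets ("views").

    Assumption: averaging stands in for whatever aggregation the model uses.
    """
    scores = [tversky_contrast(a, b) for a, b in zip(views_a, views_b)]
    return sum(scores) / len(scores)


def classify(query: Sequence[FeatureSet],
             cases: List[Tuple[Sequence[FeatureSet], str]]) -> str:
    """Case-based classification: return the label of the most similar case."""
    best_views, best_label = max(
        cases, key=lambda case: ensemble_similarity(query, case[0])
    )
    return best_label


# Hypothetical toy data: two stored cases, each described by two feature sets.
cases = [
    ([frozenset({"g1_high", "g2_low"}), frozenset({"g7_high"})], "tumor"),
    ([frozenset({"g1_low", "g2_low"}), frozenset({"g7_low", "g9_high"})], "normal"),
]
query = [frozenset({"g1_high", "g2_low"}), frozenset({"g7_high", "g9_high"})]
print(classify(query, cases))
```

Unequal values of `alpha` and `beta` would make the score asymmetric, which is the property of the contrast model usually cited as psychologically plausible.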

Author information

Authors and Affiliations

Faculty of Mathematics, Informatics, and Mechanics, The University of Warsaw, Banacha 2, 02-097, Warszawa, Poland
Andrzej Janusz

Editor information

Editors and Affiliations

University of Manitoba, Winnipeg, MB, Canada
James F. Peters
University of Warsaw, Poland
Andrzej Skowron

About this chapter

Cite this chapter

Janusz, A. (2012). Dynamic Rule-Based Similarity Model for DNA Microarray Data. In: Peters, J.F., Skowron, A. (eds) Transactions on Rough Sets XV. Lecture Notes in Computer Science, vol 7255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31903-7_1

DOI: https://doi.org/10.1007/978-3-642-31903-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31902-0
Online ISBN: 978-3-642-31903-7
eBook Packages: Computer Science, Computer Science (R0)
