Dependency and Granularity in Data-Mining

Tsumoto, Shusaku; Hirano, Shoji

doi:10.1007/978-0-387-30440-3_119

Definition of the Subject

The degree of granularity of a contingency table is closely related to its degree of dependence. We investigate this relation from the viewpoints of determinantal divisors and determinants. The results on determinantal divisors suggest that the divisors provide information on the degree of dependency between the full matrix and its submatrices, and that an increase in the degree of granularity may lead to an increase in dependency. However, another approach shows that the constraint imposed by the sample size of a contingency table is very strong, which leads to an evaluation formula in which an increase in the degree of granularity gives a decrease in dependency.
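As a rough illustration of the determinantal divisors mentioned above, the sketch below uses the standard definition: the k-th determinantal divisor of an integer matrix is the greatest common divisor of all its k × k minors. The function names `int_det` and `determinantal_divisors` and the sample tables are assumptions made for this example and are not taken from the entry itself.

```python
from itertools import combinations
from math import gcd


def int_det(m):
    """Exact integer determinant by Laplace expansion along the first row.

    Exponential in the matrix size, which is acceptable for the small
    contingency tables used here.
    """
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Drop row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * int_det(minor)
    return total


def determinantal_divisors(table):
    """Return [d_1, ..., d_r] with r = min(rows, cols).

    d_k is the gcd of the absolute values of all k x k minors;
    it is 0 when every k x k minor vanishes.
    """
    rows, cols = len(table), len(table[0])
    divisors = []
    for k in range(1, min(rows, cols) + 1):
        g = 0
        for rs in combinations(range(rows), k):
            for cs in combinations(range(cols), k):
                sub = [[table[r][c] for c in cs] for r in rs]
                g = gcd(g, abs(int_det(sub)))
        divisors.append(g)
    return divisors


# Hypothetical tables: the first has proportional rows, the second does not.
print(determinantal_divisors([[2, 4], [6, 12]]))
print(determinantal_divisors([[1, 2], [3, 4]]))
```

A divisor of 0 at order k means every k × k submatrix is singular, so the table has rank below k. How the entry turns these divisors into a degree of dependency is not shown in the preview, so this sketch stops at computing them.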

Introduction

Independence (dependence) is a very important concept in data mining, especially for feature selection. In rough sets [2], if two attribute-value pairs, say \( [c=0] \) and \( [d=0] \), are independent, their supporting sets, denoted by C and D, do not have...
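For readers who want a concrete handle on independence of two attribute-value pairs, here is a minimal sketch for a 2 × 2 contingency table of counts. It assumes rows index the values of c and columns the values of d, so `n11` counts the cases with both [c=0] and [d=0]. The function names and sample counts are hypothetical. The check is the usual definition of statistical independence, which for a 2 × 2 table is equivalent to a zero determinant.

```python
def is_independent_2x2(n11, n12, n21, n22):
    """True when [c=0] and [d=0] are statistically independent.

    Independence means P(c=0, d=0) = P(c=0) * P(d=0). Multiplying both
    sides by N^2 keeps the comparison in exact integer arithmetic:
    n11 * N == (n11 + n12) * (n11 + n21).
    """
    total = n11 + n12 + n21 + n22
    row1 = n11 + n12  # number of cases supporting [c=0]
    col1 = n11 + n21  # number of cases supporting [d=0]
    return n11 * total == row1 * col1


def determinant_2x2(n11, n12, n21, n22):
    """Determinant of the 2 x 2 table of counts."""
    return n11 * n22 - n12 * n21


# Hypothetical counts: proportional rows in the first table only.
for cells in [(10, 20, 30, 60), (10, 20, 30, 40)]:
    print(cells, is_independent_2x2(*cells), determinant_2x2(*cells))
```

Expanding `n11 * total - row1 * col1` gives `n11 * n22 - n12 * n21`, which is why the two functions always agree on whether the table is independent.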

Notes

1. The threshold δ is the degree of closeness of overlapping sets, which will be given by domain experts. For more information, please refer to Sect. “Rank of Contingency Table (\( 2 \times 2 \))”.

Bibliography

Butz C (2002) Exploiting contextual independencies in web search and user profiling. In: Proceedings of World Congress on Computational Intelligence (WCCI'2002) (CD-ROM)
Pawlak Z (1991) Rough sets. Kluwer, Dordrecht
Skowron A, Grzymala-Busse J (1994) From rough set theory to evidence theory. In: Yager R, Fedrizzi M, Kacprzyk J (eds) Advances in the Dempster–Shafer Theory of Evidence. Wiley, New York, pp 193–236
Tsumoto S (2000) Knowledge discovery in clinical databases and evaluation of discovered knowledge in outpatient clinic. Inf Sci 124:125–137
Tsumoto S (2003) Statistical independence as linear independence. In: Skowron A, Szczuka M (eds) Electronic Notes in Theoretical Computer Science, vol 82. Elsevier
Tsumoto S, Tanaka H (1996) Automated discovery of medical expert system rules from clinical databases based on rough sets. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining 96. AAAI Press, Palo Alto, pp 63–69

Acknowledgment

This work was supported by the Grant-in-Aid for Scientific Research (13131208) on Priority Areas (No. 759) “Implementation of Active Mining in the Era of Information Flood” by the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Author information

Authors and Affiliations

Department of Medical Informatics, Shimane University, School of Medicine, Enya-cho Izumo City, Shimane, Japan
Shusaku Tsumoto & Shoji Hirano

Editor information

Editors and Affiliations

RAMTECH LIMITED, 122 Escalle Lane, Larkspur, CA, 94939, USA
Robert A. Meyers Ph.D. (Editor-in-Chief)

About this entry

Cite this entry

Tsumoto, S., Hirano, S. (2009). Dependency and Granularity in Data-Mining. In: Meyers, R. (eds) Encyclopedia of Complexity and Systems Science. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30440-3_119

DOI: https://doi.org/10.1007/978-0-387-30440-3_119
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-75888-6
Online ISBN: 978-0-387-30440-3
eBook Packages: Physics and Astronomy; Reference Module Physical and Materials Science; Reference Module Chemistry, Materials and Physics
