Abstract
The certainty factor and lift are well-known evaluation measures of association rules. Nevertheless, they do not guarantee an accurate evaluation of the strength of dependence between a rule's constituents. In particular, even if there is the strongest possible positive or negative dependence between a rule's constituents X and Y, these measures may take values quite close to those indicating independence of X and Y. Recently, we proposed a new measure, called the dependence factor, to overcome this drawback. Unlike in the case of the certainty factor, when defining the dependence factor we took into account the fact that for a given rule \(X \rightarrow Y\), the minimal conditional probability of the occurrence of Y given X may be greater than 0, while its maximal possible value may be less than 1. In this paper, we first recall the definitions and properties of all three measures. Then, we examine the dependence factor from the point of view of an interestingness measure, and we examine the relationship between the dependence factor for X and Y and those for \(\bar{X}\) and Y, X and \(\bar{Y}\), and \(\bar{X}\) and \(\bar{Y}\), respectively. As a result, we obtain a number of new properties of the dependence factor.
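As a quick illustration of the drawback discussed above, the sketch below computes lift and the certainty factor for a rule whose constituents are as strongly positively dependent as their marginals allow. The `dependence_factor` function is a hypothetical reading of the measure, not the paper's verbatim definition: it normalizes the deviation \(P(XY) - P(X)P(Y)\) by the maximal deviation achievable given \(P(X)\) and \(P(Y)\), using the bounds \(\max P(XY) = \min(P(X), P(Y))\) and \(\min P(XY) = \max(0, P(X)+P(Y)-1)\):

```python
def lift(pxy, px, py):
    # lift(X -> Y) = P(XY) / (P(X) * P(Y)); 1 indicates independence.
    return pxy / (px * py)

def certainty_factor(pxy, px, py):
    # CF(X -> Y) scales P(Y|X) - P(Y) by its distance to 1 (positive
    # dependence) or to P(Y)'s distance to 0 (negative dependence);
    # 0 indicates independence.
    py_given_x = pxy / px
    if py_given_x > py:
        return (py_given_x - py) / (1 - py)
    if py_given_x < py:
        return (py_given_x - py) / py
    return 0.0

def dependence_factor(pxy, px, py):
    # Hypothetical sketch of df(X -> Y): the deviation P(XY) - P(X)P(Y)
    # is divided by the largest deviation actually achievable for the
    # given marginals, so the strongest possible positive (negative)
    # dependence yields exactly 1 (-1).
    dev = pxy - px * py
    if dev > 0:
        return dev / (min(px, py) - px * py)
    if dev < 0:
        return dev / (px * py - max(0.0, px + py - 1))
    return 0.0

# X and Y co-occur as strongly as their marginals permit:
# P(XY) = min(P(X), P(Y)) = 0.9 -- the strongest positive dependence.
px, py = 0.9, 0.9
pxy = min(px, py)
print(lift(pxy, px, py))               # ~1.11, close to 1 despite maximal dependence
print(certainty_factor(pxy, px, py))   # 1.0
print(dependence_factor(pxy, px, py))  # 1.0
```

Note how lift stays near its independence value 1 even though no stronger dependence is possible for these marginals, which is precisely the drawback the dependence factor is designed to avoid.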
Acknowledgments
We wish to thank an anonymous reviewer for constructive comments, which influenced the final version of this paper positively.
Appendix
Proof of Lemma 2
In the proof, we will use the following equations:
- \(P(\bar{X}) = 1 - P(X)\), \(P(\bar{Y}) = 1 - P(Y)\),
- \(P(\bar{X}Y) = P(Y) - P(XY)\), \(P(X\bar{Y}) = P(X) - P(XY)\),
- \(P(\bar{X}\bar{Y}) = P(\bar{X}) - P(\bar{X}Y) = 1 - P(X) - P(Y) + P(XY)\).
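These identities also make the equivalences of Lemma 1 transparent: the deviation of the joint probability from independence is preserved when both arguments are negated, since

```latex
\begin{align*}
P(\bar{X}\bar{Y}) - P(\bar{X})P(\bar{Y})
  &= \bigl(1 - P(X) - P(Y) + P(XY)\bigr) - \bigl(1 - P(X)\bigr)\bigl(1 - P(Y)\bigr) \\
  &= P(XY) - P(X)P(Y),
\end{align*}
```

so \(P(\bar{X}\bar{Y})\) is greater than, equal to, or less than \(P(\bar{X})P(\bar{Y})\) exactly when \(P(XY)\) is greater than, equal to, or less than \(P(X)P(Y)\).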
Ad (a)
Case \(P(\bar{X}\bar{Y}) >P(\bar{X}) \times P(\bar{Y})\):
This case is equivalent to the case when \(P(XY) > P(X)\times P(Y)\) (by Lemma 1a). Then:
\(df(\bar{X} \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
Case \(P(\bar{X}\bar{Y}) = P(\bar{X})\times P(\bar{Y})\):
This case is equivalent to the case when \(P(XY) = P(X) \times P(Y)\) (by Lemma 1b). Then:
\(df(\bar{X} \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
\(\begin{aligned}&= 0 = \text { /* by Proposition 3a */}\\&= df(X \rightarrow Y). \end{aligned}\)
Case \(P(\bar{X}\bar{Y}) < P(\bar{X}) \times P(\bar{Y})\) and \(P(\bar{X}) + P(\bar{Y}) \le 1\):
This case is equivalent to the case when \(P(XY) < P(X) \times P(Y)\) (by Lemma 1c) and \(P(X)+P(Y)\ge 1\) (by Proposition 5c). Then:
\(df(\bar{X} \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
Case \(P(\bar{X}\bar{Y}) < P(\bar{X}) \times P(\bar{Y})\) and \(P(\bar{X}) + P(\bar{Y}) > 1\):
This case is equivalent to the case when \(P(XY) < P(X) \times P(Y)\) (by Lemma 1c) and \(P(X)+P(Y) < 1\) (by Proposition 5c). Then:
\(df(\bar{X}\rightarrow \bar{Y}) = \text { /* by Proposition 3a */} \)
Ad (b)
The proof is analogous to the proof of Lemma 1a.
Ad (c)
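The case analysis below repeatedly uses the fact that negating one argument flips the sign of the deviation from independence:

```latex
\begin{align*}
P(X\bar{Y}) - P(X)P(\bar{Y})
  &= \bigl(P(X) - P(XY)\bigr) - P(X)\bigl(1 - P(Y)\bigr) \\
  &= -\bigl(P(XY) - P(X)P(Y)\bigr),
\end{align*}
```

so \(P(X\bar{Y}) > P(X)P(\bar{Y})\) holds exactly when \(P(XY) < P(X)P(Y)\), and vice versa.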
Case \(P(X\bar{Y})>P(X) \times P(\bar{Y})\) and \(P(X)\le P(\bar{Y})\):
This case is equivalent to the case when \(P(XY)< P(X)\times P(Y)\) (by Lemma 1c) and \(P(X) \le 1 - P(Y)\). Then:
\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
Case \(P(X\bar{Y}) > P(X) \times P(\bar{Y})\) and \(P(X) > P(\bar{Y})\):
This case is equivalent to the case when \(P(XY)< P(X) \times P(Y)\) (by Lemma 1c) and \(P(X) > 1 - P(Y)\). Then:
\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
Case \(P(X\bar{Y})=P(X) \times P(\bar{Y})\):
This case is equivalent to the case when \(P(XY)=P(X)\times P(Y)\) (by Lemma 1b). Then:
\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
\(\begin{aligned}&= 0 = \text { /* by Proposition 3a */} \\&= -df(X\rightarrow Y). \end{aligned}\)
Case \(P(X\bar{Y}) < P(X)\times P(\bar{Y})\) and \(P(X)+ P(\bar{Y}) \le 1\):
This case is equivalent to the case when \(P(XY) > P(X) \times P(Y)\) (by Lemma 1a) and \(P(X)\le P(Y)\). Then:
\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
Case \(P(X\bar{Y}) < P(X) \times P(\bar{Y})\) and \(P(X)+P(\bar{Y}) > 1\):
This case is equivalent to the case when \(P(XY) > P(X) \times P(Y)\) (by Lemma 1a) and \(P(X)>P(Y)\). Then:
\(df(X \rightarrow \bar{Y}) = \text { /* by Proposition 3a */}\)
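The complement identities and the sign relations from Lemma 1 used throughout the proof can be spot-checked numerically; the probabilities below are arbitrary illustrative values satisfying the usual consistency constraints:

```python
# Spot-check the complement identities and sign relations used in the
# proof, for one consistent assignment of P(X), P(Y), P(XY).
px, py, pxy = 0.6, 0.5, 0.4

p_nx, p_ny = 1 - px, 1 - py        # P(~X), P(~Y)
p_nxy = py - pxy                   # P(~X Y)
p_xny = px - pxy                   # P(X ~Y)
p_nxny = 1 - px - py + pxy         # P(~X ~Y)

# The four cells of the 2x2 contingency table sum to 1.
assert abs(pxy + p_nxy + p_xny + p_nxny - 1) < 1e-12
# P(~X ~Y) = P(~X) - P(~X Y).
assert abs(p_nxny - (p_nx - p_nxy)) < 1e-12

# Deviations from independence: same sign and magnitude for (X, Y)
# and (~X, ~Y), opposite sign for (X, ~Y), as Lemma 1 states.
dev = pxy - px * py
assert abs((p_nxny - p_nx * p_ny) - dev) < 1e-12
assert abs((p_xny - px * p_ny) + dev) < 1e-12
print("all identities hold for P(X)=0.6, P(Y)=0.5, P(XY)=0.4")
```

Running the script with other consistent values of `px`, `py`, `pxy` (i.e., \(\max(0, P(X)+P(Y)-1) \le P(XY) \le \min(P(X), P(Y))\)) exercises the same identities.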
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Kryszkiewicz, M. (2016). Dependence Factor as a Rule Evaluation Measure. In: Matwin, S., Mielniczuk, J. (eds) Challenges in Computational Statistics and Data Mining. Studies in Computational Intelligence, vol 605. Springer, Cham. https://doi.org/10.1007/978-3-319-18781-5_12
Print ISBN: 978-3-319-18780-8
Online ISBN: 978-3-319-18781-5