Abstract
Performing probabilistic inference in multi-target dynamic systems is a challenging task. When the system, its evidence and/or its targets evolve, most inference algorithms either recompute everything from scratch, even though incremental changes do not invalidate all previous computations, or do not fully exploit incrementality to minimize computations. This incurs substantial unnecessary overhead when the system under study is large. To alleviate this problem, we propose in this paper a new junction-tree-based message-passing inference algorithm that, given a new query, minimizes computations by identifying precisely the set of messages that differ from those of the preceding computations. Experimental results highlight the efficiency of our approach.
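To fix ideas before the formal material, here is a minimal sketch, in Python, of the kind of message reuse the abstract describes. All names (`IncrementalJT`, `touch`, `stale`, `collect`) are ours, not the paper's, and clique potentials are replaced by opaque placeholders; the point is only that a message \(i\rightarrow j\) needs recomputation iff some clique on \(i\)'s side of the edge was modified since the last inference.

```python
from typing import Dict, Set, Tuple

class IncrementalJT:
    """Toy junction tree over clique ids; adj maps a clique to its neighbors."""

    def __init__(self, adj: Dict[int, Set[int]]):
        self.adj = adj
        self.cache: Dict[Tuple[int, int], object] = {}  # cached messages i -> j
        self.dirty: Set[int] = set()                    # cliques whose potential changed

    def touch(self, clique: int) -> None:
        """Mark a clique as modified (new evidence, changed potential, ...)."""
        self.dirty.add(clique)

    def stale(self, i: int, j: int) -> bool:
        """Message i -> j must be recomputed iff some clique on i's side changed."""
        return i in self.dirty or any(self.stale(k, i) for k in self.adj[i] - {j})

    def collect(self, i: int, j: int):
        """Send message i -> j, reusing the cache whenever it is still valid."""
        if (i, j) in self.cache and not self.stale(i, j):
            return self.cache[(i, j)]
        inbox = [self.collect(k, i) for k in self.adj[i] - {j}]
        msg = ("marginalize", i, tuple(inbox))          # placeholder combination
        self.cache[(i, j)] = msg
        return msg
```

A real implementation would clear the dirty flags once the affected messages are refreshed; the sketch keeps them so each call re-derives staleness from scratch.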
Notes
- 1.
- 2. In Fig. 3a, due to the small number of nodes and arcs in the BNs, percentages of modifications lower than 10% imply no modification at all, hence the lack of error bars.
- 3. If \(k'=r\), then \(k\ne r'\) and the equality is also verified.
- 4. All the nodes are computationally equivalent if \(\forall i\in \mathcal {V}(\mathcal {T}), i\in \mathcal {V}_1\), since then \(\mathcal {V}(\mathcal {T})=\mathcal {V}_1\).
Acknowledgments
This work was partially supported by IBM France Lab/ANRT CIFRE grant #2014/421.
Appendix: Proofs
Proof of Proposition 1: Note that \( \mathcal {V}_{\text {-}{j}}({i}) =\{i\}\cup \bigcup _{k\in \mathrm{Adj}_{\text {-}{j}}(i)}{ \mathcal {V}_{\text {-}{i}}({k}) }\) and, for \(k\in \mathrm{Adj}_{\text {-}{j}}(i)\) and \(l'\in \mathcal {V}_{\text {-}{i}}({k}) \), we have \(\mathrm{Adj}_j(l')=\mathrm{Adj}_i(l')\). Using Definition 4, one can thus rewrite \(\mu _{{i}\rightarrow {j}}\) into:
\(\blacksquare \)
Proof of Theorem 1 – mutual exclusivity: if property (a) is satisfied, then \(\mathcal {T}\) contains no edge, therefore properties (b) and (c) cannot be satisfied.
Now, assume that there exist \(r_1,r_1'\) such that \(\mu _{{r_1'}\rightarrow {r_1}}=\mu _{{r_1}\rightarrow {r_1'}}=T\epsilon \) (property b). Let \(r_2\) be any clique in \(\mathcal {V}(\mathcal {T})\). Without loss of generality, assume that \(r_1\) lies on the path \(i_1=r_2,i_2,\ldots ,i_p=r_1'\) between \(r_2\) and \(r_1'\). Then, by Proposition 1, \(\mu _{{i_2}\rightarrow {r_2}} \supseteq \mu _{{i_3}\rightarrow {i_2}} \supseteq \cdots \supseteq \mu _{{r_1'}\rightarrow {r_1}} = T\epsilon \). Therefore, properties (b) and (c) cannot hold simultaneously. \(\blacksquare \)
Proof of Theorem 1 – r ’s existence: if \(\mathcal {A}(\mathcal {T}) = \emptyset \), then property (a) holds and r is the unique node of \(\mathcal {T}\). Now, assume that \(\mathcal {A}(\mathcal {T}) \ne \emptyset \). If there exists an edge \((i,j) \in \mathcal {E}(\mathcal {T})\) such that \(\mu _{{i}\rightarrow {j}} = \mu _{{j}\rightarrow {i}} = T\epsilon \), then \(r = i\) satisfies property (b). Otherwise, neither property (a) nor property (b) holds. Assume that property (c) does not hold either. Then, for all edges (i, j), exactly one of \(\mu _{{i}\rightarrow {j}}\) or \(\mu _{{j}\rightarrow {i}}\) is equal to \(T\epsilon \) and the other one belongs to \(\{\emptyset ,\epsilon ,T\}\). Let \((i_0,j_0)\) be such that \(\mu _{{i_0}\rightarrow {j_0}} = T\epsilon \) and \(\mu _{{j_0}\rightarrow {i_0}} \ne T\epsilon \). If \(|\mathrm{Adj}(i_0)|=1\), clique \(i_0\) satisfies property (c), a contradiction. As we assume that property (b) does not hold either, there exists \(i_1 \in \mathrm{Adj}(i_0)\) such that \(\mu _{{i_1}\rightarrow {i_0}} = T\epsilon \) and \(\mu _{{i_0}\rightarrow {i_1}} \ne T\epsilon \). The same reasoning holds for \(i_1\): either \(i_1\) is a leaf, which would satisfy property (c), a contradiction, or \(i_1\) has another neighbor \(i_2\) such that \(\mu _{{i_2}\rightarrow {i_1}} = T\epsilon \) and \(\mu _{{i_1}\rightarrow {i_2}} \ne T\epsilon \). By induction, we build a path \(i_0,i_1,\ldots ,i_n\) of maximal length. This path is necessarily finite since \(\mathcal {T}\) is a finite tree, hence clique \(i_n\) is a leaf and, therefore, satisfies property (c), a contradiction. Consequently, when properties (a) and (b) do not hold, property (c) holds. \(\blacksquare \)
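The case analysis above translates naturally into a root-selection routine. The following is a hedged sketch with our own names, assuming messages are labeled with subsets of \(\{T,\epsilon \}\) stored for both directions of every edge, and assuming, as the existence proof suggests, that property (c) designates a leaf whose outgoing message equals \(T\epsilon \):

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

TE: FrozenSet[str] = frozenset({"T", "eps"})  # the label written T-epsilon above

def choose_root(cliques: Set[int],
                edges: Iterable[Tuple[int, int]],
                mu: Dict[Tuple[int, int], FrozenSet[str]]) -> int:
    """Return a root satisfying one of properties (a), (b), (c) of Theorem 1."""
    edges = list(edges)
    if not edges:                                   # (a): the tree is a single clique
        return next(iter(cliques))
    for i, j in edges:                              # (b): an edge labeled T-eps both ways
        if mu[(i, j)] == TE and mu[(j, i)] == TE:
            return i
    adj: Dict[int, Set[int]] = {c: set() for c in cliques}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    for c in cliques:                               # (c): a leaf sending T-eps
        if len(adj[c]) == 1:
            (k,) = adj[c]
            if mu[(c, k)] == TE:
                return c
    raise AssertionError("Theorem 1: one of properties (a)-(c) must hold")
```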
Since the properties of Theorem 1 are mutually exclusive, one can now prove optimality separately for each of them:
Proof of Theorem 1 – property a’s optimality: r is the only node in \(\mathcal {T}\). Choosing it as a root is therefore optimal. \(\blacksquare \)
Lemma 1
Let \(i,j\in \mathcal {V}(\mathcal {T})\) be such that \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(\mu _{{i}\rightarrow {j}}=\emptyset \), then \(\forall l \in \mathcal {V}_{\text {-}{j}}({i}) : \delta (l)= \delta (j)+len(l\!-\!j)\).
Proof
Note that when \(\epsilon \notin \mu _{{j}\rightarrow {i}}\), \(\mathcal {T}\) is up-to-date in the current inference and there is no need to perform any computation. The proof is by induction on \(n=len(l\!-\!j)\). For \(n=1\), we have \(l=i\), so, by Eq. (2) and the fact that \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(i\in Adj_i(j)\), we get \(\delta _{{j}\rightarrow {i}}(i)=1\). As a consequence, \(\delta (i)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(j,i)\}}}\delta _{{k'}\rightarrow {k}}(i)+1\). Yet, as \(T\notin \mu _{{i}\rightarrow {j}}\), we have \(\delta _{{j}\rightarrow {i}}(j)=0\); so \(\delta (j)=\sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(j,i)\}}\delta _{{k'}\rightarrow {k}}(j)\). Since \(\epsilon \notin \mu _{{i}\rightarrow {j}}\), \(\delta _{{i}\rightarrow {j}}(i)=\delta _{{i}\rightarrow {j}}(j)=0\). For \((k',k)\notin \{(i,j),(j,i)\}\), we have \(Adj_i(k)=Adj_j(k)\) and \(Adj_i(k')=Adj_j(k')\); it follows that \(\delta _{{k'}\rightarrow {k}}(i)=\delta _{{k'}\rightarrow {k}}(j)\). We conclude that \(\delta (i)=\delta (j)+1\).
Now suppose the property holds for \(n-1\ge 1\); let us prove that it remains true for n. Let l be such that \(len(l\!-\!j) = n\) and let \(\{p\}=Adj_i(l)\). Then \(\delta (l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(l)\) because \(\delta _{{p}\rightarrow {l}}(l)=1\) (since \(\epsilon \in \mu _{{p}\rightarrow {l}}\) and \(\{l\}= Adj_l(p)\)). Knowing that \(T\notin \mu _{{l}\rightarrow {p}}\), we get \(\delta _{{p}\rightarrow {l}}(p)=0\), so \(\delta (p)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(p)\). Using the same reasoning as in the case \(n=1\), and remarking that \( \delta _{{l}\rightarrow {p}}(p)=\delta _{{l}\rightarrow {p}}(l)=0\) because \(\epsilon \notin \mu _{{l}\rightarrow {p}}\), we conclude that \(\delta (l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(p)=1+\delta (p)\). Applying the induction hypothesis to p, for which \(len(p\!-\!j)=n-1\), we obtain \(\delta (l)=1+\delta (p)=1+(n-1)+\delta (j)=\delta (j)+n\). \(\blacksquare \)
Lemma 2
Let \(\mathcal {V}_1=\{r\in \mathcal {V}(\mathcal {T}): \exists k\in \mathrm{Adj}(r), \mu _{{r}\rightarrow {k}}=\mu _{{k}\rightarrow {r}}=T\epsilon \}\), then for any \( r,r'\) in \(\mathcal {V}_1\) we have \(\delta (r)=\delta (r')\).
Proof
Assume that \(|\mathcal {V}_1|>1\) (otherwise the result is trivial). By Proposition 1, the nodes in \(\mathcal {V}_1\) form a connected subgraph, so it suffices to prove the equality for adjacent nodes: let \(r,r' \in \mathcal {V}_1\) be such that \((r,r') \in \mathcal {E}(\mathcal {T})\). Finally, let \((k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}\). If \(k'\notin \{r,r'\}\), then either \(k=r\), \(k=r'\) or \(k\notin \{r,r'\}\), and in all these cases we have \(Adj_r(k')=Adj_{r'}(k')\), hence \(\delta _{{k'}\rightarrow {k}}(r)=\delta _{{k'}\rightarrow {k}}(r')\). Otherwise, if \(k'=r'\), then \(k\ne r\) and we also have (see Footnote 3) \(Adj_r(k)=Adj_{r'}(k)\), and again \(\delta _{{k'}\rightarrow {k}}(r)=\delta _{{k'}\rightarrow {k}}(r')\). As a consequence: \(\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\delta _{{k'}\rightarrow {k}}(r)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\delta _{{k'}\rightarrow {k}}(r')\). By Eq. (2), we get: \(\delta _{{r}\rightarrow {r'}}(r)+\delta _{{r'}\rightarrow {r}}(r)= \delta _{{r}\rightarrow {r'}}(r')+\delta _{{r'}\rightarrow {r}}(r')=2\). We conclude that \(\delta (r)-\!\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\!\delta _{{k'}\rightarrow {k}}(r)= \delta (r')-\!\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\!\delta _{{k'}\rightarrow {k}}(r')\). Hence \(\delta (r)=\delta (r')\). \(\blacksquare \)
Proof of Theorem 1 – property b’s optimality: Under the notations of property (b), it is sufficient to prove that \(\delta (r)\le \delta (i)\) for any i not in \(\mathcal {V}_1\) (see Footnote 4). Without loss of generality, assume that \(i\in \mathcal {V}_{\text {-}{r'}}({r}) \). Let \( (k,k')\in \mathcal {A}(i\!-\!r)\), where \(\mathcal {A}(i\!-\!r)\) is the set of arcs induced by the path \(i\!-\!r\). We either have \(\{k'\}=Adj_r(k)\) or \(\{k\}=Adj_r(k')\). Assume for instance that \(\{k'\}=Adj_r(k)\) and \(k\ne r\); the second case is treated similarly. Then \(\mu _{{k'}\rightarrow {k}}=T\epsilon \) and, by applying Eq. (2), the results are summarized in the following table:

From this table, we conclude that \(\sum _{{(k',k)\in \mathcal {A}(i\!-\!r)}}\delta _{{k'}\rightarrow {k}}(r)\le \sum _{(k',k)\in \mathcal {A}(i\!-\!r)}\delta _{{k'}\rightarrow {k}}(i)\). (1)
Now, for \((k',k)\notin \mathcal {A}(i\!-\!r)\), it is easy to see that \(\delta _{{k'}\rightarrow {k}}(i)=\delta _{{k'}\rightarrow {k}}(r)\), hence: \(\sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \mathcal {A}(i\!-\!r)}\delta _{{k'}\rightarrow {k}}(r)= \sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \mathcal {A}(i\!-\!r) }\delta _{{k'}\rightarrow {k}}(i)\). (2)
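Spelling out how (1) and (2) combine (a small worked step, using only the fact, as in Lemma 1’s proof, that \(\delta (\cdot )\) is the sum of \(\delta _{{k'}\rightarrow {k}}(\cdot )\) over all arcs of \(\mathcal {A}(\mathcal {T})\)):

```latex
\delta(r)
  = \sum_{(k',k)\in\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(r)
    + \sum_{(k',k)\in\mathcal{A}(\mathcal{T})\setminus\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(r)
  \;\le\;
    \sum_{(k',k)\in\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(i)
    + \sum_{(k',k)\in\mathcal{A}(\mathcal{T})\setminus\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(i)
  = \delta(i).
```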
Hence \(\delta (r)\le \delta (i)\) for \(i\notin \mathcal {V}_1\). Moreover, by Lemma 2, \(\delta (r)=\delta (i)\) for any i in \(\mathcal {V}_1\); therefore \(r\in \mathrm{Argmin}_{i\in \mathcal {V}(\mathcal {T})}\, \delta (i)\). \(\blacksquare \)
Proof of Theorem 1 – property c’s optimality: Let \(i\in \mathcal {V}(\mathcal {T})\) be such that \(i\ne r\). Two cases arise:
First case: \(\mu _{{Adj_i(r)}\rightarrow {r}}=\emptyset \). Assume that \(T,\epsilon \in \mathcal {V}_{\text {-}{i}}({r}) \); otherwise there is no need to perform any computation, as either there is no query or there is no modification in \(\mathcal {T}\). Since \(i\in \mathcal {V}_{\text {-}{r}}({Adj_i(r)}) \), Lemma 1 yields \(\delta (i)=\delta (r)+len(i\!-\!r)\), hence \( \delta (r)<\delta (i)\).
Second case: \(\mu _{{Adj_i(r)}\rightarrow {r}}\in \{T,\epsilon \}\). We omit the details: one should use the same methodology as in property (b)’s proof, together with the fact that, for any \(k,k'\) in \(i\!-\!r\) such that \(\{k'\}=Adj_r(k)\), \(\mu _{{k}\rightarrow {k'}}=\mu _{{i}\rightarrow {Adj_r(i)}}\), and examine \(\delta _{{k'}\rightarrow {k}}(r)\) and \(\delta _{{k'}\rightarrow {k}}(i)\). \(\blacksquare \)
Proof of Proposition 2: Given a root r, \(\delta _{{i}\rightarrow {j}}(r)\) indicates, by construction, that message \(\psi _{{i}\rightarrow {j}}\) is necessary for the current inference and was invalidated since the previous one. As a consequence, the current inference needs to recompute only such messages, for any i, j in \(\mathcal {V}(\mathcal {T})\). \(\blacksquare \)
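Proposition 2 licenses loops of the following shape. This is a hypothetical sketch with our own names (`needs_update` stands for the flags \(\delta _{{i}\rightarrow {j}}(r)\), which we assume have already been computed): once the root r is chosen, a single bottom-up pass recomputes exactly the flagged messages and reuses every other cached one.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

def incremental_collect(adj: Dict[int, Set[int]],
                        root: int,
                        needs_update: Dict[Tuple[int, int], bool],
                        messages: Dict[Tuple[int, int], object],
                        recompute: Callable[[int, int, List[object]], object]) -> None:
    """Collect toward `root`, recomputing exactly the flagged messages.

    When needs_update[(i, j)] is False, the cached entry messages[(i, j)]
    is reused as-is, per Proposition 2.
    """
    order: List[Tuple[int, int]] = []       # arcs oriented toward root, leaves first
    def visit(j: int, parent: Optional[int]) -> None:
        for i in adj[j] - ({parent} if parent is not None else set()):
            visit(i, j)
            order.append((i, j))
    visit(root, None)
    for i, j in order:
        if needs_update[(i, j)]:
            inbox = [messages[(k, i)] for k in adj[i] - {j}]
            messages[(i, j)] = recompute(i, j, inbox)
```

Because the arcs are processed leaves-first, every message read from `messages` is either freshly recomputed or certified reusable by its flag.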