Abstract
Performing probabilistic inference in multi-target dynamic systems is a challenging task. When the system, its evidence and/or its targets evolve, most inference algorithms either recompute everything from scratch, even though incremental changes do not invalidate all previous computations, or do not fully exploit incrementality to minimize computations. This incurs substantial unnecessary overhead when the system under study is large. To alleviate this problem, we propose in this paper a new junction-tree-based message-passing inference algorithm that, given a new query, minimizes computations by identifying precisely the set of messages that differ from those of the preceding computations. Experimental results highlight the efficiency of our approach.
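To fix ideas before the formal material, here is a minimal sketch, in Python, of the kind of message reuse the abstract describes. All names (`IncrementalJT`, `touch`, `stale`, `collect`) are ours, not the paper's, and clique potentials are replaced by opaque placeholders; the point is only that a message \(i\rightarrow j\) needs recomputation iff some clique on \(i\)'s side of the edge was modified since the last inference.

```python
from typing import Dict, Set, Tuple

class IncrementalJT:
    """Toy junction tree over clique ids; adj maps a clique to its neighbors."""

    def __init__(self, adj: Dict[int, Set[int]]):
        self.adj = adj
        self.cache: Dict[Tuple[int, int], object] = {}  # cached messages i -> j
        self.dirty: Set[int] = set()                    # cliques whose potential changed

    def touch(self, clique: int) -> None:
        """Mark a clique as modified (new evidence, changed potential, ...)."""
        self.dirty.add(clique)

    def stale(self, i: int, j: int) -> bool:
        """Message i -> j must be recomputed iff some clique on i's side changed."""
        return i in self.dirty or any(self.stale(k, i) for k in self.adj[i] - {j})

    def collect(self, i: int, j: int):
        """Send message i -> j, reusing the cache whenever it is still valid."""
        if (i, j) in self.cache and not self.stale(i, j):
            return self.cache[(i, j)]
        inbox = [self.collect(k, i) for k in self.adj[i] - {j}]
        msg = ("marginalize", i, tuple(inbox))          # placeholder combination
        self.cache[(i, j)] = msg
        return msg
```

A real implementation would clear the dirty flags once the affected messages are refreshed; the sketch keeps them so each call re-derives staleness from scratch.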
Notes
- 1.
- 2. In Fig. 3a, due to the small number of nodes and arcs in the BNs, percentages of modifications lower than 10% imply no modification at all, hence the lack of error bars.
- 3. If \(k'=r\), then \(k\ne r'\) and the equality is also verified.
- 4. All the nodes are computationally equivalent if \(\forall i\in \mathcal {V}(\mathcal {T}), i\in \mathcal {V}_1\), since then \(\mathcal {V}(\mathcal {T})=\mathcal {V}_1\).
Acknowledgments
This work was partially supported by IBM France Lab/ANRT CIFRE grant #2014/421.
Appendix: Proofs
Proof of Proposition 1: Note that \( \mathcal {V}_{\text {-}{j}}({i}) =\{i\}\cup \bigcup _{k\in \mathrm{Adj}_{\text {-}{j}}(i)}{ \mathcal {V}_{\text {-}{i}}({k}) }\) and, for \(k\in \mathrm{Adj}_{\text {-}{j}}(i)\) and \(l'\in \mathcal {V}_{\text {-}{i}}({k}) \), we have \(\mathrm{Adj}_j(l')=\mathrm{Adj}_i(l')\). Using Definition 4, one can thus rewrite \(\mu _{{i}\rightarrow {j}}\) into:
\(\blacksquare \)
Proof of Theorem 1 – mutual exclusivity: if property (a) is satisfied, then \(\mathcal {T}\) contains no edge, therefore properties (b) and (c) cannot be satisfied.
Now, assume that there exist \(r_1,r_1'\) such that \(\mu _{{r_1'}\rightarrow {r_1}}=\mu _{{r_1}\rightarrow {r_1'}}=T\epsilon \) (property b). Let \(r_2\) be any clique in \(\mathcal {V}(\mathcal {T})\). Without loss of generality, assume that \(r_1\) lies on the path \(i_1=r_2,i_2,\ldots ,i_p=r_1'\) between \(r_2\) and \(r_1'\). Then, by Proposition 1, \(\mu _{{i_2}\rightarrow {r_2}} \supseteq \mu _{{i_3}\rightarrow {i_2}} \supseteq \cdots \supseteq \mu _{{r_1'}\rightarrow {r_1}} = T\epsilon \). Therefore, properties (b) and (c) cannot hold simultaneously. \(\blacksquare \)
Proof of Theorem 1 – r ’s existence: if \(\mathcal {A}(\mathcal {T}) = \emptyset \), then property (a) holds and r is the unique node of \(\mathcal {T}\). Now, assume that \(\mathcal {A}(\mathcal {T}) \ne \emptyset \). If there exists an edge \((i,j) \in \mathcal {E}(\mathcal {T})\) such that \(\mu _{{i}\rightarrow {j}} = \mu _{{j}\rightarrow {i}} = T\epsilon \), then \(r = i\) satisfies property (b). Otherwise, neither property (a) nor property (b) holds. Assume that property (c) does not hold either. Then, for all edges (i, j), exactly one of \(\mu _{{i}\rightarrow {j}}\) or \(\mu _{{j}\rightarrow {i}}\) is equal to \(T\epsilon \) and the other one belongs to \(\{\emptyset ,\epsilon ,T\}\). Let \((i_0,j_0)\) be such that \(\mu _{{i_0}\rightarrow {j_0}} = T\epsilon \) and \(\mu _{{j_0}\rightarrow {i_0}} \ne T\epsilon \). If \(|\mathrm{Adj}(i_0)|=1\), clique \(i_0\) satisfies property (c), a contradiction. As we assume that property (b) does not hold either, there exists \(i_1 \in \mathrm{Adj}(i_0)\) such that \(\mu _{{i_1}\rightarrow {i_0}} = T\epsilon \) and \(\mu _{{i_0}\rightarrow {i_1}} \ne T\epsilon \). The same reasoning holds for \(i_1\): either \(i_1\) is a leaf, which would satisfy property (c), a contradiction, or \(i_1\) has another neighbor \(i_2\) such that \(\mu _{{i_2}\rightarrow {i_1}} = T\epsilon \) and \(\mu _{{i_1}\rightarrow {i_2}} \ne T\epsilon \). By induction, we build a path \(i_0,i_1,\ldots ,i_n\) of maximal length. This path is necessarily finite since \(\mathcal {T}\) is a finite tree, hence clique \(i_n\) is a leaf and, therefore, satisfies property (c), a contradiction. Consequently, when properties (a) and (b) do not hold, property (c) holds. \(\blacksquare \)
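The case analysis above translates naturally into a root-selection routine. The following is a hedged sketch with our own names, assuming messages are labeled with subsets of \(\{T,\epsilon \}\) stored for both directions of every edge, and assuming, as the existence proof suggests, that property (c) designates a leaf whose outgoing message equals \(T\epsilon \):

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

TE: FrozenSet[str] = frozenset({"T", "eps"})  # the label written T-epsilon above

def choose_root(cliques: Set[int],
                edges: Iterable[Tuple[int, int]],
                mu: Dict[Tuple[int, int], FrozenSet[str]]) -> int:
    """Return a root satisfying one of properties (a), (b), (c) of Theorem 1."""
    edges = list(edges)
    if not edges:                                   # (a): the tree is a single clique
        return next(iter(cliques))
    for i, j in edges:                              # (b): an edge labeled T-eps both ways
        if mu[(i, j)] == TE and mu[(j, i)] == TE:
            return i
    adj: Dict[int, Set[int]] = {c: set() for c in cliques}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    for c in cliques:                               # (c): a leaf sending T-eps
        if len(adj[c]) == 1:
            (k,) = adj[c]
            if mu[(c, k)] == TE:
                return c
    raise AssertionError("Theorem 1: one of properties (a)-(c) must hold")
```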
Since the properties of Theorem 1 are mutually exclusive, one can now prove optimality separately for each of them:
Proof of Theorem 1 – property a’s optimality: r is the only node in \(\mathcal {T}\). Choosing it as a root is therefore optimal. \(\blacksquare \)
Lemma 1
Let \(i,j\in \mathcal {V}(\mathcal {T})\) be such that \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(\mu _{{i}\rightarrow {j}}=\emptyset \), then \(\forall l \in \mathcal {V}_{\text {-}{j}}({i}) : \delta (l)= \delta (j)+len(l\!-\!j)\).
Proof
Note that when \(\epsilon \notin \mu _{{j}\rightarrow {i}}\), \(\mathcal {T}\) is up-to-date in the current inference and there is no need to perform any computation. The proof is by induction on \(n=len(l\!-\!j)\). For \(n=1\), we have \(l=i\), so, by Eq. (2) and the fact that \(\epsilon \in \mu _{{j}\rightarrow {i}}\) and \(i\in Adj_i(j)\), we get \(\delta _{{j}\rightarrow {i}}(i)=1\). As a consequence, \(\delta (i)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(j,i)\}}}\delta _{{k'}\rightarrow {k}}(i)+1\). Yet, as \(T\notin \mu _{{i}\rightarrow {j}}\), we have \(\delta _{{j}\rightarrow {i}}(j)=0\); so \(\delta (j)=\sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(j,i)\}}\delta _{{k'}\rightarrow {k}}(j)\). Since \(\epsilon \notin \mu _{{i}\rightarrow {j}}\), \(\delta _{{i}\rightarrow {j}}(i)=\delta _{{i}\rightarrow {j}}(j)=0\). For \((k',k)\notin \{(i,j),(j,i)\}\), we have \(Adj_i(k)=Adj_j(k)\) and \(Adj_i(k')=Adj_j(k')\); it follows that \(\delta _{{k'}\rightarrow {k}}(i)=\delta _{{k'}\rightarrow {k}}(j)\). We conclude that \(\delta (i)=\delta (j)+1\).
Now suppose the property holds for \(n-1\ge 1\); let us prove that it remains true for n. Let l be such that \(len(l\!-\!j) = n\) and let \(\{p\}=Adj_i(l)\). Then \(\delta (l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(l)\) because \(\delta _{{p}\rightarrow {l}}(l)=1\) (since \(\epsilon \in \mu _{{p}\rightarrow {l}}\) and \(\{l\}= Adj_l(p)\)). Knowing that \(T\notin \mu _{{l}\rightarrow {p}}\), we get \(\delta _{{p}\rightarrow {l}}(p)=0\), so \(\delta (p)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(p)\). Using the same reasoning as in the case \(n=1\), and remarking that \( \delta _{{l}\rightarrow {p}}(p)=\delta _{{l}\rightarrow {p}}(l)=0\) because \(\epsilon \notin \mu _{{l}\rightarrow {p}}\), we conclude that \(\delta (l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(l)=1+\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(p,l)\}}}\delta _{{k'}\rightarrow {k}}(p)=1+\delta (p)\). Applying the induction hypothesis to p, for which \(len(p\!-\!j)=n-1\), we obtain \(\delta (l)=1+\delta (p)=1+(n-1)+\delta (j)=\delta (j)+n\). \(\blacksquare \)
Lemma 2
Let \(\mathcal {V}_1=\{r\in \mathcal {V}(\mathcal {T}): \exists k\in \mathrm{Adj}(r), \mu _{{r}\rightarrow {k}}=\mu _{{k}\rightarrow {r}}=T\epsilon \}\), then for any \( r,r'\) in \(\mathcal {V}_1\) we have \(\delta (r)=\delta (r')\).
Proof
Assume that \(|\mathcal {V}_1|>1\) (otherwise the result is trivial). By Proposition 1, the nodes in \(\mathcal {V}_1\) form a connected subgraph, so it suffices to prove the equality for adjacent nodes: let \(r,r' \in \mathcal {V}_1\) be such that \((r,r') \in \mathcal {E}(\mathcal {T})\). Finally, let \((k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}\). If \(k'\notin \{r,r'\}\), then either \(k=r\), \(k=r'\) or \(k\notin \{r,r'\}\), and in all these cases we have \(Adj_r(k')=Adj_{r'}(k')\), hence \(\delta _{{k'}\rightarrow {k}}(r)=\delta _{{k'}\rightarrow {k}}(r')\). Otherwise, if \(k'=r'\), then \(k\ne r\) and we also have (see Footnote 3) \(Adj_r(k)=Adj_{r'}(k)\), and again \(\delta _{{k'}\rightarrow {k}}(r)=\delta _{{k'}\rightarrow {k}}(r')\). As a consequence: \(\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\delta _{{k'}\rightarrow {k}}(r)=\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\delta _{{k'}\rightarrow {k}}(r')\). By Eq. (2), we get: \(\delta _{{r}\rightarrow {r'}}(r)+\delta _{{r'}\rightarrow {r}}(r)= \delta _{{r}\rightarrow {r'}}(r')+\delta _{{r'}\rightarrow {r}}(r')=2\). We conclude that \(\delta (r)-\!\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\!\delta _{{k'}\rightarrow {k}}(r)= \delta (r')-\!\sum _{{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \{(r,r'),(r',r)\}}}\!\delta _{{k'}\rightarrow {k}}(r')\). Hence \(\delta (r)=\delta (r')\). \(\blacksquare \)
Proof of Theorem 1 – property b’s optimality: Under the notations of property (b), it is sufficient to prove that \(\delta (r)\le \delta (i)\) for any i not in \(\mathcal {V}_1\) (see Footnote 4). Without loss of generality, assume that \(i\in \mathcal {V}_{\text {-}{r'}}({r}) \). Let \( (k,k')\in \mathcal {A}(i\!-\!r)\), where \(\mathcal {A}(i\!-\!r)\) is the set of arcs induced by the path \(i\!-\!r\). We either have \(\{k'\}=Adj_r(k)\) or \(\{k\}=Adj_r(k')\). Assume for instance that \(\{k'\}=Adj_r(k)\) and \(k\ne r\); the second case is treated similarly. Then \(\mu _{{k'}\rightarrow {k}}=T\epsilon \) and, by applying Eq. (2), the results are summarized in the following table:

From this table, we conclude that \(\sum _{{(k',k)\in \mathcal {A}(i\!-\!r)}}\delta _{{k'}\rightarrow {k}}(r)\le \sum _{(k',k)\in \mathcal {A}(i\!-\!r)}\delta _{{k'}\rightarrow {k}}(i)\). (1)
Now, for \((k',k)\notin \mathcal {A}(i\!-\!r)\), it is easy to see that \(\delta _{{k'}\rightarrow {k}}(i)=\delta _{{k'}\rightarrow {k}}(r)\), hence: \(\sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \mathcal {A}(i\!-\!r)}\delta _{{k'}\rightarrow {k}}(r)= \sum _{(k',k)\in \mathcal {A}(\mathcal {T})\setminus \mathcal {A}(i\!-\!r) }\delta _{{k'}\rightarrow {k}}(i)\). (2)
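Spelling out how (1) and (2) combine (a small worked step, using only the fact, as in Lemma 1’s proof, that \(\delta (\cdot )\) is the sum of \(\delta _{{k'}\rightarrow {k}}(\cdot )\) over all arcs of \(\mathcal {A}(\mathcal {T})\)):

```latex
\delta(r)
  = \sum_{(k',k)\in\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(r)
    + \sum_{(k',k)\in\mathcal{A}(\mathcal{T})\setminus\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(r)
  \;\le\;
    \sum_{(k',k)\in\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(i)
    + \sum_{(k',k)\in\mathcal{A}(\mathcal{T})\setminus\mathcal{A}(i-r)} \delta_{k'\rightarrow k}(i)
  = \delta(i).
```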
Hence \(\delta (r)\le \delta (i)\) for \(i\notin \mathcal {V}_1\). Moreover, by Lemma 2, \(\delta (r)=\delta (i)\) for any i in \(\mathcal {V}_1\); therefore \(r\in \mathrm{Argmin}_{i\in \mathcal {V}(\mathcal {T})}\, \delta (i)\). \(\blacksquare \)
Proof of Theorem 1 – property c’s optimality: Let \(i\in \mathcal {V}(\mathcal {T})\) be such that \(i\ne r\). Two cases arise:
First case: \(\mu _{{Adj_i(r)}\rightarrow {r}}=\emptyset \). Assume that \(T,\epsilon \in \mathcal {V}_{\text {-}{i}}({r}) \); otherwise there is no need to perform any computation, as either there is no query or there is no modification in \(\mathcal {T}\). Since \(i\in \mathcal {V}_{\text {-}{r}}({Adj_i(r)}) \), Lemma 1 yields \(\delta (i)=\delta (r)+len(i\!-\!r)\), hence \( \delta (r)<\delta (i)\).
Second case: \(\mu _{{Adj_i(r)}\rightarrow {r}}\in \{T,\epsilon \}\). We omit the details: one should use the same methodology as in property (b)’s proof, together with the fact that, for any \(k,k'\) in \(i\!-\!r\) such that \(\{k'\}=Adj_r(k)\), \(\mu _{{k}\rightarrow {k'}}=\mu _{{i}\rightarrow {Adj_r(i)}}\), and examine \(\delta _{{k'}\rightarrow {k}}(r)\) and \(\delta _{{k'}\rightarrow {k}}(i)\). \(\blacksquare \)
Proof of Proposition 2: Given a root r, \(\delta _{{i}\rightarrow {j}}(r)\) indicates, by construction, that message \(\psi _{{i}\rightarrow {j}}\) is necessary for the current inference and was invalidated since the previous one. As a consequence, the current inference needs to recompute only such messages, for any i, j in \(\mathcal {V}(\mathcal {T})\). \(\blacksquare \)
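Proposition 2 licenses loops of the following shape. This is a hypothetical sketch with our own names (`needs_update` stands for the flags \(\delta _{{i}\rightarrow {j}}(r)\), which we assume have already been computed): once the root r is chosen, a single bottom-up pass recomputes exactly the flagged messages and reuses every other cached one.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

def incremental_collect(adj: Dict[int, Set[int]],
                        root: int,
                        needs_update: Dict[Tuple[int, int], bool],
                        messages: Dict[Tuple[int, int], object],
                        recompute: Callable[[int, int, List[object]], object]) -> None:
    """Collect toward `root`, recomputing exactly the flagged messages.

    When needs_update[(i, j)] is False, the cached entry messages[(i, j)]
    is reused as-is, per Proposition 2.
    """
    order: List[Tuple[int, int]] = []       # arcs oriented toward root, leaves first
    def visit(j: int, parent: Optional[int]) -> None:
        for i in adj[j] - ({parent} if parent is not None else set()):
            visit(i, j)
            order.append((i, j))
    visit(root, None)
    for i, j in order:
        if needs_update[(i, j)]:
            inbox = [messages[(k, i)] for k in adj[i] - {j}]
            messages[(i, j)] = recompute(i, j, inbox)
```

Because the arcs are processed leaves-first, every message read from `messages` is either freshly recomputed or certified reusable by its flag.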