
A unified agent-based framework for constrained graph partitioning

  • Regular Paper
  • Published in The VLDB Journal

Abstract

Social networks offer various services, such as recommending social events or delivering targeted advertising material to certain users. In this work, we focus on a specific type of service modeled as constrained graph partitioning (CGP). CGP assigns the users of a social network to a set of classes with bounded capacities so that the similarity and the social costs are minimized. The similarity cost is proportional to the dissimilarity between a user and his class, whereas the social cost is measured in terms of friends that are assigned to different classes. We investigate two solutions for CGP. The first utilizes a game-theoretic framework, in which each user is a player who wishes to minimize his own social and similarity cost. The second employs local search and aims at minimizing the global cost. We show that the two approaches can be unified under a common agent-based framework that allows two types of deviations. In a unilateral deviation, an agent switches to a new class, whereas in a bilateral deviation a pair of agents exchange their classes. We develop a number of optimization techniques to improve result quality and efficiency. Our experimental evaluation on real datasets demonstrates that the proposed methods always outperform the state of the art in terms of solution quality, while being up to an order of magnitude faster.

(Figures 1–21 are available in the full article.)


Notes

  1. In the rest of the paper, we use the terms node/user, edge/friendship and class/event interchangeably.

  2. Although SEO is presented as a utility maximization problem, it could also be defined as a cost minimization problem using Objective Function (1).

  3. Usually, the former emphasizes the entire solution, whereas the latter emphasizes the individual agent strategies.

  4. From a game-theoretic perspective, our definition ensures individual rationality, since a player participates in a bilateral deviation if and only if he is not worse off by participating.

  5. https://www.eventbrite.com.

  6. It is trivial to show that the proof also stands if we suppose that u deviates first.


Funding

This work was supported by GRF grants 16207914 and 16231216 from Hong Kong RGC.

Author information

Corresponding author

Correspondence to Dimitris Papadias.

Appendix

Lemma 1

The difference in Potential Function (3) \(\varDelta \varPhi _{UNI}(v,p_v,p'_v,\overline{\varvec{p}_v})\) due to a unilateral deviation in assignment \(\varvec{p}\) of player v from event \(p_v\) to \(p'_v\), while the rest of the players do not deviate, is:

$$\begin{aligned} \begin{aligned}&\varDelta \varPhi _{UNI}(v,p_v,p'_v,\overline{\varvec{p}_v})=\varPhi (p'_v,\overline{\varvec{p}_v})-\varPhi (p_v,\overline{\varvec{p}_v}) \\&\quad =c_v(p'_v,\overline{\varvec{p}_v})-c_v(p_v,\overline{\varvec{p}_v}) \\&\quad =\Big (\alpha \cdot d(v,p'_v)+(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f = p_v \end{array}} \frac{1}{2}\cdot w(v,f)\Big ) \\&\quad -\Big (\alpha \cdot d(v,p_v)+(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f = p'_v \end{array}} \frac{1}{2}\cdot w(v,f)\Big ) \end{aligned} \end{aligned}$$

Proof

After player v deviates from event \(p_v\) to \(p_v'\), the change in the potential function is:

$$\begin{aligned} \begin{aligned}&\varDelta \varPhi _{UNI}(v,p_v,p'_v,\overline{\varvec{p}_v})=\varPhi (p'_v,\overline{\varvec{p}_v})-\varPhi (p_v,\overline{\varvec{p}_v})\\&\quad =\Big (\alpha \cdot d(v,p'_v)-\alpha \cdot d(v,p_v)\Big )\\&\quad +\Big ((1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f \ne p_v' \end{array}} \frac{1}{2}\cdot w(v,f) \\&\quad -(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f \ne p_v \end{array}} \frac{1}{2}\cdot w(v,f)\Big ) \end{aligned} \end{aligned}$$
(10)

This result follows directly from Eq. (3), as only v and his friends are affected by v’s deviation; all other terms in the potential function cancel. Additionally, before and after the unilateral deviation, according to Eq. (4) we have:

$$\begin{aligned} c_v(p_v,\overline{\varvec{p}_v})= & {} \alpha \cdot d(v,p_v)\nonumber \\&+(1-\alpha )\cdot { \sum \limits _{\begin{array}{c} (v,f) \in E \wedge \\ p_v \ne p_f \end{array}} \frac{1}{2} \cdot w(v,f)} \end{aligned}$$
(11)
$$\begin{aligned} c_v(p'_v,\overline{\varvec{p}_v})= & {} \alpha \cdot d(v,p'_v)\nonumber \\&+(1-\alpha )\cdot { \sum \limits _{\begin{array}{c} (v,f) \in E \wedge \\ p'_v \ne p_f \end{array}} \frac{1}{2} \cdot w(v,f)} \end{aligned}$$
(12)

If we subtract Eq. (11) from Eq. (12), we obtain Eq. (10), proving that the difference in the potential function equals the difference in the cost of the deviating player. Note that Eq. (3) and the equations above sum over the friends of v assigned to events different from \(p_v\) or \(p'_v\) (\(p_f \ne p_v\) and \(p_f \ne p'_v\)). In contrast, the cost difference \(\varDelta \varPhi _{UNI}\) in Lemma 1 only involves the friends of v assigned to \(p_v\) or \(p'_v\) (\(p_f= p_v\) and \(p_f = p'_v\)). The friends that matter when computing the cost difference are those assigned to \(p'_v\) (resp. \(p_v\)), because the weights of friends assigned to any other event cancel out. \(\square \)
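As a sanity check, the incremental computation of Lemma 1 can be contrasted with recomputing Potential Function (3) from scratch. The sketch below is a minimal illustration under assumed data structures (all names hypothetical; edges as (v, f, w) triples, d as a user–event dissimilarity table), not the paper’s implementation:

```python
import random

def potential(p, d, edges, alpha):
    """Potential Function (3): alpha * similarity + half the weight of cut edges."""
    sim = sum(d[v][p[v]] for v in p)
    cut = sum(0.5 * w for (a, b, w) in edges if p[a] != p[b])
    return alpha * sim + (1 - alpha) * cut

def delta_phi_uni(v, new_ev, p, d, edges, alpha):
    """Lemma 1: only v's similarity terms and v's incident edges contribute."""
    gain = alpha * (d[v][new_ev] - d[v][p[v]])
    for (a, b, w) in edges:
        if v not in (a, b):
            continue
        f = b if a == v else a
        if p[f] == p[v]:        # edge becomes cut after the move
            gain += (1 - alpha) * 0.5 * w
        elif p[f] == new_ev:    # edge stops being cut
            gain -= (1 - alpha) * 0.5 * w
    return gain

random.seed(0)
events, alpha = range(3), 0.5
d = {v: {e: random.random() for e in events} for v in range(6)}
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 0.5), (3, 4, 1.5), (4, 5, 1.0)]
p = {v: random.choice(list(events)) for v in range(6)}

v = 2
new_ev = (p[v] + 1) % 3          # any event other than p[v]
before = potential(p, d, edges, alpha)
inc = delta_phi_uni(v, new_ev, p, d, edges, alpha)
p[v] = new_ev
after = potential(p, d, edges, alpha)
assert abs((after - before) - inc) < 1e-9
```

Only the deviating user’s own terms are touched, which is what makes unilateral deviations cheap to evaluate.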

Lemma 2

Assume vu swap events in assignment \(\varvec{p}\), while the rest of the players do not deviate. For the new assignment \((p'_v, p'_u,\overline{\varvec{p}_{vu}})\), where \(p'_v=p_u\) and \(p'_u=p_v\), the difference \(\varDelta \varPhi _{BI}(v,u,\varvec{p})\) in Potential Function (3) is:

$$\begin{aligned} \begin{aligned}&\varDelta \varPhi _{BI}(v,u,\varvec{p})=\varPhi (p'_v, p'_u,\overline{\varvec{p}_{vu}})-\varPhi (p_v, p_u,\overline{\varvec{p}_{vu}})\\&\quad =\Big (c_v(p'_v, p_u\cup \overline{\varvec{p}_{vu}})-c_v(p_v, p_u\cup \overline{\varvec{p}_{vu}})+\frac{1}{2}w(v,u)\Big )\\&\qquad +\Big (c_u(p'_u,p_v\cup \overline{\varvec{p}_{vu}})-c_u(p_u,p_v\cup \overline{\varvec{p}_{vu}})+\frac{1}{2}w(v,u)\Big ) \end{aligned} \end{aligned}$$

Proof

In order to express the bilateral deviation as two unilateral deviations (v to \(p_u\) and u to \(p_v\)), we add and subtract the term \(\varPhi (p'_v, p_u,\overline{\varvec{p}_{vu}})\) in the difference of the potential function:

$$\begin{aligned} \begin{aligned}&\varDelta \varPhi _{BI}(v,u,\varvec{p})=\varPhi (p'_v, p'_u,\overline{\varvec{p}_{vu}})-\varPhi (p_v,p_u,\overline{\varvec{p}_{vu}}) \\&\quad =\Big (\varPhi (p'_v, p_u,\overline{\varvec{p}_{vu}}) - \varPhi (p_v, p_u,\overline{\varvec{p}_{vu}})\Big ) \\&\qquad +\Big (\varPhi (p'_v, p'_u,\overline{\varvec{p}_{vu}}) - \varPhi (p'_v, p_u,\overline{\varvec{p}_{vu}})\Big ) \end{aligned} \end{aligned}$$
(13)

Each of the two terms of the sum in Eq. (13) represents a deviation of a single player. Specifically, the first term is the unilateral deviation of v, while the second is the unilateral deviation of u, supposing that v deviates first and u follows right after (see Footnote 6). Based on Eq. (2), Eq. (13) becomes:

$$\begin{aligned} \begin{aligned}&\varDelta \varPhi _{BI}(v,u,\varvec{p})= \Big (c_v(p'_v, p_u\cup \overline{\varvec{p}_{vu}})-c_v(p_v, p_u\cup \overline{\varvec{p}_{vu}})\Big )\\&\qquad +\Big (c_u(p'_u,p'_v\cup \overline{\varvec{p}_{vu}})-c_u(p_u,p'_v\cup \overline{\varvec{p}_{vu}})\Big ) \end{aligned} \end{aligned}$$
(14)

However, since \(p'_v=p_u\), \(p'_u=p_v\) and \(p_v\ne p_u\), for the second term of the sum in Eq. (14) we will have:

$$\begin{aligned}&c_u(p'_u,p'_v\cup \overline{\varvec{p}_{vu}})=c_u(p'_u,p_v\cup \overline{\varvec{p}_{vu}})+\frac{1}{2}w{(v,u)} \end{aligned}$$
(15)
$$\begin{aligned}&c_u(p_u,p'_v\cup \overline{\varvec{p}_{vu}})=c_u(p_u,p_v\cup \overline{\varvec{p}_{vu}})-\frac{1}{2}w(v,u) \end{aligned}$$
(16)

Equation (15) states that u’s cost for his new strategy (\(p'_u=p_v\)), after v has changed strategy (v switched to \(p'_v=p_u\)), equals u’s cost for \(p'_u\) before v changed strategy, plus their friendship weight (if they are friends; otherwise, the friendship weight is zero). This is because, after v changed strategy, u may have lost a friend from \(p'_u\), so his cost for \(p'_u\) must increase. Similarly, Eq. (16) states that u’s cost for his old strategy (\(p_u\)) equals u’s cost for \(p_u\) before v changed strategy, minus their friendship weight (because v will switch to \(p'_v=p_u\), giving u one more friend in \(p_u\)). Therefore, by combining Eqs. (14)–(16) we prove Lemma 2. \(\square \)
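The decomposition in Eq. (13) can be illustrated by performing the swap as two sequential unilateral moves and checking that the changes telescope. A minimal sketch under assumed data structures (all names hypothetical); note that in this sketch the friendship correction carries an explicit \((1-\alpha)\) factor because of how the agent cost is coded:

```python
import random

def potential(p, d, edges, alpha):
    """Potential Function (3): alpha * similarity + half the weight of cut edges."""
    sim = sum(d[v][p[v]] for v in p)
    cut = sum(0.5 * w for (a, b, w) in edges if p[a] != p[b])
    return alpha * sim + (1 - alpha) * cut

def player_cost(v, p, d, edges, alpha):
    """Agent cost c_v: own similarity plus half the weight of v's cut edges."""
    cut = sum(0.5 * w for (a, b, w) in edges if v in (a, b) and p[a] != p[b])
    return alpha * d[v][p[v]] + (1 - alpha) * cut

random.seed(1)
events, alpha = range(3), 0.5
d = {v: {e: random.random() for e in events} for v in range(6)}
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 0.5), (3, 4, 1.5)]
p = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

v, u = 1, 2                              # friends assigned to different events
phi0 = potential(p, d, edges, alpha)

# First unilateral step: v moves to u's event (first term of Eq. 13).
p_mid = dict(p); p_mid[v] = p[u]
step1 = potential(p_mid, d, edges, alpha) - phi0

# Second unilateral step: u moves to v's old event (second term of Eq. 13).
p_new = dict(p_mid); p_new[u] = p[v]
step2 = potential(p_new, d, edges, alpha) - potential(p_mid, d, edges, alpha)

# The two steps telescope to the full bilateral change.
direct = potential(p_new, d, edges, alpha) - phi0
assert abs((step1 + step2) - direct) < 1e-9

# Eq. (15)-style correction: u's cost at v's old event rises once v has moved
# away, by the (scaled) weight of their friendship edge (1, 2, 2.0).
q_before = dict(p); q_before[u] = p[v]
q_after = dict(p_mid); q_after[u] = p[v]
corr = (player_cost(u, q_after, d, edges, alpha)
        - player_cost(u, q_before, d, edges, alpha))
assert abs(corr - (1 - alpha) * 0.5 * 2.0) < 1e-9
```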

Proposition 1

In a finite potential game, from an arbitrary feasible assignment, the combined dynamics of Fig. 4 always converges to a NEPS solution in a finite number of rounds.

Proof

First, note that we only allow unilateral deviations that do not violate the events’ capacity constraints, while bilateral deviations leave the events’ cardinalities unchanged. Thus, if we start from a feasible assignment, it remains feasible after any unilateral or bilateral deviation. Second, in a unilateral deviation the cost of the deviating player decreases; thus, from Eq. (2) it follows that the potential function also drops. In a bilateral deviation, both users benefit (or one benefits and the other’s cost is unchanged). But then the sum \(\varDelta \varPhi _{BI}(v,u,\varvec{p})\), which by Lemma 2 equals:

$$\begin{aligned} \begin{aligned}&\Big (c_v(p'_v, p_u\cup \overline{\varvec{p}_{vu}})-c_v(p_v, p_u\cup \overline{\varvec{p}_{vu}})+\frac{1}{2}w(v,u)\Big )\\&\quad +\Big (c_u(p'_u,p_v\cup \overline{\varvec{p}_{vu}})-c_u(p_u,p_v\cup \overline{\varvec{p}_{vu}})+\frac{1}{2}w(v,u)\Big ) \end{aligned} \end{aligned}$$

is always negative, which guarantees that the potential function drops. Since the game is finite, the combined dynamics eventually terminates at an assignment that is NEPS. \(\square \)
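A rough sketch of the combined dynamics, using the global potential as the acceptance test (by Lemmas 1 and 2, any cost-improving deviation in GAME lowers the potential as well, so the same termination argument applies). All names and the capacity handling are illustrative assumptions:

```python
import itertools
import random

def potential(p, d, edges, alpha):
    """Potential Function (3)."""
    sim = sum(d[v][p[v]] for v in p)
    cut = sum(0.5 * w for (a, b, w) in edges if p[a] != p[b])
    return alpha * sim + (1 - alpha) * cut

def combined_dynamics(p, d, edges, alpha, cap):
    """Apply improving unilateral moves (capacity-respecting) and swaps until none remains."""
    while True:
        improved = False
        for v in list(p):                                  # unilateral deviations
            for e in cap:
                if e == p[v] or sum(1 for u in p if p[u] == e) >= cap[e]:
                    continue
                q = dict(p); q[v] = e
                if potential(q, d, edges, alpha) < potential(p, d, edges, alpha) - 1e-12:
                    p, improved = q, True
        for v, u in itertools.combinations(list(p), 2):    # bilateral deviations (swaps)
            if p[v] == p[u]:
                continue
            q = dict(p); q[v], q[u] = p[u], p[v]
            if potential(q, d, edges, alpha) < potential(p, d, edges, alpha) - 1e-12:
                p, improved = q, True
        if not improved:
            return p

random.seed(2)
d = {v: {e: random.random() for e in (0, 1)} for v in range(8)}
edges = [(i, i + 1, 1.0) for i in range(7)]
cap = {0: 4, 1: 4}
alpha = 0.5
p0 = {v: v % 2 for v in range(8)}
p_final = combined_dynamics(dict(p0), d, edges, alpha, cap)
# Each accepted deviation strictly lowers the potential, so the loop terminates.
assert potential(p_final, d, edges, alpha) <= potential(p0, d, edges, alpha)
```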

Proposition 2

The PoS in the capacitated game using NEPS is upper bounded by 2.

Proof

Consider any (feasible) assignment S. Let C(S) be the total cost for all users under S, i.e.,

$$\begin{aligned} C(S)=\alpha \cdot {\sum \nolimits _{v\in V} {d(v,p_v)}}+(1-\alpha ) \cdot {\sum \limits _{\begin{array}{c} (v,f) \in E \wedge \\ p_v \ne p_f \end{array}} w(v,f)} \end{aligned}$$

It is straightforward to see that:

$$\begin{aligned} \begin{aligned} \frac{1}{2} C(S)\le \varPhi (S)\le C(S) \end{aligned} \end{aligned}$$
(17)

For the socially optimal assignment OPT, we have by definition that \(C(OPT)\le C(S)\) for any feasible assignment S. Now, let \(S^{\min }\) be the assignment that minimizes the potential function, i.e., \(\varPhi (S^{\min })\le \varPhi (S)\) for any feasible assignment S. An interesting property is that \(S^{\min }\) is NEPS. Indeed, if that were not the case, then by Lemmas 1 and 2 there would be a unilateral or bilateral deviation that further drops the potential function. Inequality (17) together with the above observations implies that \(\frac{1}{2} C(S^{\min })\le \varPhi (S^{\min })\le \varPhi (OPT) \le C(OPT)\), or equivalently, \(\frac{C(S^{\min })}{C(OPT)}\le 2\). The NEPS with the lowest total cost, \(S^{best}\), has a total cost of at most \(C(S^{\min })\), so we conclude that the PoS is upper bounded by 2. \(\square \)
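Inequality (17) can be spot-checked numerically: for any assignment, the potential equals the total cost except that cut edges are counted at half weight, so \(\frac{1}{2} C(S)\le \varPhi (S)\le C(S)\). A small randomized check (hypothetical encoding):

```python
import random

def cost_and_potential(p, d, edges, alpha):
    """Objective Function (1) and Potential Function (3) for the same assignment."""
    sim = sum(d[v][p[v]] for v in p)
    cut = sum(w for (a, b, w) in edges if p[a] != p[b])
    C = alpha * sim + (1 - alpha) * cut          # total cost: full cut weight
    Phi = alpha * sim + (1 - alpha) * 0.5 * cut  # potential: half cut weight
    return C, Phi

random.seed(3)
for _ in range(100):
    alpha = random.random()
    d = {v: {e: random.random() for e in range(4)} for v in range(10)}
    edges = [(a, b, random.random())
             for a, b in [(random.randrange(10), random.randrange(10))
                          for _ in range(15)]
             if a != b]
    p = {v: random.randrange(4) for v in range(10)}
    C, Phi = cost_and_potential(p, d, edges, alpha)
    assert 0.5 * C - 1e-12 <= Phi <= C + 1e-12   # Inequality (17)
```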

Proposition 3

The PoA in the capacitated game using NE and PS is upper bounded by

$$\begin{aligned} \frac{\sum \nolimits _{k=1}^{|P|}{\max \limits _{p\in P}{\varXi ^p_{(k)}}}}{\alpha \cdot \sum \nolimits _{v\in V}{\min \limits _{p\in P}c(v,p)}}. \end{aligned}$$

Proof

Assume an assignment that is NEPS. Consider any pair of users \(u,v\in V\) assigned to two distinct events \(p_u\ne p_v\in P\). We now argue that it cannot simultaneously hold that \(p_u=p^u_1\) and \(p_v=p^v_1\), except for the trivial cases where \(\xi (u,p^u_1)=\xi (u,p^u_2)\) or \(\xi (v,p^v_1)=\xi (v,p^v_2)\). Indeed, if that were the case, users u and v would be incentivized to swap their events, since they are already in the worst possible situation and cannot lose by deviating. For k users assigned to k distinct events, we can similarly show that if a user u is assigned to his worst event \(p^u_1\), then the remaining \(k-1\) users must be assigned to at most their second-worst events (except for the trivial cases where multiple events share the same \(\xi \) value). Among those \(k-1\) users, if one of them, say v, is assigned to his second-worst event \(p^v_2\), then the other \(k-2\) must be assigned to at most their third-worst events (again excluding ties in \(\xi \)). Continuing this argument, if users \(u_1,\dots ,u_{k-1}\) are assigned to their worst, second-worst, up to \((k-1)\)th-worst events, then the kth user can be assigned to at most his kth-worst event.

The above result suggests that if we pick any |P| users assigned to distinct events, then the total cost is upper bounded by the quantity

$$\begin{aligned} \max _{p\in P, v\in V}\{\xi (v,p^v_1)\}+\cdots +\max _{p\in P, v\in V}\{\xi (v,p^v_{|P|})\}. \end{aligned}$$

Note that the values \(\max _{p\in P, v\in V}\{\xi (v,p^v_k)\}\) and \(\max _{p\in P, v\in V}\{\xi (v,p^v_{k'})\}\), \(k\ne k'\), may occur in the same event or for the same user, even though we assumed (i) distinct events and (ii) that a user can only be assigned to a single event in any feasible assignment. This is, however, not a concern, since we are interested in an upper bound.

In a similar manner, we can obtain an upper bound on the worst possible cost of any NEPS as follows. Assume that one event takes the worst possible value \(\varXi ^p_{(1)}\) among all |P| events; another event takes the worst possible value \(\varXi ^p_{(2)}\) among all |P| events; and so on, until the last remaining event takes the worst possible value \(\varXi ^p_{(|P|)}\) among all |P| events. In other words, we can upper bound the total cost of any NEPS equilibrium by the quantity \(\sum \nolimits _{k=1}^{|P|}{\max _{p\in P}{\varXi ^p_{(k)}}}\). On the other hand, the optimal solution has a cost of at least \( \sum \nolimits _{v\in V}{\min _{p\in P}c(v,p)}\), since in the best-case scenario each user v is assigned to his least-cost event and all of v’s friends with positive friendship weight are assigned to the same event.

Since the numerator is an upper bound on the total cost of a NEPS and the denominator a lower bound on the total cost of the optimal assignment, we can upper bound the PoA by

$$\begin{aligned} \frac{\sum \nolimits _{k=1}^{|P|}{\max \limits _{p\in P}{\varXi ^p_{(k)}}}}{\alpha \cdot \sum \nolimits _{v\in V}{\min \limits _{p\in P}c(v,p)}}. \end{aligned}$$

\(\square \)

Lemma 3

For a unilateral deviation in assignment \(\varvec{p}\) where a user v switches from event \(p_v\) to \(p'_v\), the change in Objective Function (1) is:

$$\begin{aligned} \begin{aligned}&\varDelta C_{UNI}(v,p_v,p'_v,\overline{\varvec{p}_v})\\&\quad =\Big (\alpha \cdot d(v,p'_v)+(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f = p_v \end{array}} w(v,f)\Big ) \\&\quad -\Big (\alpha \cdot d(v,p_v)+(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f = p'_v \end{array}} w(v,f)\Big ) \end{aligned} \end{aligned}$$

Proof

Before v deviates, Objective Function (1) is equal to:

$$\begin{aligned} \begin{aligned} \alpha \cdot \sum \limits _{\begin{array}{c} u \in V \\ \wedge u \ne v \end{array}} d(u,p_u) + (1-\alpha )\cdot \sum \limits _{\begin{array}{c} (u,f_u) \in E \\ \wedge u \ne v \\ \wedge p_{f_u} \ne p_u \end{array}} w(u,f_u)\\ + \alpha \cdot d(v,p_v)+ (1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_{f} \ne p_v \end{array}} w(v,f) \end{aligned} \end{aligned}$$
(18)

After v deviates to \(p'_v\), Objective Function (1) becomes:

$$\begin{aligned} \begin{aligned} \alpha \cdot \sum \limits _{\begin{array}{c} u \in V \\ \wedge u \ne v \end{array}} d(u,p_u) + (1-\alpha )\cdot \sum \limits _{\begin{array}{c} (u,f_u) \in E \\ \wedge u \ne v \\ \wedge p_{f_u} \ne p_u \end{array}} w(u,f_u)\\ + \alpha \cdot d(v,p'_v)+ (1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_{f} \ne p'_v \end{array}} w(v,f) \end{aligned} \end{aligned}$$
(19)

If we subtract Eq. (18) from Eq. (19), we have:

$$\begin{aligned} \begin{aligned}&\varDelta C_{UNI}(v,p_v,p'_v,\overline{\varvec{p}_v})\\&\quad =\Big (\alpha \cdot d(v,p'_v)+(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f \ne p'_v \end{array}} w(v,f)\Big ) \\&\quad -\Big (\alpha \cdot d(v,p_v)+(1-\alpha )\cdot \sum \limits _{\begin{array}{c} (v,f) \in E \\ \wedge p_f \ne p_v \end{array}} w(v,f)\Big ) \end{aligned} \end{aligned}$$
(20)

which proves Lemma 3. (The right-hand sides of Eq. (20) and Lemma 3 are equal, for the same reasoning as in the proof of Lemma 1.) \(\square \)
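Analogously to Lemma 1, the LS delta of Lemma 3 uses full edge weights (no \(\frac{1}{2}\) factor). A minimal sketch (hypothetical names) comparing the incremental delta against recomputing Objective Function (1):

```python
import random

def objective(p, d, edges, alpha):
    """Objective Function (1): full edge weight per cut edge (no 1/2 factor)."""
    sim = sum(d[v][p[v]] for v in p)
    cut = sum(w for (a, b, w) in edges if p[a] != p[b])
    return alpha * sim + (1 - alpha) * cut

def delta_c_uni(v, new_ev, p, d, edges, alpha):
    """Lemma 3: incremental change, counting only v's own edges at full weight."""
    gain = alpha * (d[v][new_ev] - d[v][p[v]])
    for (a, b, w) in edges:
        if v not in (a, b):
            continue
        f = b if a == v else a
        if p[f] == p[v]:        # edge becomes cut after the move
            gain += (1 - alpha) * w
        elif p[f] == new_ev:    # edge stops being cut
            gain -= (1 - alpha) * w
    return gain

random.seed(5)
events, alpha = range(3), 0.4
d = {v: {e: random.random() for e in events} for v in range(5)}
edges = [(0, 1, 1.0), (1, 2, 0.5), (2, 3, 2.0), (3, 4, 1.0)]
p = {0: 0, 1: 1, 2: 1, 3: 2, 4: 0}

v, new_ev = 2, 2
inc = delta_c_uni(v, new_ev, p, d, edges, alpha)
before = objective(p, d, edges, alpha)
p[v] = new_ev
assert abs((objective(p, d, edges, alpha) - before) - inc) < 1e-9
```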

Lemma 4

For a bilateral deviation in assignment \(\varvec{p}\) where two users \(v, u \in V\) swap events (v / u from \(p_v/p_u\) to \(p_u/p_v\)), the change in Objective Function (1) is:

$$\begin{aligned} \begin{aligned}&\varDelta C_{BI}(v,u,\varvec{p})\\&\quad =\Big (\tilde{c}_v(p'_v, p_u\cup \overline{\varvec{p}_{vu}})-\tilde{c}_v(p_v, p_u\cup \overline{\varvec{p}_{vu}})+w(v,u)\Big )\\&\quad +\Big (\tilde{c}_u(p'_u,p_v\cup \overline{\varvec{p}_{vu}})-\tilde{c}_u(p_u,p_v\cup \overline{\varvec{p}_{vu}})+w(v,u)\Big ) \end{aligned} \end{aligned}$$

Proof

In a similar manner to the proof of Lemma 2, we express the bilateral deviation as two unilateral deviations. The only difference is the doubled friendship term, because in LS the agent cost does not include the factor \(\frac{1}{2}\) used in GAME. \(\square \)

Proposition 4 describes the complexity of INIT; it uses the following lemma.

Lemma 5

The expected number of trials to draw with replacement m distinct items from a set of \(M\ge m\) items is \(M\cdot (H_{M}-H_{M-m})\), where \(H_i=\sum _{k=1}^{i}\frac{1}{k}\) is the i-th Harmonic number (with \(H_0=0\)).

Proof

Let the random variable \(X_i\), \(1\le i\le m\), be the number of trials to draw the ith item after the first \(i-1\) have been obtained already. The total number of trials is then: \(X=X_1+\ldots +X_m\). By linearity of expectation, the expected number of trials to see all m items will then be: \(E[X]=E[X_1]+\ldots +E[X_m]\). Consider the random variable \(X_i\). Since all items are equally likely to be drawn, the probability of drawing an item different from the \(i-1\) already seen is: \(P_i=1-\frac{i-1}{M}=\frac{M-i+1}{M}\). Given the trials are independent (Bernoulli), the expected number of draws to see item i given the first \(i-1\) is \(\frac{1}{P_i}=\frac{M}{M-i+1}\). But then \(E[X]=E[X_1]+\ldots +E[X_m]=\sum _{i=1}^{m}\frac{M}{M-i+1}=M\cdot (H_{M}-H_{M-m})\). \(\square \)
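Lemma 5 is the partial coupon-collector bound; it can be validated by simulation. The sketch below (illustrative parameters) compares the closed form against an empirical average:

```python
import random

def harmonic(i):
    """i-th harmonic number H_i, with H_0 = 0."""
    return sum(1.0 / k for k in range(1, i + 1))

def expected_trials(M, m):
    """Lemma 5: expected draws (with replacement) to see m distinct of M items."""
    return M * (harmonic(M) - harmonic(M - m))

def simulate(M, m, runs=20000, seed=4):
    """Empirical average number of draws over many independent runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        seen, trials = set(), 0
        while len(seen) < m:
            seen.add(rng.randrange(M))
            trials += 1
        total += trials
    return total / runs

M, m = 20, 12
exact = expected_trials(M, m)
approx = simulate(M, m)
assert abs(approx - exact) / exact < 0.02   # within 2% over 20k runs
```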

Proposition 4

For \(n\ll |V|\), INIT has a time complexity of \(O(\max (|V|n,|E||P|, |V||P|\log (|P|)))\), and a space requirement of \(\varTheta (|V||P|)\), where n, |V|, |E|, |P| are the number of samples, nodes, edges and classes, respectively.

Proof

The total running time of INIT consists of three components. The first concerns initialization (Lines 1–6) and heap creation. During initialization, obtaining c(v, p) for a single user v and event p (Lines 3–6) incurs \(1+deg_v\) computations, where \(deg_v\) is the degree of v. Repeating the process for all |P| events and summing over all users yields complexity \(\sum \nolimits _{v\in V} {|P|(1+deg_v)}=|V||P|+2|E||P| = O((|V|+|E|)|P|)\). Recall that Phase 2 uses the user-event costs from Phase 1, without requiring initialization. Regarding the complexity of heap creation, since the size of each user heap is O(|P|), heapifying the costs of all users requires O(|V||P|), which is dominated by \(O((|V|+|E|)|P|)\).

The second component of the cost is due to the while iterations (Lines 7–24) of both phases. Observe that the total number of iterations equals the number of users, since every iteration performs a non-revocable assignment of a user. An iteration (i) finds the minimum among the tops of the heaps of the n samples (Lines 13–17), (ii) updates the cost heaps of each friend of the user v to be assigned (Lines 18–20), and (iii) if the assigned event closes, removes it from the heaps of all unassigned users (Lines 21–24). Summed over all iterations (i.e., over all |V| users), the total cost of (i) is O(|V|n). Item (ii) involves updating the heaps of the \(deg_v\) friends of v. The cost over all users is \(\sum \nolimits _{v\in V} {deg_v\log (|P|)} = 2|E|\log (|P|)=O(|E|\log (|P|))\). Regarding (iii), observe that an event closes at least once (when it reaches its minimum capacity during Phase 1), and at most twice (if it also reaches its maximum capacity during Phase 2). When this happens, the event must be removed from the heaps of at most |V| users, with cost \(O(|V|\log (|P|))\). Repeating for all events yields \(O(|V||P|\log (|P|))\). Combining items (i)–(iii), the complexity of the second component is \(O(|V|n+(|V||P|+|E|)\log (|P|))\).

Recall that each iteration of INIT selects n distinct users from the set \(V_{un}\). If users are drawn equiprobably with replacement, then by Lemma 5 the expected number of draws until n users are selected is \(|V_{un}|(H_{|V_{un}|}-H_{|V_{un}|-n})\). INIT performs sampling when \(|V_{un}|\in \{n+1,n+2,\ldots ,|V|\}\); for \(|V_{un}|\le n\) it simply considers all users in \(V_{un}\). So, the total number of draws is \(\sum _{k=n+1}^{|V|}k(H_{k}-H_{k-n})=\sum _{k=n+1}^{|V|}k(\frac{1}{k-n+1}+\cdots +\frac{1}{k})\). Observe that each term \(\frac{1}{k}\) occurs at most n times, with coefficients \(k,k+1,\ldots ,k+n-1\). The last expression can therefore be bounded by \(\sum _{k=1}^{|V|}\frac{1}{k}\big (k+(k+1)+\cdots +(k+n-1)\big )\le \sum _{k=1}^{|V|}\frac{1}{k} n(k+n-1)=n\sum _{k=1}^{|V|}(1+\frac{n-1}{k})=n|V|+n(n-1)\sum _{k=1}^{|V|}\frac{1}{k}\). Since \(\sum _{k=1}^{|V|}\frac{1}{k}=H_{|V|}=O(\log (|V|))\), the required number of samples in expectation is \(O(|V|n+n^2\log (|V|))\).

Considering that in practice \(n\ll |V|\), the complexity of INIT, including all three components, can be simplified to \(O(\max (|V|n,|E||P|, |V||P|\log (|P|)))\). The space requirements are \(\varTheta (|V||P|)\) because of the |V| user heaps, each of size |P|, and the cost table |V||P|. \(\square \)

Proposition 5

The time complexity of a super-round of unilateral and bilateral deviations is \(O(|P|^2|V|(\log (|P|) +\log (|V|)))\).

Proof

Regarding the complexity of UNI, each round considers all users. In the worst case, for each user v we must scan the entire heap of size |P| in order to find a valid event to re-assign v. After re-assignment, the costs of \(deg_v\) friends change, triggering \(deg_v\) heap operations in the user heaps (resp. event pair heaps) with complexity \(O(deg_v\log (|P|))\) (resp. \(O(deg_v \log (|V|))\)). Summing over all users, the complexity of a single round is \(\sum \nolimits _{v\in V} {(|P|+deg_v(\log (|P|)+\log (|V|)))}=O(|V||P|+|E|(\log (|P|)+\log (|V|)))\). Similarly, for BI, each swap between users v and u updates the costs of their friends (and possibly the costs of v, u themselves, if they are connected) for both events, triggering at most \(deg_v+deg_u+2\) heap operations (in both the user and event pair heaps) per swap. Given that \(deg_v\) and \(deg_u\) are O(|V|) and assuming that all event pairs incur swaps, the complexity of a round of BI is \(O(|P|^2|V|(\log (|P|)+\log (|V|)))\). \(\square \)

The bound in Proposition 5 is loose, because in practice only a small fraction of event pairs causes swaps and \(deg_v\ll |V|\).

About this article


Cite this article

Ntaflos, L., Trimponias, G. & Papadias, D. A unified agent-based framework for constrained graph partitioning. The VLDB Journal 28, 221–241 (2019). https://doi.org/10.1007/s00778-018-0526-5
