
An Information Theoretic Perspective for Heterogeneous Subgraph Federated Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13943)

Abstract

Mining graph data has gained wide attention in modern applications. With the explosive growth of graph data, it is common for graphs to be collected and stored in separate, distinct systems. These local graphs cannot be shared directly due to privacy and bandwidth concerns, so a federated learning approach is needed to collaboratively train a powerful, generalizable model. However, the local subgraphs are usually heterogeneously distributed, and such heterogeneity poses challenges for subgraph federated learning. In this work, we analyze subgraph federated learning and find that sub-optimal objectives under the FedAVG training setting degrade the performance of GNNs. To this end, we propose InfoFedSage, a federated subgraph learning framework guided by the Information Bottleneck principle to alleviate the non-IID issue. Experiments on public datasets demonstrate the effectiveness of InfoFedSage under heterogeneous subgraph federated learning.
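The page does not reproduce the method details, but the high-level idea stated above, FedAvg-style training in which each client's local objective is regularized by an Information Bottleneck term, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `IBEncoder`, `local_update`, and `fedavg_round` names, the Gaussian reparameterized embedding, and the `beta` coefficient are hypothetical stand-ins in the spirit of the standard variational information bottleneck.

```python
# Minimal sketch (assumption, not the authors' code): FedAvg where each client
# trains a small encoder/classifier with a variational IB penalty.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class IBEncoder(nn.Module):
    """Encodes node features into a stochastic embedding z ~ N(mu, sigma^2)."""

    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.backbone = nn.Linear(in_dim, hid_dim)   # stand-in for a GNN layer
        self.mu = nn.Linear(hid_dim, hid_dim)
        self.logvar = nn.Linear(hid_dim, hid_dim)
        self.classifier = nn.Linear(hid_dim, num_classes)

    def forward(self, x):
        h = F.relu(self.backbone(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.classifier(z), mu, logvar


def local_update(model, x, y, beta=1e-3, lr=1e-2, epochs=1):
    """One client's local step: cross-entropy + beta * KL (IB compression term)."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        logits, mu, logvar = model(x)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        loss = F.cross_entropy(logits, y) + beta * kl
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.state_dict()


def fedavg_round(global_model, client_data, beta=1e-3):
    """Standard FedAvg aggregation (uniform weights) over IB-regularized clients."""
    states = [local_update(global_model, x, y, beta) for x, y in client_data]
    avg = {k: torch.stack([s[k] for s in states]).mean(0) for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model


if __name__ == "__main__":
    # Toy usage: two "clients" with random node features and labels.
    torch.manual_seed(0)
    clients = [(torch.randn(32, 16), torch.randint(0, 3, (32,))) for _ in range(2)]
    model = IBEncoder(in_dim=16, hid_dim=8, num_classes=3)
    for _ in range(3):
        model = fedavg_round(model, clients)
    print("finished 3 federated rounds")
```

In this sketch the server simply averages client weights as in FedAvg, while the KL term encourages each client's node embeddings to discard client-specific noise; the actual InfoFedSage objective and architecture are described in the full paper.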

J. Guo and S. Li—Equal contribution.



Author information

Corresponding author

Correspondence to Shangyang Li.

A Appendix

A.1 Proof for Proposition 4.1

We first state a lemma.

Lemma A.1. Given that X has n states (\(x_1, x_2, \cdots , x_n\)), where \(x_1\) can be divided into k sub-states (\(x_{11}, x_{12}, \cdots , x_{1 k}\)), and Y has m states (\(y_1, y_2, \cdots , y_m\)), we have

$$\begin{aligned} \begin{aligned}&I\left( x_{11}, x_{12}, \cdots , x_{1 k}, x_{2}, \cdots , x_{n} ; Y\right) \\&=\,p\left( x_{1}\right) \cdot I\left( x_{11}, x_{12}, \cdots , x_{1 k} ; Y\right) +I\left( x_{1}, x_{2}, \cdots , x_{n} ; Y\right) \end{aligned} \end{aligned}$$
(15)

Proof. The mutual information between (\(x_{11}, x_{12}, \cdots , x_{1 k}\)) and Y is:

$$\begin{aligned} \begin{aligned}&I\left( x_{11}, x_{12}, \cdots , x_{1 k} ; Y\right) \\&=\,H\left( x_{11}, x_{12}, \cdots , x_{1 k}\right) -H\left( x_{11}, x_{12}, \cdots , x_{1 k} \mid Y\right) \\ \end{aligned} \end{aligned}$$
(16)

Then, we have

$$\begin{aligned} \begin{aligned}&I\left( x_{11}, x_{12}, \cdots , x_{1 k}, x_{2}, \cdots , x_{n} ; Y\right) \\&= -\,\sum _{t=1}^{k} p\left( x_{1 t}\right) \log \frac{p\left( x_{1 t}\right) }{p\left( x_{1}\right) }+\sum _{t=1}^{k} \sum _{j=1}^{m} p\left( x_{1 t} y_{j}\right) \log \frac{p\left( x_{1 t} y_{j}\right) }{p\left( x_{1} y_{j}\right) }\\&+\,I\left( x_{1}, x_{2}, \cdots , x_{n} ; Y\right) \\&=\, p\left( x_{1}\right) \cdot I\left( x_{11}, x_{12}, \cdots , x_{1 k} ; Y\right) +I\left( x_{1}, x_{2}, \cdots , x_{n} ; Y\right) \end{aligned} \end{aligned}$$
(17)

Since \(p\left( x_{1}\right) \cdot I\left( x_{11}, x_{12}, \cdots , x_{1 k} ; Y\right) \ge 0\), we obtain Corollary A.1:

$$\begin{aligned} I\left( x_{11}, x_{12}, \cdots , x_{1 k}, x_{2}, \cdots , x_{n} ; Y\right) \ge I\left( x_{1}, x_{2}, \cdots , x_{n} ; Y\right) \end{aligned}$$
(18)
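As a sanity check (not from the paper), the decomposition in Lemma A.1 and the inequality in Corollary A.1 can be verified numerically on a small random joint distribution, interpreting \(I(x_{11}, \cdots , x_{1k} ; Y)\) as the mutual information under the conditional distribution given \(x_1\), which is how Eq. (17) expands it.

```python
# Numeric illustration (assumption: the conditional-distribution reading above).
import numpy as np


def mutual_information(joint):
    """I(A; B) in nats for a joint probability table p(a, b)."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))


rng = np.random.default_rng(0)

# Refined variable: x_1 split into 2 sub-states (rows 0-1), plus x_2, x_3 (rows 2-3).
joint_refined = rng.random((4, 3))
joint_refined /= joint_refined.sum()

# Coarse variable: merge the two sub-states of x_1 into a single row.
joint_coarse = np.vstack([joint_refined[:2].sum(axis=0), joint_refined[2:]])

# Conditional joint of (sub-state of x_1, Y) given x_1.
p_x1 = joint_refined[:2].sum()
joint_sub_given_x1 = joint_refined[:2] / p_x1

lhs = mutual_information(joint_refined)
rhs = p_x1 * mutual_information(joint_sub_given_x1) + mutual_information(joint_coarse)

print(f"I(refined; Y)              = {lhs:.6f}")
print(f"p(x1)*I(sub; Y) + I(coarse; Y) = {rhs:.6f}")   # matches lhs -> Lemma A.1
print(f"I(coarse; Y)               = {mutual_information(joint_coarse):.6f}")  # <= lhs -> Corollary A.1
```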

We restate Proposition 4.1: For \(Z_X' = Z_{X_1} \cup \cdots \cup Z_{X_m}\) and \(Y' = Y_{1} \cup \cdots \cup Y_{m}\), we have

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_{\text {FedAVG}} \ge -\frac{1}{m}\sum _{i=1}^{m} I(Z_{X_i}, Y_{i}) \ge -I(Z_X', Y'). \end{aligned} \end{aligned}$$
(19)

Proof. We first consider the first inequality. By definition, the mutual information can be written as

$$\begin{aligned} \begin{aligned} I(Z_{X_i}, Y_i)=\sum p(y_i, z_{X_i}) \log p(y_i \mid z_{X_i})+H(Y_i) \end{aligned} \end{aligned}$$
(20)

Notice that the label entropy \(H(Y_i)\) does not depend on the optimization procedure and is non-negative. Therefore, we have

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_{\textrm{FedAVG}} =&-\frac{1}{m}\sum _{i=1}^{m} I(Z_{X_i}, Y_i) + \frac{1}{m} \sum _{i=1}^{m} H(Y_i) \ge -\frac{1}{m} \sum _{i=1}^{m} I(Z_{X_i}, Y_{i}). \end{aligned} \end{aligned}$$
(21)

Next, we consider the second inequality. Let \(i^* = \arg \max _{i} I(Z_{X_i}, Y_{i})\). Since \(Z_X'\) refines \(Z_{X_{i^*}}\) and \(Y'\) refines \(Y_{i^*}\), Corollary A.1 gives

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_{\textrm{FedAVG}} \ge&-\frac{1}{m}\sum _{i=1}^{m} I(Z_{X_i}, Y_{i}) \\ \ge&-\max _{i} I(Z_{X_i}, Y_{i}) = -I(Z_{X_{i^*}}, Y_{i^*}) \ge -I(Z_X', Y_{i^*}) \ge -I(Z_X', Y') \end{aligned} \end{aligned}$$
(22)
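The first inequality can also be made concrete numerically: for any predictor \(q(y \mid z)\), the expected cross-entropy is at least \(H(Y \mid Z) = H(Y) - I(Z ; Y)\), with equality when \(q(y \mid z) = p(y \mid z)\). The following small numpy check (an illustration, not from the paper) verifies this for a random joint distribution and an arbitrary predictor.

```python
# Toy check: E[-log q(y|z)] >= H(Y|Z) = H(Y) - I(Z; Y), matching Eq. (20)-(21).
import numpy as np

rng = np.random.default_rng(1)

# Random joint distribution p(z, y) over 4 representation states and 3 labels.
p_zy = rng.random((4, 3))
p_zy /= p_zy.sum()
p_z = p_zy.sum(axis=1, keepdims=True)
p_y = p_zy.sum(axis=0)
p_y_given_z = p_zy / p_z

entropy_y = -np.sum(p_y * np.log(p_y))
cond_entropy = -np.sum(p_zy * np.log(p_y_given_z))   # H(Y|Z)
mi = entropy_y - cond_entropy                        # I(Z; Y)

# An arbitrary (mis-specified) predictor q(y|z).
q = rng.random((4, 3))
q /= q.sum(axis=1, keepdims=True)
cross_entropy = -np.sum(p_zy * np.log(q))            # E[-log q(y|z)]

print(f"cross-entropy = {cross_entropy:.4f}")
print(f"H(Y) - I(Z;Y) = {entropy_y - mi:.4f}")       # the lower bound H(Y|Z)
assert cross_entropy >= entropy_y - mi - 1e-12
```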


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Guo, J., Li, S., Zhang, Y. (2023). An Information Theoretic Perspective for Heterogeneous Subgraph Federated Learning. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13943. Springer, Cham. https://doi.org/10.1007/978-3-031-30637-2_50


  • DOI: https://doi.org/10.1007/978-3-031-30637-2_50


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30636-5

  • Online ISBN: 978-3-031-30637-2

