Abstract:
This article considers a federated temporal difference (TD) learning algorithm and provides both asymptotic and finite-time analyses. To protect each worker agent's cost information from possible attackers, we propose a privacy-preserving variant of the algorithm that adds perturbation to the exchanged information. We establish a rigorous differential privacy guarantee using the moments accountant and derive an upper bound on the utility loss of the privacy-preserving algorithm. Evaluations are also provided to corroborate the efficiency of the algorithms.
Published in: IEEE Transactions on Parallel and Distributed Systems (Volume: 33, Issue: 11, 01 November 2022)