Abstract:
In many mobile applications, user-generated data are represented as set-valued data. Local differential privacy has attracted substantial attention as a way to address the privacy threats that arise when analyzing these valuable data. However, existing approaches provide only sub-optimal utility and are expensive in computation and communication for set-valued data distribution estimation and heavy-hitter identification. In this paper, we propose a utility-optimal and efficient set-valued data publication method, the Wheel mechanism. On the user side, the computational complexity is only $O(\min\{m\log m,\ m e^{\epsilon}\})$ and the communication cost is $O(\epsilon + \log m)$ bits, where $m$ is the number of items, $d$ is the domain size, and $\epsilon$ is the privacy budget, whereas the costs of existing approaches usually depend on $O(d)$ or $O(\log d)$ (with $d \gg m$). Our theoretical analyses show that the estimation error is reduced from the previously known $O(\frac{m^{2}d}{n\epsilon^{2}})$ to the optimal rate $O(\frac{md}{n\epsilon^{2}})$. Additionally, for heavy-hitter identification, we present a variant of the Wheel mechanism as an efficient frequency oracle, entailing only $O(\sqrt{n})$ computational complexity. This heavy-hitter protocol achieves an identification bar of $\tilde{O}(\frac{1}{\epsilon}\sqrt{\frac{m}{n}\log d})$, a reduction by a factor of $\sqrt{m}$ relative to existing protocols. Extensive experiments demonstrate that our methods are 3-100x faster than existing approaches and offer optimized statistical efficiency.
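To get a feel for the orders of growth quoted above, the expressions can be evaluated numerically. The following is a minimal sketch, not an implementation of the Wheel mechanism: it assumes every hidden constant equals 1, uses natural logarithms, ignores the extra logarithmic factors hidden by the tilde notation, and treats $n$ as the number of users. The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def user_side_costs(m, eps):
    """Stated user-side orders with constants dropped (an assumption)."""
    compute = min(m * math.log(m), m * math.exp(eps))  # O(min{m log m, m e^eps})
    comm_bits = eps + math.log(m)                      # O(eps + log m) bits
    return compute, comm_bits


def estimation_error_rates(m, d, n, eps):
    """Previously known rate and optimal rate from the abstract, constants dropped."""
    previous = m ** 2 * d / (n * eps ** 2)  # O(m^2 d / (n eps^2))
    optimal = m * d / (n * eps ** 2)        # O(m d / (n eps^2))
    return previous, optimal


def identification_bar(m, d, n, eps):
    """Heavy-hitter identification bar, ignoring constants and polylog factors."""
    return (1.0 / eps) * math.sqrt(m / n * math.log(d))


if __name__ == "__main__":
    # Hypothetical setting: 16 items, a domain of 2^20 values, one million users.
    m, d, n, eps = 16, 2 ** 20, 10 ** 6, 1.0
    previous, optimal = estimation_error_rates(m, d, n, eps)
    print("user-side (compute, bits):", user_side_costs(m, eps))
    print("error-rate ratio previous/optimal:", previous / optimal)
    print("identification bar:", identification_bar(m, d, n, eps))
```

Under these assumptions the ratio of the two error rates is exactly $m$, which matches the stated improvement, and the identification bar is the quantity that the abstract reports as smaller by a factor of $\sqrt{m}$ than that of existing protocols.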
Published in: IEEE Transactions on Mobile Computing (Volume: 23, Issue: 8, August 2024)