Degree-Preserving Randomized Response for Graph Neural Networks under Local Differential Privacy
Seira Hidano(a), Takao Murakami(b),(c),(*)
Transactions on Data Privacy 17:2 (2024) 89–121
(a) KDDI Research, Inc., 2-1-15 Ohara, Fujimino, Saitama, 356-8502, Japan.
(b) ISM, 10-3 Midori-cho, Tachikawa, Tokyo, 190-8562, Japan.
(c) AIST, 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
e-mail: se-hidano@kddi.com; tmura@ism.ac.jp
Abstract
Differentially private GNNs (Graph Neural Networks) have recently been studied to provide high accuracy in various tasks on graph data while strongly protecting user privacy. In particular, a recent study proposes an algorithm that protects each user's feature vector in an attributed graph (a graph that includes feature vectors along with node IDs and edges) with LDP (Local Differential Privacy), a strong privacy notion that requires no trusted third party. However, this algorithm does not protect edges (friendships) in a social graph and therefore cannot protect user privacy in unattributed graphs, which include only node IDs and edges. How to provide strong privacy with high accuracy in unattributed graphs remains open. In this paper, we propose a novel LDP algorithm called the DPRR (Degree-Preserving Randomized Response) to provide LDP for edges in GNNs. Our DPRR preserves each user's degree, and hence the graph structure, while providing edge LDP. Technically, our DPRR uses Warner's RR (Randomized Response) and strategic edge sampling, where each user's sampling probability is automatically tuned via the Laplacian mechanism to preserve the degree information under edge LDP. We also propose a privacy budget allocation method that keeps the noise in both Warner's RR and the Laplacian mechanism small. We focus on graph classification as a task of GNNs and evaluate the DPRR on three social graph datasets. Our experimental results show that the DPRR significantly outperforms three baselines and provides accuracy close to a non-private algorithm on all datasets with a reasonable privacy budget, e.g., ε = 1. Finally, we introduce data poisoning attacks against our DPRR and a defense against these attacks. We evaluate them on the three social graph datasets and discuss the experimental results.
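The full mechanism and its privacy analysis appear in the body of the paper; as a rough illustration only, the Python sketch below shows the general shape of a DPRR-style mechanism for a single user's adjacency vector. The function name dprr_sketch, the two-way budget split (eps1 for Warner's RR, eps2 for the noisy degree), and the degree-matching formula for the sampling probability q are assumptions made for this sketch, not the authors' exact algorithm.

```python
import numpy as np

def dprr_sketch(adj_bits, eps1, eps2, rng=None):
    """Illustrative DPRR-style mechanism for one user's adjacency vector.

    adj_bits: 0/1 numpy array of length n-1 (1 = edge to that other user).
    eps1, eps2: assumed split of the privacy budget between Warner's RR
    and the Laplacian mechanism (the paper's allocation may differ).
    """
    rng = rng or np.random.default_rng()
    m = adj_bits.size

    # Warner's RR: keep each bit with probability p = e^eps1 / (e^eps1 + 1),
    # flip it otherwise; this randomizes every potential edge independently.
    p = np.exp(eps1) / (np.exp(eps1) + 1.0)
    noisy = np.where(rng.random(m) < p, adj_bits, 1 - adj_bits)

    # Laplacian mechanism: noisy degree (the degree has sensitivity 1
    # under edge LDP, since changing one edge changes it by one).
    noisy_deg = max(adj_bits.sum() + rng.laplace(0.0, 1.0 / eps2), 0.0)

    # Edge sampling: in a sparse graph, RR inflates the number of 1s
    # (each non-edge becomes 1 with probability 1 - p), so keep each
    # reported edge with a probability q chosen so that the expected
    # number of reported edges matches the noisy degree.
    expected_ones = noisy_deg * p + (m - noisy_deg) * (1.0 - p)
    q = min(noisy_deg / expected_ones, 1.0) if expected_ones > 0 else 0.0

    return noisy * (rng.random(m) < q)
```

Tuning q from the Laplacian-noised degree rather than the true degree is what lets the sampling step itself stay within the edge LDP guarantee; the paper's budget allocation method additionally shows how to split the total budget between the two mechanisms so that both noise sources remain small.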