Recommendation with k-Anonymized Ratings
Jun Sakuma(a),(b),(*), Tatsuya Osame(a)
Transactions on Data Privacy 11:1 (2018) 47 - 60
(a) University of Tsukuba, 1-1-1 Tennoh-dai, Tsukuba, Ibaraki, 305-8573, Japan.
(b) RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan.
e-mail: jun@cs.tsukuba.ac.jp; osame@mdl.cs.tsukuba.ac.jp
Abstract
Recommender systems are widely used to predict personalized preferences for goods or services using users' past activities, such as item ratings or purchase histories. If collections of such personal activities were made publicly available, they could be used to personalize a diverse range of services, including targeted advertisements or recommendations. However, there would be an accompanying risk of privacy violations. The pioneering work of Narayanan et al. demonstrated that even if identifiers are eliminated, the public release of user ratings can allow users to be identified by adversaries who hold only a small amount of data on those users' past ratings. In this paper, we assume the following setting. A collector collects user ratings, then anonymizes and distributes them. A recommender constructs a recommender system based on the anonymized ratings provided by the collector. Based on this setting, we exhaustively list the models of recommender systems that use anonymized ratings. For each model, we then present an item-based collaborative filtering algorithm for making recommendations based on anonymized ratings. Our experimental results show that item-based collaborative filtering based on anonymized ratings can perform better than collaborative filtering with non-anonymized ratings under certain conditions. This surprising result indicates that, in some settings, privacy protection does not necessarily reduce the usefulness of recommendations. From the experimental analysis of this counterintuitive result, we observed that anonymization can reduce the sparsity of the ratings, and that the variance of the prediction error can be reduced if k, the anonymization parameter, is appropriately tuned.
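To make the abstract's setting concrete, the following is a minimal sketch of plain item-based collaborative filtering, the baseline technique the paper builds on. It is not the paper's anonymization-aware algorithm: the function name, the cosine similarity over co-rated users, the neighborhood size, and the toy rating matrix below are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def item_based_predict(R, user, item, k_neighbors=2):
    """Predict R[user, item] by item-based collaborative filtering.

    R is a (users x items) rating matrix with 0 meaning "unrated".
    Item-item similarity is cosine similarity restricted to users who
    rated both items; the prediction is a similarity-weighted average
    of the target user's ratings on the most similar items.
    (Illustrative sketch, not the paper's anonymized-ratings method.)
    """
    # Items this user has rated, excluding the target item itself.
    rated = np.where(R[user] > 0)[0]
    rated = rated[rated != item]
    if rated.size == 0:
        return 0.0

    target = R[:, item]
    sims = []
    for j in rated:
        other = R[:, j]
        mask = (target > 0) & (other > 0)  # co-rated users only
        if not mask.any():
            sims.append(0.0)
            continue
        a, b = target[mask], other[mask]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        sims.append(float(a @ b / denom) if denom > 0 else 0.0)
    sims = np.array(sims)

    # Keep the k most similar co-rated items as the neighborhood.
    top = np.argsort(sims)[::-1][:k_neighbors]
    if sims[top].sum() <= 0:
        return float(R[user, rated].mean())
    return float((sims[top] @ R[user, rated[top]]) / sims[top].sum())
```

On a small hypothetical matrix such as `R = np.array([[5,4,0,1],[4,5,1,2],[1,2,5,4],[2,1,4,5]], dtype=float)`, `item_based_predict(R, 0, 2)` returns a rating within the 1-5 scale, pulled toward the low ratings user 0 gave to the items most similar to item 2. The sparsity effect mentioned in the abstract matters here: the fewer co-rated users two items share, the less reliable these similarity estimates become.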