Analysis and Performance Enhancement to Achieve Recursive (c, l) Diversity Anonymization in Social Networks
Saptarshi Chakraborty(a),(*), John George Ambooken(a), B. K. Tripathy(a), Swarnalatha Purushotham(a)
Transactions on Data Privacy 8:2 (2015) 173 - 215
Abstract, PDF
(a)
School of Computing Science and Engineering, VIT University, Vellore - 632014,
Tamil Nadu, India.
e-mail:sssaptarshii @gmail.com; jxgamebook @gmail.com; tripathybk @vit.ac.in; pswarnalatha @vit.ac.in
|
Abstract
Protecting the identities of the actors along with their sensitive information has become a matter of concern for the organizations which are publishing huge amounts of data every day for the purpose of research. Recent studies have shown that simply removing the sensitive labels associated with the actors do not guarantee their privacy protection. The structural property of the graph associated with the network or the information about the degree of the nodes in it can also be used by an adversary to identify a particular actor. Anonymization algorithms available for social networks are relatively less in number as compared to those for micro-data. The reasons being that the process of anonymizing social networks is much more complex and also as the structural property of the original graph should be taken care and more or less to be retained in the anonymized graph to minimize the loss of information. In recent studies, the original micro-data anonymization concepts of k-anonymity and l-diversity have been extended to the social network environment. The ldiversity model can protect the identity of the users as well as the sensitive labels associated with them. Out of the three different versions of l-diversity available, the recursive (c, l) diversity is much more complex than the mostly handled distinct l-diversity. M.Yuan et al (2013) have developed a recursive (c, l) diversity algorithm using the noise node approach. In this paper, we point out some drawbacks in this algorithm and propose an improved algorithm, which removes these drawbacks and generates the anonymized graph with minimal number of noise nodes. In our approach, we use the noise node addition concept as it retains the structural property of the original graph and also preserves the data utility of the anonymized graph. We have tested our algorithm against several real datasets to justify its efficiency. Also, we have established two theorems in order to establish some characteristics which have been used in support of our claims.
|