Lightning: Utility-Driven Anonymization of High-Dimensional Data
Fabian Prasser(a),(*), Raffael Bild(a), Johanna Eicher(a), Helmut Spengler(a), Florian Kohlmayer(a), Klaus A. Kuhn(a)
Transactions on Data Privacy 9:2 (2016) 161 - 185
(a) Chair of Biomedical Informatics, Department of Medicine, Technical University of Munich (TUM), Germany.
e-mail: firstname.lastname@tum.de
Abstract
The ARX Data Anonymization Tool is software for privacy-preserving microdata publishing. It implements methods of statistical
disclosure control and supports a wide variety of privacy models, which are used to specify disclosure risk thresholds.
Data is mainly transformed with a combination of two methods: (1) global recoding with full-domain generalization of attribute
values followed by (2) local recoding with record suppression. Within this transformation model, given a
dataset with low dimensionality, it is feasible to compute an optimal solution with minimal
loss of data quality. However, combinatorial complexity renders this approach impracticable for high-dimensional data. In this article, we
describe the Lightning algorithm, a simple, yet effective, utility-driven heuristic search strategy which we have implemented in
ARX for anonymizing high-dimensional datasets. Our work improves upon existing methods because it is not tailored towards specific
models for measuring disclosure risks and data utility. We have performed an extensive experimental evaluation in which we have
compared our approach to state-of-the-art heuristic algorithms and a globally-optimal search algorithm. In this process, we have
used several real-world datasets, different models for measuring data utility and a wide variety of privacy models.
The results show that our method outperforms previous approaches in terms of output quality, even when
using k-anonymity, which is the model for which previous work has been designed.
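
The transformation model described in the abstract (full-domain generalization followed by record suppression) can be illustrated with a minimal sketch. This is not the ARX API and not the Lightning search itself. The hierarchies, function names, and sample records below are hypothetical and chosen only for illustration, with k-anonymity standing in as the privacy model.

```python
from collections import Counter

# Hypothetical generalization hierarchies (assumptions for illustration).
# Level 0 always returns the original value.
def generalize_age(age, level):
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"      # ten-year interval
    return "*"                          # fully generalized

def generalize_zip(zip_code, level):
    level = min(level, len(zip_code))
    keep = len(zip_code) - level
    return zip_code[:keep] + "*" * level  # mask the last `level` digits

def anonymize(records, levels, k):
    """Apply one full-domain transformation, then suppress small groups.

    Step 1: every value of an attribute is generalized to the same level
            (full-domain generalization).
    Step 2: records whose generalized combination occurs fewer than k
            times are removed (record suppression).
    """
    age_level, zip_level = levels
    generalized = [
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level))
        for age, zip_code in records
    ]
    counts = Counter(generalized)
    kept = [row for row in generalized if counts[row] >= k]
    suppressed = len(generalized) - len(kept)
    return kept, suppressed

# Made-up sample data: (age, ZIP code)
records = [
    (34, "81675"), (36, "81677"), (31, "81679"),
    (52, "80331"), (58, "80333"), (47, "85748"),
]

kept, suppressed = anonymize(records, levels=(1, 2), k=2)
print(kept)
print("suppressed records:", suppressed)
```

In this sketch, each choice of `levels` is one candidate transformation. With three age levels and six ZIP levels there are 18 candidates, and the count multiplies with every additional attribute, which is the combinatorial growth that makes an exhaustive search impracticable for high-dimensional data.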