Transactions on Data Privacy
Foundations and Technologies

http://www.tdp.cat


Volume 10 Issue 3


Measuring Rule Retention in Anonymized Data - When One Measure Is Not Enough

Sam Fletcher(a),(*), Md Zahidul Islam(a)

Transactions on Data Privacy 10:3 (2017) 175 - 201


(a) School of Computing and Mathematics, Charles Sturt University, Bathurst 2795, Australia.

e-mail: sam.pt.fletcher@gmail.com; zislam@csu.edu.au


Abstract

In this paper, we explore how anonymizing data to preserve privacy affects the utility of the classification rules discoverable in the data. For an analysis of anonymized data to provide useful results, the data should retain as much of the information contained in the original data as possible. Therein lies a problem - how does one make sure that anonymized data still contains the information it had before anonymization? This question is not the same as asking whether an accurate classifier can be built from the anonymized data. Often in the literature, the prediction accuracy of a classifier built from anonymized data is used as evidence that the data are similar to the original. We demonstrate that this is not the case, and we propose a new methodology for measuring the retention of the rules that existed in the original data. We then use our methodology to design three measures that can be easily implemented, each measuring aspects of the data that no pre-existing techniques can measure. These measures do not negate the usefulness of prediction accuracy or other measures - they are complementary to them, and support our argument that one measure is almost never enough.

* Corresponding author.
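To illustrate the distinction the abstract draws, the sketch below contrasts two views of utility: the prediction accuracy of a classifier trained on anonymized data, and a crude check of how many of the rules found in the original data survive anonymization. This is not the authors' methodology; the rule extraction (scikit-learn decision trees), the toy "anonymization" (rounding attribute values), and the exact-match retention score are all hypothetical stand-ins chosen only to show that the two numbers can tell very different stories.

```python
# Minimal sketch: prediction accuracy vs. naive rule retention.
# Assumptions (not from the paper): rules = root-to-leaf paths of a
# decision tree; anonymization = coarsening attributes by rounding;
# retention = fraction of original rules reproduced verbatim.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def extract_rules(tree, feature_names):
    """Return each root-to-leaf path as a (frozenset of conditions, class) pair."""
    t = tree.tree_
    rules = set()

    def recurse(node, conditions):
        if t.children_left[node] == -1:  # leaf node
            label = int(np.argmax(t.value[node]))
            rules.add((frozenset(conditions), label))
            return
        feat = feature_names[t.feature[node]]
        thr = round(float(t.threshold[node]), 2)
        recurse(t.children_left[node], conditions + [(feat, "<=", thr)])
        recurse(t.children_right[node], conditions + [(feat, ">", thr)])

    recurse(0, [])
    return rules


# Original data and an illustrative "anonymized" version of the training set.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target,
                                                    random_state=0)
X_anon = np.round(X_train, 0)  # purely illustrative, not a real privacy mechanism

orig_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
anon_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_anon, y_train)

orig_rules = extract_rules(orig_tree, data.feature_names)
anon_rules = extract_rules(anon_tree, data.feature_names)

# Two very different pictures of "utility": the anonymized-data classifier can
# predict well even while few of the original rules are reproduced exactly.
accuracy = anon_tree.score(X_test, y_test)
retention = len(orig_rules & anon_rules) / len(orig_rules)

print(f"Prediction accuracy of anonymized-data classifier: {accuracy:.2f}")
print(f"Fraction of original rules retained verbatim:      {retention:.2f}")
```

Even this toy retention score is a single number, which is the abstract's broader point: no one measure captures everything, so accuracy, rule retention, and other measures are best read together.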
