A Probabilistic Perspective on Re-Identifiability
A quasi-identifier is a set of attributes that can be used to re-identify entries in anonymized data sets. The article considers a group of individuals about whom quasi-identifying numerical information is disclosed, such as date of birth, age, weight, and height. It determines the fraction of individuals whose information is unique in that group and who are hence unambiguously identifiable. First, nonuniformity can be captured well by a single number, the Kullback-Leibler distance: on example sets of real microdata, the approximations based on Kullback-Leibler distances are accurate. Second, the effect of disclosing more specific or less specific information is analyzed experimentally. Third, the effect of correlation between numerical attributes is measured. A formula gives the re-identifiability level. The approximations are validated using publicly available demographic data sets.
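The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the article's formula or approximation: it uses only the standard probability fact that, for k individuals drawn independently from a distribution p over a finite set of values, the expected fraction of unique individuals is the sum over values i of p_i (1 - p_i)^(k-1). The function names and the choice of the uniform distribution as the reference for the Kullback-Leibler distance are assumptions made for illustration.

```python
import math
from collections import Counter


def fraction_unique(values):
    """Observed fraction of records whose value occurs exactly once in the group."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


def expected_fraction_unique(probs, k):
    """Expected fraction of unique individuals among k independent draws.

    An individual with value i is unique when the other k - 1 individuals
    all have a different value, which happens with probability (1 - p_i)^(k-1).
    """
    return sum(p * (1.0 - p) ** (k - 1) for p in probs)


def kl_from_uniform(probs):
    """Kullback-Leibler distance D(p || uniform) in nats (assumed reference: uniform)."""
    n = len(probs)
    return sum(p * math.log(p * n) for p in probs if p > 0)


if __name__ == "__main__":
    # Hypothetical disclosed ages for a small group (illustrative data only).
    ages = [31, 31, 34, 40, 40, 40, 52]
    print("observed fraction unique:", fraction_unique(ages))

    # Empirical distribution of the same ages, used as a stand-in for p.
    counts = Counter(ages)
    probs = [c / len(ages) for c in counts.values()]
    print("expected fraction unique:", expected_fraction_unique(probs, len(ages)))
    print("KL distance from uniform:", kl_from_uniform(probs))
```

For a uniform distribution over N values the expected fraction reduces to (1 - 1/N)^(k-1) and the Kullback-Leibler distance from uniform is zero. The sketch only shows how the two quantities are computed side by side; it makes no claim about how the article relates them.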
Year of publication: | 2013 |
---|---|
Authors: | Koot, Matthijs ; Mandjes, Michel ; Noordende, Guido van 't ; Laat, Cees de |
Published in: | Mathematical Population Studies. - Taylor & Francis Journals, ISSN 0889-8480. - Vol. 20.2013, 3, p. 155-171 |
Publisher: | Taylor & Francis Journals |
Similar items by person
- The Shape of the Loss Curve and the Impact of Long-Range Dependence on Network Performance. Mandjes, Michel (2001)
- Fast Simulation of a Queue fed by a Superposition of Many (Heavy-Tailed) Sources. Boots, Nam Kyoo (2001)
- A flexible and optimal approach for appointment scheduling in healthcare. Kuiper, Alex (2021)