A simplified approach to generating synthetic data for disclosure control

Raab, G., Nowok, B. & Dibben, C. (2014) arXiv.org (arXiv:1409.0217v2), [SLS]

Other information:
Abstract:

We describe results on the creation and use of synthetic data that were derived in the context of a project to make synthetic extracts available for users of the UK Longitudinal Studies. Contrary to the existing literature, we show that there are circumstances when inferences can be made from fully synthetic data generated from fitted parameters without sampling from their posterior distributions (simple synthesis). The condition that allows this, which we describe as "common-sampling", is that the original sample and the synthetic data can be considered as sampled in the same way from their respective populations. New variance estimators for the analysis of synthetic data are derived when the common-sampling condition is met. It is shown that simple synthesis, with these estimators, provides better estimates than the methods suggested in the literature for fully synthetic data. The results are confirmed by simulations and are illustrated with an example from the Scottish Longitudinal Study.
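
As a rough illustration of what "simple synthesis" means in the abstract above, the sketch below fits a model to an observed sample and generates synthetic values directly from the fitted parameters, with no draw from a posterior distribution of those parameters. It is a toy example under stated assumptions, not the authors' implementation: the univariate normal model, the simulated "original" sample, and the function names `fit_normal` and `simple_synthesis` are all hypothetical, and the paper's variance estimators are not reproduced here.

```python
import random
import statistics


def fit_normal(sample):
    """Return the fitted mean and standard deviation of the observed sample."""
    return statistics.mean(sample), statistics.stdev(sample)


def simple_synthesis(sample, n_syn, rng):
    """Generate n_syn synthetic values from the fitted parameters.

    The parameters are used as point estimates; no posterior draw is made.
    (Assumption: a univariate normal model, chosen only for illustration.)
    """
    mu_hat, sigma_hat = fit_normal(sample)
    return [rng.gauss(mu_hat, sigma_hat) for _ in range(n_syn)]


# Hypothetical "original" sample, simulated here so the sketch is self-contained.
rng = random.Random(0)
original = [rng.gauss(50.0, 10.0) for _ in range(500)]

# Synthetic data of the same size as the original sample.
synthetic = simple_synthesis(original, n_syn=len(original), rng=rng)
```

A synthesis that follows the earlier literature would instead first draw the mean and standard deviation from their posterior distribution and then generate values from those drawn parameters; the sketch omits that step on purpose.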

Available online: arXiv.org
Download output document: Full paper (PDF 288KB)
Output from project: 2013_012
