PrivGWAS: Preserving Privacy in Genomewide Association Studies

PrivGWAS is a set of two methods for performing privacy-preserving GWAS in the presence of population stratification: PrivSTRAT and PrivLMM. Both methods allow researchers to generate GWAS results while protecting participants' private phenotype information. In particular, researchers can: (1) return highly associated SNPs, (2) estimate association statistics, and (3) estimate the number of significant SNPs.

Details of both methods are given in the paper "Enabling Privacy Preserving GWAS in Heterogeneous Human Populations" (hopefully soon to be published!). Additional online material (proofs, etc.) is available here.

Due to privacy concerns, we cannot publicly release the real GWAS data. We can, however, share simulated data and the code used to generate it (which relies on PLINK). Both are available here.

The code used to generate the figures, including a Python implementation of PrivSTRAT and PrivLMM, is available here.

Questions or concerns can be directed to seanken at mit dot edu.