SOP for converting LifeLines Geno Data
This SOP applies to LL3.
Data is released to researchers 'per study' (i.e. per approved research request).
- Per study, a subset of the genotypes is created and made available to the researcher:
  - Only the individuals selected for the study are included (e.g. 5000 out of a total of 17000)
  - The identifiers are 're-pseudonymized' from 'marcel identifiers' to 'study identifiers' (so data cannot be matched between studies).
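The subsetting and re-pseudonymization steps above can be sketched as follows. This is only an illustration, not the actual LifeLines pipeline: the function name `subset_and_rename` and the `id_map` dictionary (internal identifier to study identifier) are hypothetical, and writing the study identifier into the family column and `0` into the parent columns is an assumption made so that no internal identifiers remain in the output. It relies only on the standard PLINK layouts: six columns per TFAM line, and four leading columns followed by two alleles per individual on each TPED line.

```python
def subset_and_rename(tfam_lines, tped_lines, id_map):
    """Keep only individuals listed in id_map and replace their identifiers.

    tfam_lines / tped_lines: lists of whitespace-separated text lines.
    id_map: hypothetical dict {internal_id: study_id} for one study.
    """
    keep = []        # positions (in TFAM order) of the individuals to retain
    out_tfam = []
    for idx, line in enumerate(tfam_lines):
        fam, ind, pat, mat, sex, pheno = line.split()
        if ind in id_map:
            keep.append(idx)
            new_id = id_map[ind]
            # Assumption: family ID := study ID, parents := 0, phenotype := -9
            # (PLINK's missing value), so no internal IDs remain in the output.
            out_tfam.append(" ".join([new_id, new_id, "0", "0", sex, "-9"]))

    out_tped = []
    for line in tped_lines:
        fields = line.split()
        head, alleles = fields[:4], fields[4:]
        kept = []
        for idx in keep:
            # Each individual occupies two consecutive allele columns.
            kept.extend(alleles[2 * idx:2 * idx + 2])
        out_tped.append(" ".join(head + kept))
    return out_tfam, out_tped


# Toy example: three individuals, two selected for the study.
tfam = ["F1 A 0 0 1 2", "F2 B 0 0 2 1", "F3 C 0 0 1 1"]
tped = ["1 rs1 0 1000 A A A G G G", "2 rs2 0 2000 C C C T T T"]
new_tfam, new_tped = subset_and_rename(tfam, tped, {"A": "S001", "C": "S002"})
```

Because a fresh `id_map` would be generated for each study, the same individual receives a different identifier in every release, which is what prevents matching between studies.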
Expected outputs
Users expect files in PLINK format:
- TPED/TFAM genotype files (chosen for internal use as easier to produce)
- BIM/BED/FAM genotype files (with empty phenotype, monomorphic filtered)
- BIM/BED/FAM genotype files split per chromosome
- MAP/PED dosage files
- MAP/PED dosage files split per chromosome
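As an illustration of the per-chromosome outputs listed above, the sketch below groups TPED lines by their first column, which holds the chromosome code in the PLINK TPED layout. The function name `split_tped_by_chromosome` is hypothetical, and the SOP does not state which tool performs the split in practice; this only shows the idea.

```python
from collections import defaultdict


def split_tped_by_chromosome(tped_lines):
    """Group TPED lines by chromosome code (first whitespace-separated column)."""
    by_chr = defaultdict(list)
    for line in tped_lines:
        chrom = line.split(None, 1)[0]
        by_chr[chrom].append(line)
    return dict(by_chr)


# Toy example: three markers on two chromosomes.
lines = [
    "1 rs1 0 1000 A A A G",
    "2 rs2 0 2000 C C C T",
    "1 rs3 0 3000 G G G G",
]
per_chr = split_tped_by_chromosome(lines)
```

Each value in `per_chr` could then be written to its own file alongside an unchanged copy of the TFAM, since the individuals are the same for every chromosome.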
Available inputs
Complete genotype data is in: /target/gpfs2/lifelines_rp/releases/LL3/