= SOP for converting LifeLines Geno Data =

[[TOC()]]

This SOP applies to LL3. Data is released to researchers 'per study' (i.e. per approved research request).

 * Per study, a subset of the genotypes is created and made available to the researcher:
   * Only the individuals selected for the study are included (e.g. 5000 out of a total of 17000).
   * The identifiers are 're-pseudonymized' from 'marcel identifiers' to 'study identifiers' (so data cannot be matched between studies).

== Expected outputs ==

The user expects files in PLINK format:
 * TPED/TFAM genotype files (chosen for internal use because they are easier to produce)
 * BIM/BED/FAM genotype files (with empty phenotype, monomorphic SNPs filtered out)
 * BIM/BED/FAM genotype files split per chromosome
 * MAP/PED '''dosage''' files
 * MAP/PED dosage files '''split per chromosome'''

== Available inputs ==

Complete genotype data is in: /target/gpfs2/lifelines_rp/releases/LL3/
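
The per-study subsetting and re-pseudonymization described under the SOP scope can be sketched in a few lines of Python. This is only a minimal illustration on in-memory TPED/TFAM lines, not the actual LifeLines conversion pipeline. The function names and the `id_map` dictionary (internal identifier to study identifier) are hypothetical. The sketch also assumes that the family ID is set to the study identifier, that parent IDs are reset to 0 so no internal identifier leaks, and that the phenotype is written as missing (-9).

```python
def subset_tfam(tfam_lines, id_map):
    """Keep only individuals present in id_map and rename them.

    tfam_lines: TFAM rows (family, individual, father, mother, sex, phenotype).
    id_map: hypothetical mapping {internal identifier: study identifier}.
    Returns (kept_indices, new_tfam_lines); kept_indices is needed to
    select the matching genotype columns from the TPED file.
    """
    kept, out = [], []
    for idx, line in enumerate(tfam_lines):
        fields = line.split()
        if len(fields) != 6:
            raise ValueError("TFAM row %d does not have 6 columns" % idx)
        individual_id = fields[1]
        if individual_id not in id_map:
            continue  # individual not selected for this study
        study_id = id_map[individual_id]
        kept.append(idx)
        # Assumptions: family ID = study ID, parents unknown (0),
        # phenotype missing (-9); sex is carried over unchanged.
        out.append(" ".join([study_id, study_id, "0", "0", fields[4], "-9"]))
    return kept, out


def subset_tped_line(line, kept_indices):
    """Keep the 4 marker columns plus the allele pairs of kept individuals."""
    fields = line.split()
    marker, alleles = fields[:4], fields[4:]
    out = list(marker)
    for idx in kept_indices:
        # Each individual occupies two consecutive allele columns.
        out.extend(alleles[2 * idx:2 * idx + 2])
    return " ".join(out)
```

Because the genotype columns of a TPED file follow the row order of the TFAM file, the same `kept_indices` must be applied to every TPED line. Otherwise genotypes and study identifiers no longer line up.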