nievergeltlab
4/19/2017 - 8:55 PM

Descriptive statistics for genotype merge discordances

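#Compare the two filesets in PLINK using merge mode 6 or 7. These modes only write discordant calls to a .diff file, no merged fileset is produced
#Mode 6 reports all mismatching calls; mode 7 reports only mismatches where both calls are non-missing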
./plink --bfile YEHUDA --bmerge YEHUDA.bed YEHUDA.bim YEHUDA.fam --merge-mode 7 --out yehude-merge

R 
setwd('F:/rutgers_2')
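#Read the PLINK .diff output (columns SNP, FID, IID, NEW, OLD). nrows caps the read at 800,000 rows
dat <- read.table('yehude-merge.diff', header=TRUE, nrows=800000, stringsAsFactors=FALSE)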
library(plyr)
#Distribution of the number of discordant calls per SNP. SNPs with especially many discordances may be badly genotyped
quantile(table(dat$SNP))
#Plot it
hist(table(dat$SNP) )
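#A minimal sketch, not part of the original workflow: list SNPs whose discordance count exceeds a chosen quantile
#flag_discordant_snps and the default 0.99 cutoff are hypothetical choices for illustration; pick a threshold that suits your data
flag_discordant_snps <- function(snp_ids, prob = 0.99) {
  snp_counts <- table(snp_ids)                    #discordant calls per SNP
  cutoff <- unname(quantile(snp_counts, prob))    #count at the chosen quantile
  names(snp_counts)[snp_counts > cutoff]          #SNP IDs strictly above the cutoff
}
#Usage: high_discord_snps <- flag_discordant_snps(dat$SNP)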

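#Count discordances for each subject: dim() of each subject's rows gives (number of rows, number of columns)
#Returns a data frame with one row per subject that has at least one discordance. Columns are IID, V1 (number of discordances) and V2 (number of columns in dat)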
dimcheck <- ddply(dat, ~IID, dim)
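#A minimal sketch, not part of the original workflow: the same per-subject tally in base R, with a named count column
#count_discordances and the column name n_discordant are hypothetical, chosen for illustration
count_discordances <- function(ids) {
  #table() counts rows per subject; as.data.frame() turns the counts into a two-column data frame
  as.data.frame(table(IID = ids), responseName = "n_discordant", stringsAsFactors = FALSE)
}
#Usage: subjcounts <- count_discordances(dat$IID)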

#See average discordances by subject