summix {Summix}    R Documentation
Summix: estimating mixture proportions of reference groups from large (N SNPs > 10,000) genetic allele frequency (AF) data
summix(data, reference, observed, pi.start = c())
data
    a data frame of the observed and reference allele frequencies for N genetic variants. See the data formatting document at https://github.com/hendriau/Summix for more information.
reference
    a character vector of the column names for the reference ancestries.
observed
    a character value that is the column name for the observed group.
pi.start
    a numeric vector of length K giving the starting guess for the ancestry proportions. If not specified, each element defaults to 1/K, where K is the number of reference ancestry groups.
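As a minimal sketch of the default described for pi.start, a uniform starting vector can be built in base R. The column names are taken from the example on this page; the variable names `reference_cols` and `pi_start` are illustrative and not part of the package.

```r
# Reference column names, as used in the example on this page.
reference_cols <- c("ref_AF_afr_1000G", "ref_AF_eur_1000G", "ref_AF_sas_1000G",
                    "ref_AF_iam_1000G", "ref_AF_eas_1000G")

# K is the number of reference ancestry groups.
K <- length(reference_cols)

# Uniform starting guess: every proportion is 1/K, so the vector sums to 1.
pi_start <- rep(1 / K, K)
```

With five reference groups this gives the same starting guess of .2 per ancestry that the example passes explicitly.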
A data frame with the following columns:
objective: least-squares value at the solution
iterations: number of iterations of the SLSQP algorithm
time: run time of the SLSQP algorithm, in seconds
filtered: number of SNPs not used in estimation due to missing values
K columns containing the estimated mixture proportions, one for each reference ancestry group passed to the function
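The objective and filtered columns can be illustrated with a small base R sketch. It rests on assumptions: the objective is taken to be the plain sum of squared differences between the observed allele frequency and the proportion-weighted mixture of reference allele frequencies (any scaling the package applies is not documented here), and a SNP is treated as filtered when any of the columns in use is missing. The data frame `toy`, its column names, and the helper `ls_objective` are hypothetical.

```r
# Toy allele-frequency data; column names and values are hypothetical.
toy <- data.frame(
  obs  = c(0.10, 0.40, NA,   0.25),
  ref1 = c(0.00, 0.50, 0.30, 0.20),
  ref2 = c(0.20, 0.30, 0.10, 0.30)
)
reference_cols <- c("ref1", "ref2")

# Assumption: rows with a missing value in any column in use are dropped.
keep <- complete.cases(toy[, c("obs", reference_cols)])
filtered <- sum(!keep)

# Assumed objective: sum of squared differences between the observed AF and
# the pi-weighted mixture of the reference AFs.
ls_objective <- function(pi, data, reference, observed) {
  mixture <- as.matrix(data[, reference, drop = FALSE]) %*% pi
  sum((data[[observed]] - mixture)^2)
}

# The toy observed AFs are an exact 50/50 mixture of ref1 and ref2.
objective_mixed <- ls_objective(c(0.5, 0.5), toy[keep, ], reference_cols, "obs")

# Putting all weight on ref1 fits worse, so the objective is larger.
objective_ref1 <- ls_objective(c(1, 0), toy[keep, ], reference_cols, "obs")
```

In this sketch the proportions are supplied by hand. In summix they are estimated by SLSQP, with the K proportion columns of the output holding the solution.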
Gregory Matesi, gregory.matesi@ucdenver.edu
Audrey Hendricks, audrey.hendricks@ucdenver.edu
adjAF for adjusting allele frequencies, and https://github.com/hendriau/Summix for further documentation.
slsqp function in the nloptr package for further details on Sequential Quadratic Programming: https://www.rdocumentation.org/packages/nloptr/versions/1.2.2.2/topics/slsqp
# load the data
data("ancestryData")

# Estimate 5 reference ancestry proportion values for the gnomAD
# African/African American group using a starting guess of .2 for each
# ancestry proportion.
summix(
  data = ancestryData,
  reference = c("ref_AF_afr_1000G",
                "ref_AF_eur_1000G",
                "ref_AF_sas_1000G",
                "ref_AF_iam_1000G",
                "ref_AF_eas_1000G"),
  observed = "gnomad_AF_afr",
  pi.start = c(.2, .2, .2, .2, .2)
)