Key options
Genotype formats
A VCF genofile can specify multiple genotype formats for each marker. RABBIT assumes that all markers are biallelic; multiallelic markers can be either deleted (delmultiallelic=true) or transformed into biallelic markers by pooling the alternative alleles. RABBIT accepts three genotype formats:
- GT: discrete genotype, such as "0/1" for unphased genotypes and "0|1" for phased genotypes.
- AD: allelic depth in the form "r0,r1", where r0 is the number of reads for allele 0 and r1 is the number of reads for allele 1.
- GP: genotype probability. It can be either in the form "p00, p01, p11", where p00, p01, and p11 denote the probabilities of genotypes "0/0", "0/1", and "1/1", respectively; or in the form "p00, p11" for inbred lines with only two possible homozygous genotypes (model=depmodel for magicimpute).
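To make the three formats concrete, here is a minimal sketch that splits one VCF sample column according to its FORMAT column and converts GT, AD, and GP into Julia values. The helper name parse_sample is hypothetical and is not part of RABBIT; it only illustrates how the formats are laid out in a VCF record, and it handles missing values only in the simple case of a bare ".".

```julia
# Illustrative helper, not part of RABBIT: parse one sample column of a VCF
# record given the FORMAT column (e.g. "GT:AD:GP").
function parse_sample(format::AbstractString, sample::AbstractString)
    fmtkeys = split(format, ':')
    vals = split(sample, ':')
    fields = Dict{String,Any}()
    for (key, val) in zip(fmtkeys, vals)
        val == "." && continue          # skip a missing entry
        if key == "GT"
            fields["GT"] = String(val)  # "0/1" (unphased) or "0|1" (phased)
        elseif key == "AD"
            fields["AD"] = parse.(Int, strip.(split(val, ',')))      # [r0, r1]
        elseif key == "GP"
            fields["GP"] = parse.(Float64, strip.(split(val, ',')))  # [p00, p01, p11]
        end
    end
    return fields
end

rec = parse_sample("GT:AD:GP", "0/1:7,5:0.01,0.98,0.01")
```

Here rec["AD"] holds the read counts [7, 5] and rec["GP"] holds the three genotype probabilities.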
Functions that take genofile as an argument accept the keyword argument formatpriority. Specify formatpriority as a vector of formats in decreasing priority. To optimize performance, the default value of formatpriority varies with the function.
Function | formatpriority |
---|---|
magicfilter | ["AD","GT"] |
magicmap | ["GT","AD"] |
magicimpute | ["AD","GT"] |
magicreconstruct | ["GP","AD","GT"] |
- Specify formatpriority as --formatpriority "[AD, GT]" when running from the command line, and as formatpriority = ["AD", "GT"] when running in the Julia REPL.
- magiccall performs single-marker genotype calling. For each marker with AD, it first infers parental genotypes and error rates, and then exports AD, calculated GP, and called GT.
- magicmap performs raw genotype calling for "AD" if magiccall is skipped for sequence data. The raw genotype calling assumes a pre-specified error rate; it performs well for almost homozygous populations, but not for sequence data from heterozygous populations.
- magicimpute exports both posterior genotype probabilities ("GP") and discrete genotypes ("GT").
- magicreconstruct generally performs better with "GP" than with "GT".
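The idea of a priority vector can be sketched as follows: for each record, take the first format in formatpriority that the record actually carries. The function choose_format is a hypothetical illustration of that rule, not RABBIT's internal code, and RABBIT may resolve missing formats differently.

```julia
# Illustrative only: return the first format in `priority` that is present in
# `available`, or `nothing` if none of the prioritized formats is available.
function choose_format(available, priority)
    idx = findfirst(in(available), priority)
    return idx === nothing ? nothing : priority[idx]
end

# Default priority listed for magicreconstruct in the table above.
reconstruct_priority = ["GP", "AD", "GT"]

choose_format(["GT", "AD"], reconstruct_priority)  # "AD": no GP, so AD wins over GT
choose_format(["GT"], ["AD", "GT"])                # "GT": falls back to the next entry
```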
Likelihood parameters
Genotype likelihood denotes the probability of observed genotypic data given hidden states and parameters. The parameters include foundererror, offspringerror, seqerror, allelebalancemean, allelebalancedisperse, and alleledropout. See the section Statistical Framework.
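As a rough illustration of a read-depth likelihood, the sketch below computes the probability of the allelic depths (r0, r1) under each of the three genotypes using a plain binomial model with only a sequencing error rate. This is a simplified textbook model, assumed here for illustration: it ignores allele balance and allele dropout, and it is not RABBIT's actual likelihood, which is described in the Statistical Framework section. The function names ad_likelihoods and ad_to_gp are hypothetical.

```julia
# Simplified binomial model (an assumption for illustration, not RABBIT's model):
# a read shows allele 1 with probability f*(1-e) + (1-f)*e, where f is the
# fraction of allele 1 in the genotype (0, 0.5, 1) and e is the sequencing error.
function ad_likelihoods(r0::Integer, r1::Integer; seqerror::Real = 0.001)
    n = r0 + r1
    likes = map((0.0, 0.5, 1.0)) do f
        p1 = f * (1 - seqerror) + (1 - f) * seqerror
        binomial(n, r1) * p1^r1 * (1 - p1)^r0
    end
    return likes  # likelihoods for genotypes "0/0", "0/1", "1/1"
end

# Normalize the likelihoods under a flat genotype prior to get probabilities.
function ad_to_gp(r0::Integer, r1::Integer; seqerror::Real = 0.001)
    likes = collect(ad_likelihoods(r0, r1; seqerror = seqerror))
    return likes ./ sum(likes)
end

gp = ad_to_gp(7, 5)  # balanced reads: the heterozygote "0/1" dominates
```

With no reads at all (r0 = r1 = 0), the three likelihoods are equal, so the data carry no information about the genotype.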
RABBIT has three keyword arguments likeparameters, priorlikeparameters, and threshlikeparameters, which have struct types LikeParameters, PriorLikeParameters, and ThreshLikeParameters, respectively. Each struct has fields for the six likelihood parameters.
LikeParameters
MagicBase.LikeParameters — Type
Keyword-based struct for the parameters of the likelihood function.
LikeParameters() is equivalent to LikeParameters(foundererror=0.005, offspringerror=nothing, peroffspringerror=nothing, seqerror=nothing, allelebalancemean=nothing, allelebalancedisperse=nothing, alleledropout=0.0). The parameter peroffspringerror refers to the error rate per offspring; the other parameters refer to error rates per marker.
If the genotype format is not "AD", the parameters seqerror, allelebalancemean, allelebalancedisperse, and alleledropout are irrelevant.
If model="depmodel", the parameters allelebalancemean, allelebalancedisperse, and alleledropout are irrelevant.
If the function has the keyword argument isinfererror and isinfererror = true, the parameters whose value is nothing are inferred and the other parameters are held fixed. If isinfererror = false, LikeParameters() is equivalent to LikeParameters(foundererror=0.005, offspringerror=0.005, peroffspringerror=0.0, seqerror=0.001, allelebalancemean=0.5, allelebalancedisperse=0.0, alleledropout=0.0).
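The interplay between nothing values and isinfererror can be mimicked with plain NamedTuples. The sketch below is a stand-in and does not use MagicBase: the names documented_defaults, fixed_values, inferred_names, and resolve are hypothetical, and only the numeric values are taken from the docstring above.

```julia
# Stand-in for LikeParameters(): `nothing` marks a parameter to be inferred.
documented_defaults = (foundererror = 0.005, offspringerror = nothing,
    peroffspringerror = nothing, seqerror = nothing, allelebalancemean = nothing,
    allelebalancedisperse = nothing, alleledropout = 0.0)

# Values that replace `nothing` when isinfererror = false (from the docstring).
fixed_values = (foundererror = 0.005, offspringerror = 0.005,
    peroffspringerror = 0.0, seqerror = 0.001, allelebalancemean = 0.5,
    allelebalancedisperse = 0.0, alleledropout = 0.0)

# Parameters that would be inferred when isinfererror = true.
inferred_names(params::NamedTuple) = [k for k in keys(params) if params[k] === nothing]

# With isinfererror = false, fill every `nothing` with its fixed value.
function resolve(params::NamedTuple, fixed::NamedTuple; isinfererror::Bool)
    isinfererror && return params
    vals = map(k -> something(params[k], fixed[k]), keys(params))
    return NamedTuple{keys(params)}(vals)
end

# Fixing seqerror by hand removes it from the set of inferred parameters.
user = merge(documented_defaults, (seqerror = 0.01,))
inferred_names(user)
resolve(user, fixed_values; isinfererror = false)
```

In this sketch a value supplied by the user is never overwritten; only entries left as nothing are filled in.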
PriorLikeParameters
MagicBase.PriorLikeParameters — Type
Keyword-based struct for the priors of the likelihood parameters.
Fields
- foundererror::Distribution = Beta(1.0,1.0): prior distribution of founder error rate
- offspringerror::Distribution = Beta(1.0,1.0): prior distribution of offspring error rate
- peroffspringerror::Distribution = Beta(1.0,1.0): prior distribution of error rate per offspring
- seqerror::Distribution = Beta(1.0,1.0): prior distribution of sequence base error rate
- allelebalancemean::Distribution = Beta(1.01,1.01): prior distribution of allele balance mean
- allelebalancedisperse::Distribution = Exponential(0.5): prior distribution of allele balance overdispersion
- alleledropout::Distribution = Beta(1,19): prior distribution of allele dropout rate
ThreshLikeParameters
MagicBase.ThreshLikeParameters — Type
Keyword-based struct for the thresholds of the likelihood parameters. A marker is deleted if any of its inferred parameter values exceeds the corresponding threshold.
ThreshLikeParameters() is equivalent to ThreshLikeParameters(foundererror=0.25, offspringerror=0.25, peroffspringerror=0.25, seqerror=0.25, allelebalancemean=0.9, allelebalancedisperse=1.0, alleledropout=0.05).
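The threshold rule can be sketched with NamedTuples holding the default thresholds listed above. The marker names and inferred values below are made up for illustration, and the function exceeds is hypothetical; how RABBIT applies the per-offspring threshold is not covered by this sketch.

```julia
# Default thresholds as listed for ThreshLikeParameters().
thresholds = (foundererror = 0.25, offspringerror = 0.25, peroffspringerror = 0.25,
    seqerror = 0.25, allelebalancemean = 0.9, allelebalancedisperse = 1.0,
    alleledropout = 0.05)

# True if any inferred parameter is greater than its threshold.
exceeds(inferred::NamedTuple, thresh::NamedTuple) =
    any(k -> inferred[k] > thresh[k], keys(inferred))

# Made-up per-marker estimates, for illustration only.
estimates = [
    "marker1" => (offspringerror = 0.01, seqerror = 0.002),
    "marker2" => (offspringerror = 0.40, seqerror = 0.001),  # offspringerror too high
    "marker3" => (offspringerror = 0.02, seqerror = 0.300),  # seqerror too high
]

kept = [name for (name, est) in estimates if !exceeds(est, thresholds)]
```

Only marker1 survives here, because each of the other two markers has one estimate above its threshold.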
- magicimpute and magiccall have the three keyword arguments for likelihood parameters and the keyword argument isinfererror.
- magicimpute has default likeparameters = LikeParameters(), priorlikeparameters = PriorLikeParameters(), and threshlikeparameters = ThreshLikeParameters().
- magiccall has default likeparameters = LikeParameters(offspringerror=0.04, alleledropout=0.0), priorlikeparameters = PriorLikeParameters(seqerror=0.005), and threshlikeparameters = ThreshLikeParameters(seqerror=0.05).