Key options

Genotype formats

A VCF genofile may provide multiple genotype formats for each marker. RABBIT assumes that all markers are biallelic; multiallelic markers can be either deleted (delmultiallelic=true) or transformed into biallelic markers by pooling multiple alternative alleles. RABBIT supports three genotype formats:

  • GT: discrete genotype, such as "0/1" for unphased genotypes and "0|1" for phased genotypes.
  • AD: allelic depth in the form "r0,r1", where r0 is the number of reads for allele 0 and r1 is the number of reads for allele 1.
  • GP: genotype probability, either in the form "p00, p01, p11", where p00, p01, and p11 denote the probabilities of genotypes "0/0", "0/1", and "1/1", respectively, or in the form "p00, p11" for inbred lines with only two possible homozygous genotypes (model=depmodel for magicimpute).
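To make the three formats concrete, here is a minimal sketch in plain Julia that splits one VCF sample field by its FORMAT keys and parses each value. The helpers parse_sample, parse_ad, parse_gp, and isphased are hypothetical names introduced for illustration; they are not part of RABBIT.

```julia
# Hypothetical helpers for illustration only; not part of RABBIT.

# Split one VCF sample field by its FORMAT keys, e.g. "GT:AD:GP".
function parse_sample(format::AbstractString, sample::AbstractString)
    fmtkeys = split(format, ':')
    vals = split(sample, ':')
    return Dict(String(k) => String(v) for (k, v) in zip(fmtkeys, vals))
end

# AD "r0,r1" -> read counts for allele 0 and allele 1
parse_ad(s) = Tuple(parse.(Int, split(s, ',')))

# GP "p00,p01,p11" (or "p00,p11" for inbred lines) -> vector of probabilities
parse_gp(s) = parse.(Float64, strip.(split(s, ',')))

# A GT value is phased if it uses '|' and unphased if it uses '/'
isphased(gt) = occursin('|', gt)

fields = parse_sample("GT:AD:GP", "0/1:7,5:0.01,0.98,0.01")
ad = parse_ad(fields["AD"])       # (7, 5)
gp = parse_gp(fields["GP"])       # [0.01, 0.98, 0.01]
phased = isphased(fields["GT"])   # false
```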

Functions that take genofile as an argument accept the keyword argument formatpriority. Specify it as a vector of formats in order of decreasing priority. To optimize performance, the default value of formatpriority varies by function.

Function            formatpriority
magicfilter         ["AD","GT"]
magicmap            ["GT","AD"]
magicimpute         ["AD","GT"]
magicreconstruct    ["GP","AD","GT"]
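One way to read "decreasing priority" is that the first listed format present for a marker is the one used. The sketch below illustrates that reading in plain Julia; pick_format is a hypothetical helper, and RABBIT's actual handling may differ in detail.

```julia
# Hypothetical helper: return the first format in `formatpriority`
# that is available for the marker, or `nothing` if none is.
function pick_format(available::AbstractVector{<:AbstractString},
                     formatpriority::AbstractVector{<:AbstractString})
    for fmt in formatpriority
        fmt in available && return fmt
    end
    return nothing
end

available = ["GT", "AD"]
pick_format(available, ["AD", "GT"])        # "AD"
pick_format(available, ["GP", "AD", "GT"])  # "AD" (GP is absent, so AD is next)
pick_format(["GT"], ["AD", "GT"])           # "GT"
```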
formatpriority
  • Specify formatpriority as --formatpriority "[AD, GT]" when running from the command line, and as formatpriority = ["AD", "GT"] when running in the Julia REPL.
  • magiccall performs single marker genotype calling. For each marker with AD, it first infers parental genotypes and error rates, and then exports AD, calculated GP, and called GT.
  • magicmap performs raw genotype calling for "AD" if magiccall is skipped for sequence data. The raw genotype calling assumes a pre-specified error rate; it performs well for almost homozygous populations, but not for sequence data from heterozygous populations.
  • magicimpute exports both posterior genotype probabilities ("GP") and discrete genotypes ("GT").
  • magicreconstruct generally performs better with "GP" than with "GT".
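To illustrate raw genotype calling from "AD" with a pre-specified error rate, the following sketch uses a simple binomial read model with a uniform prior over the three genotypes. This is a generic textbook model chosen for illustration, not RABBIT's implementation; the function raw_call and its default seqerror value are assumptions.

```julia
# Sketch only: raw genotype calling from allelic depth (r0, r1).
# Assumption: each read is miscalled independently with a fixed rate `seqerror`.
function raw_call(r0::Integer, r1::Integer; seqerror::Float64 = 0.001)
    # log-likelihood of the reads given genotype 0/0, 0/1, and 1/1
    # (the binomial coefficient is the same for all three and is dropped)
    loglik = [
        r0 * log(1 - seqerror) + r1 * log(seqerror),
        (r0 + r1) * log(0.5),
        r0 * log(seqerror) + r1 * log(1 - seqerror),
    ]
    w = exp.(loglik .- maximum(loglik))  # subtract the maximum for numerical stability
    gp = w ./ sum(w)                     # posterior under a uniform genotype prior
    gt = ("0/0", "0/1", "1/1")[argmax(gp)]
    return (gp = gp, gt = gt)
end

raw_call(10, 0).gt  # "0/0"
raw_call(6, 5).gt   # "0/1"
raw_call(0, 8).gt   # "1/1"
```

With low read depth a heterozygote can easily yield reads from only one allele, which is one reason a fixed-error-rate caller of this kind struggles on heterozygous populations.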

Likelihood parameters

Genotype likelihood denotes the probability of observed genotypic data given hidden states and parameters. The parameters include foundererror, offspringerror, seqerror, allelebalancemean, allelebalancedisperse, and alleledropout. See the section Statistical Framework.

RABBIT has three keyword arguments, likeparameters, priorlikeparameters, and threshlikeparameters, which have struct types LikeParameters, PriorLikeParameters, and ThreshLikeParameters, respectively. Each struct has field names for the six likelihood parameters.

LikeParameters

MagicBase.LikeParameters (Type)
LikeParameters

keyword-based struct for the parameters of likelihood function.

LikeParameters() is equivalent to LikeParameters(foundererror=0.005, offspringerror=nothing, peroffspringerror=nothing, seqerror=nothing, allelebalancemean=nothing, allelebalancedisperse=nothing, alleledropout=0.0). The peroffspringerror refers to error rate per offspring, and the other parameters refer to error rate per marker.

If the genotype format is not "AD", the parameters seqerror, allelebalancemean, allelebalancedisperse, and alleledropout are irrelevant.

If model="depmodel", the parameters allelebalancemean, allelebalancedisperse, and alleledropout are irrelevant.

If the function has the keyword argument isinfererror and isinfererror = true, the parameters whose value is nothing are inferred and the other parameters are held fixed. If isinfererror = false, LikeParameters() is equivalent to LikeParameters(foundererror=0.005, offspringerror=0.005, peroffspringerror=0.0, seqerror=0.001, allelebalancemean=0.5, allelebalancedisperse=0.0, alleledropout=0.0).
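The convention that nothing marks a parameter to be inferred can be sketched with a stand-in struct. DemoLikeParameters and inferred_fields below are hypothetical and only mirror the documented defaults; the real type is MagicBase.LikeParameters, whose internals may differ.

```julia
# Hypothetical stand-in mirroring the documented defaults of LikeParameters().
Base.@kwdef struct DemoLikeParameters
    foundererror::Union{Nothing,Float64} = 0.005
    offspringerror::Union{Nothing,Float64} = nothing
    peroffspringerror::Union{Nothing,Float64} = nothing
    seqerror::Union{Nothing,Float64} = nothing
    allelebalancemean::Union{Nothing,Float64} = nothing
    allelebalancedisperse::Union{Nothing,Float64} = nothing
    alleledropout::Union{Nothing,Float64} = 0.0
end

# Fields left as `nothing` are the ones that would be inferred when isinfererror = true.
inferred_fields(p) = [f for f in fieldnames(typeof(p)) if getfield(p, f) === nothing]

default_inferred = inferred_fields(DemoLikeParameters())
# Giving seqerror a value fixes it, so it drops out of the inferred set.
fixed_seq = inferred_fields(DemoLikeParameters(seqerror = 0.001))
```

With the defaults, foundererror and alleledropout are fixed and the remaining five fields are inferred.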

PriorLikeParameters

MagicBase.PriorLikeParameters (Type)
PriorLikeParameters

keyword-based struct for the priors of the likelihood parameters.

Fields

foundererror::Distribution = Beta(1.0,1.0): prior distribution of founder error rate

offspringerror::Distribution = Beta(1.0,1.0): prior distribution of offspring error rate

peroffspringerror::Distribution = Beta(1.0,1.0): prior distribution of error rate per offspring

seqerror::Distribution = Beta(1.0,1.0): prior distribution of sequence base error rate

allelebalancemean::Distribution = Beta(1.01,1.01): prior distribution of allele balance mean

allelebalancedisperse::Distribution = Exponential(0.5): prior distribution of allele balance overdispersion

alleledropout::Distribution = Beta(1,19): prior distribution of allele dropout rate
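As a rough guide to what these defaults imply, the prior means can be computed by hand. The sketch below uses only the closed-form mean of a Beta distribution and needs no package; beta_mean is a hypothetical helper.

```julia
# Mean of Beta(a, b) is a / (a + b); computed by hand so no package is required.
beta_mean(a, b) = a / (a + b)

m_error   = beta_mean(1.0, 1.0)    # 0.5: Beta(1,1) is the uniform prior on [0, 1]
m_balance = beta_mean(1.01, 1.01)  # 0.5: nearly uniform, centered on a balanced ratio
m_dropout = beta_mean(1, 19)       # 0.05: mass concentrated on small dropout rates
```

Assuming the Distribution type comes from Distributions.jl, Exponential(0.5) is parameterized by its scale, so the prior mean of allelebalancedisperse would be 0.5.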

ThreshLikeParameters

MagicBase.ThreshLikeParameters (Type)
ThreshLikeParameters

keyword-based struct for the thresholds of the likelihood parameters. Markers whose inferred parameter values are greater than the corresponding threshold will be deleted.

ThreshLikeParameters() is equivalent to ThreshLikeParameters(foundererror=0.25, offspringerror=0.25, peroffspringerror=0.25, seqerror=0.25, allelebalancemean=0.9, allelebalancedisperse=1.0, alleledropout=0.05).
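The deletion rule can be sketched as a simple filter. The marker records and the keep function below are hypothetical illustration data, and only three of the thresholds are shown; this is not RABBIT's internal representation.

```julia
# Subset of the default thresholds, written as a NamedTuple for illustration.
thresholds = (foundererror = 0.25, offspringerror = 0.25, seqerror = 0.25)

# Hypothetical inferred per-marker parameter values.
markers = [
    (id = "m1", foundererror = 0.01, offspringerror = 0.02, seqerror = 0.001),
    (id = "m2", foundererror = 0.30, offspringerror = 0.02, seqerror = 0.001),
    (id = "m3", foundererror = 0.01, offspringerror = 0.02, seqerror = 0.40),
]

# A marker is kept only if no inferred value is greater than its threshold.
keep(m) = all(getfield(m, k) <= getfield(thresholds, k) for k in keys(thresholds))

kept = [m.id for m in markers if keep(m)]  # ["m1"]
```

Here m2 is deleted because its foundererror exceeds 0.25, and m3 because its seqerror does.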

Likelihood parameters
  • magicimpute and magiccall accept the three keyword arguments for likelihood parameters and the keyword argument isinfererror.
  • magicimpute has defaults likeparameters = LikeParameters(), priorlikeparameters = PriorLikeParameters(), and threshlikeparameters = ThreshLikeParameters().
  • magiccall has defaults likeparameters = LikeParameters(offspringerror=0.04, alleledropout=0.0), priorlikeparameters = PriorLikeParameters(seqerror=0.005), and threshlikeparameters = ThreshLikeParameters(seqerror=0.05).