RABBIT interface for the Julia REPL
Documentation for RABBIT's public interface.
Contents
Index
MagicBase
MagicCall
MagicFilter
MagicImpute
MagicMap
MagicReconstruct
MagicScan
MagicSimulate
MagicBase.DesignInfo
MagicBase.JuncDist
MagicBase.MagicAncestry
MagicBase.MagicGeno
MagicBase.MagicPed
Pedigrees.Pedigree
MagicBase.animcondprob
MagicBase.array_extract_pedfile
MagicBase.arrayfile2vcf
MagicBase.formmagicgeno
MagicBase.formmagicped
MagicBase.hapmap2vcf
MagicBase.merge_arrayfiles
MagicBase.merge_pedfiles
MagicBase.merge_vcffiles
MagicBase.parsebreedped
MagicBase.parsedesign
MagicBase.pedfile_designcode2ped
MagicBase.plotcondprob
MagicBase.plotmagicped
MagicBase.plotmarkermap
MagicBase.plotrecombreak
MagicBase.rabbitgeno_mma2jl
MagicBase.rabbitped_mma2jl
MagicBase.readmagicancestry
MagicBase.readmagicped
MagicBase.resetmap
MagicBase.savegenodata
MagicBase.savemagicancestry
MagicBase.savemagicgeno
MagicBase.savemagicped
MagicBase.vcf_extract_pedfile
MagicBase.vcf_pad_samples
MagicBase.vcffilter
MagicCall.magiccall
MagicFilter.magicfilter
MagicFilter.magicfilter!
MagicImpute.magicimpute
MagicImpute.magicimpute!
MagicMap.magicmap
MagicPrior.magicorigin
MagicPrior.magicprior
MagicReconstruct.magicreconstruct
MagicReconstruct.magicreconstruct!
MagicScan.magicscan
MagicSimulate.magicsimulate
MagicSimulate.simfhaplo
Pedigrees.orderped
Pedigrees.plotped
Pedigrees.readped
Pedigrees.saveped
Public Interface
Pedigrees.Pedigree
— Type
Pedigree{T}
immutable struct that stores information of a pedigree. See also readped.
The Pedigree fields include nfounder, generation, member, gender, mother, and father. The mothers and fathers of founders must be denoted by 0.
Pedigree(nfounder,member,mother,father,gender,generation)
inner constructor. The gender of each member must be "notapplicable", "female", or "male".
Pedigree(df::DataFrame)
external constructor. The dataframe must have columns member, mother, and father. A gender column is optional. The constructor calculates the generation of each member, which is the maximum path length from the founders to the member, and it orders the pedigree by calling orderped.
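The generation rule above (maximum path length from the founders) can be sketched in plain Julia. This is only an illustration: the helper `generations` is hypothetical and not part of Pedigrees, it works on plain vectors rather than a DataFrame, and it assumes that unknown parents are written as the string "0" and that parents are listed before their offspring.

```julia
# Hypothetical helper, not part of Pedigrees: generation = longest path from any founder.
function generations(member::Vector{String}, mother::Vector{String}, father::Vector{String})
    gen = Dict{String,Int}()
    # Assumes parents appear before their offspring in the input vectors.
    for (m, mo, fa) in zip(member, mother, father)
        if mo == "0" && fa == "0"
            gen[m] = 0                      # founders are generation 0
        else
            gen[m] = 1 + max(get(gen, mo, 0), get(gen, fa, 0))
        end
    end
    return gen
end

member = ["P1", "P2", "P3", "F1", "BC1"]
mother = ["0", "0", "0", "P1", "F1"]
father = ["0", "0", "0", "P2", "P3"]
gen = generations(member, mother, father)
```

Here BC1 has one parent in generation 1 (F1) and one founder parent (P3), so the longer path determines its generation.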
Pedigrees.orderped
— Function
orderped(ped::Pedigree)
order pedigree members so that parents always come first.
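A parents-first ordering can be obtained by repeatedly placing the members whose parents are already placed. The sketch below only illustrates that idea; `parents_first` is a hypothetical function on plain vectors, not the algorithm used by orderped, and it again assumes that "0" marks the unknown parents of founders.

```julia
# Hypothetical stand-in for the idea behind orderped; works on plain vectors, not on a Pedigree.
function parents_first(member::Vector{String}, mother::Vector{String}, father::Vector{String})
    parents = Dict(m => (mo, fa) for (m, mo, fa) in zip(member, mother, father))
    ordered = String[]
    placed = Set(["0"])                     # "0" marks the unknown parent of a founder
    remaining = copy(member)
    while !isempty(remaining)
        # Members whose two parents have both been placed can be placed now.
        ready = filter(m -> all(in(placed), parents[m]), remaining)
        isempty(ready) && error("pedigree contains a cycle or an undefined parent")
        append!(ordered, ready)
        union!(placed, ready)
        filter!(m -> !(m in placed), remaining)
    end
    return ordered
end

order = parents_first(["BC1", "F1", "P1", "P2", "P3"],
                      ["F1", "P1", "0", "0", "0"],
                      ["P3", "P2", "0", "0", "0"])
```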
Pedigrees.readped
— Function
readped(pedfile::AbstractString, delim=',', commentstring="##")
read a pedigree from pedfile, ignoring the lines beginning with commentstring.
The pedigree file must contain columns: member, mother, and father. The mothers and fathers of founders must be denoted by 0. The value in the optional gender column must be "notapplicable", "female" or "male".
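To make the file layout concrete, the following sketch writes a small pedigree file and parses it by hand with Base Julia only. The file contents are made up for illustration, and the hand-rolled parsing is not how readped is implemented; it only shows which lines are skipped and which columns are expected.

```julia
# Write a small pedigree file in the layout described above (contents are made up).
pedfile = tempname() * ".csv"
write(pedfile, """
##example pedigree; lines starting with ## are comments
member,mother,father,gender
P1,0,0,notapplicable
P2,0,0,notapplicable
F1,P1,P2,notapplicable
""")

# Hand-rolled parse, only to make the format explicit.
lines = filter(l -> !isempty(l) && !startswith(l, "##"), readlines(pedfile))
header = split(lines[1], ',')
rows = [split(l, ',') for l in lines[2:end]]
# Founders are the members whose mother and father are both 0.
founders = [r[1] for r in rows if r[2] == "0" && r[3] == "0"]
```

With the Pedigrees package loaded, the same file would presumably be read with readped(pedfile).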
Pedigrees.saveped
— Function
saveped(filename::AbstractString,ped::Pedigree)
save pedigree into a csv file. The file contains columns: member, mother, father, gender, generation.
Pedigrees.plotped
— Functionplot(ped::Pedigree)
plot a pedigree.
MagicPrior.magicorigin
— Function
magicorigin(pedigree::Pedigree; kwargs...)
calculate expected ancestral probabilities and junction densities in pedigree.
Keyword arguments
memberlist::Union{Nothing,AbstractVector}=nothing
: a list of pedigree members. By default, memberlist contains the last pedigree member.
isautosome::Bool=true
: if true, the chromosome is an autosome rather than a sex chromosome.
isconcise::Bool=false
: if false, calculate quantities: phi12, R(a)(or R^m), R(b)(or R^p), ρ, J1112, J1121, J1122, J1211, J1213, J1222, J1232. If true, calculate only the first 4 quantities.
isfglexch::Bool=false
: if true, assume that founder genotype labels (FGLs) are exchangeable.
MagicPrior.magicprior
— Function
magicprior(pedigree::Pedigree,founderfgl::AbstractMatrix; kwargs...)
calculate prior distribution of recombination breakpoints in pedigree.
For each pedigree member in memberlist, return continuous-time Markov process parameter values: the initial probability vector and the rate matrix under three models, "depmodel", "indepmodel", and "jointmodel", which specify the relationship between the two ancestral processes along the two homologous chromosomes.
Positional arguments
pedigree::Pedigree
: pedigree struct.
isfounderinbred::Bool
: if true, founders are inbred, and otherwise outbred.
Keyword Arguments
memberlist::Union{Nothing,AbstractVector}=nothing
: a list of pedigree members.
isautosome::Bool=true
: if true, the chromosome is an autosome rather than a sex chromosome.
isfglexch::Bool=false
: if true, assume that founder genotype labels (FGLs) are exchangeable.
isconcise::Bool=false
: if isconcise is true, the parameter values for "jointmodel" are not calculated.
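As a rough illustration of how an initial probability vector and a rate matrix define a continuous-time Markov process along a chromosome, the sketch below computes transition probabilities over a genetic distance with a matrix exponential. The two-state setup and the rate value are made-up numbers, not output of magicprior, and the layout of the values returned by magicprior is not shown here.

```julia
using LinearAlgebra

# Illustrative values only: two ancestral states and a made-up rate per Morgan.
initprob = [0.5, 0.5]
rate = 2.0
Q = [-rate rate; rate -rate]       # rate matrix: each row sums to zero

d = 0.1                            # genetic distance (Morgan) between two loci
P = exp(Q * d)                     # transition probability matrix over distance d
nextprob = P' * initprob           # state distribution at the second locus
```

Each row of P sums to one, and because the initial vector is stationary for this symmetric rate matrix, the state distribution does not change along the chromosome.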
MagicBase
— Module
MagicBase
a package for basic data structures and functions for genetic analysis in connected multiparental populations.
MagicBase.JuncDist
— Type
JuncDist
struct that stores junction information, that is, the prior distribution of recombination breakpoints.
MagicBase.DesignInfo
— Type
DesignInfo
mutable struct that stores design information for a subpopulation. See also parsedesign.
Fields
designtype::Symbol
: type of the subpopulation design. It must be :commoncross, :breedcross, :juncdist, or :matescheme.
founders::Union{Nothing,AbstractVector}
: founders for the subpopulation.
designcode::Union{Nothing,AbstractString}
: string code for the design.
pedigree::Union{Nothing, Pedigree}
: pedigree for the design.
matescheme::Union{Nothing, MateScheme}
: mating scheme for the design.
juncdist::Union{Nothing, JuncDist}
: juncdist for the design.
MagicBase.MagicPed
— Type
MagicPed
mutable struct that stores pedigree information.
MagicPed(designinfo,founderinfo,offspringinfo)
inner constructor. See also readmagicped.
Fields
designinfo::Union{Nothing, Dict{String,DesignInfo},Pedigree}
: specifies the population designinfo. A designinfo::Pedigree specifies the designinfo as a pedigree. See also Pedigree. A designinfo::Dict{String,DesignInfo} specifies the designinfo for each subpopulation. See also DesignInfo.
founderinfo::DataFrame
: founder information.
offspringinfo::DataFrame
: offspring information. The column names are [:individual,:member,:ishomozygous,:isfglexch,:gender]. The :ishomozygous column specifies if the individual is homozygous. If design is set by juncdist::JuncDist, the :member column is set to "juncdist". If design is set by a string designcode::AbstractString, the :member column is set to the last non-male member of the pedigree. If design is set by designinfo::Pedigree, the :member column is associated to the :member column of the pedigree.
MagicBase.formmagicped
— Function
formmagicped(designinfo, popsize)
form magicped::MagicPed from designinfo for a non-subdivided population of size popsize.
formmagicped(pedigree, popsize)
form magicped::MagicPed from pedigree and popsize.
formmagicped(designcode, popsize)
form magicped::MagicPed from designcode and popsize.
formmagicped(juncdist, popsize)
form magicped::MagicPed from juncdist and popsize.
MagicBase.readmagicped
— Function
readmagicped(pedfile; commentstring="##",workdir=pwd())
read a CSV formatted pedfile and return magicped::MagicPed.
Positional arguments
pedfile::AbstractString
: file that stores the pedigree information: designinfo and offspringinfo. The designinfo can be provided in three formats: pedigree, mating design code, and junction distribution.
Keyword arguments
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored in pedfile.
workdir::AbstractString=pwd()
: directory for reading pedfile.
MagicBase.savemagicped
— Function
savemagicped(sink,magicped; delim=',',workdir=pwd())
save magicped into sink. See readmagicped for reading the saved output file.
Positional arguments
sink::Union{IO,AbstractString}
: output file or IO.
magicped::MagicPed
: a struct for storing pedigree information.
Keyword arguments
delim::AbstractChar=','
: delimiter character.
workdir::AbstractString=pwd()
: directory for writing the output file.
MagicBase.plotmagicped
— Function
plotmagicped(magicped; kwargs...)
plot magicped.
Keyword arguments
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise outbred.
outfile::Union{Nothing,AbstractString}=nothing
: if nothing, the plot is not saved; otherwise it is saved to outfile.
MagicBase.MagicGeno
— Type
MagicGeno
mutable struct that stores genotypic data.
MagicGeno(magicped,markermap,foundergeno,offspringgeno,deletion,correction)
inner constructor. See also formmagicgeno.
Fields
magicped::MagicPed
: breeding pedigree information. See also MagicPed.
markermap::Vector{DataFrame}
: marker map for each chromosome. markermap[c] gives the marker map of chromosome c.
foundergeno::Vector{Matrix}
: genotypic data in founders. foundergeno[ch][m,i] gives the genotype of founder i at marker m of chromosome ch.
offspringgeno::Vector{Matrix}
: genotypic data in offspring. offspringgeno[ch][m,i] gives the genotype of offspring i at marker m of chromosome ch.
misc::Dict{String, DataFrame}
: contains information such as (1) "deletion" dataframe for markers that were removed from markermap, (2) "correction" dataframe for parental error correction.
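The indexing convention for foundergeno and offspringgeno (one matrix per chromosome, markers in rows, individuals in columns) can be tried out on toy data. The genotype codes below are placeholders, not RABBIT's actual encoding, and no MagicGeno struct is constructed.

```julia
# Toy data: 2 chromosomes, 3 founders; genotype codes are placeholders.
foundergeno = [
    ["1" "2" "1"; "2" "2" "1"],                 # chromosome 1: 2 markers x 3 founders
    ["1" "1" "2"; "2" "1" "2"; "1" "2" "2"],    # chromosome 2: 3 markers x 3 founders
]

ch, m, i = 2, 3, 2
g = foundergeno[ch][m, i]                       # founder 2 at marker 3 of chromosome 2
nmarkers = [size(x, 1) for x in foundergeno]    # number of markers per chromosome
```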
MagicBase.formmagicgeno
— Function
formmagicgeno(genofile, nfounder;
isfounderinbred, formatpriority, isphysmap, recomrate, commentstring)
form magicgeno::MagicGeno from the genofile and the number nfounder of founders. Assume that the founders' columns are to the left of the offspring columns.
formmagicgeno(genofile, pedinfo;
formatpriority, isphysmap, recomrate, commentstring)
form magicgeno from the genofile and the pedigree information pedinfo.
Positional arguments
genofile::AbstractString
: genotypic data file with extension ".vcf" or ".vcf.gz".
pedinfo::AbstractString
: designcode or pedfile.
pedfile
: see readmagicped for the format of a pedigree file.
designcode
: a string designcode for a breeding population. See parsedesign for details.
Keyword arguments
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise outbred.
formatpriority::AbstractVector=["AD","GT"]
: the priority of genotype formats when parsing the input vcf genofile.
isphysmap::Bool=false
: if true, transform the physical map into a genetic map using recomrate and overwrite any existing genetic map. If false, keep the input physical and/or genetic map.
recomrate::Real=1.0
: average recombination rate in cM per Mbp. Valid only if isphysmap = true.
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored in genofile or pedfile.
workdir::AbstractString=pwd()
: directory for reading genofile and pedfile.
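The effect of isphysmap and recomrate can be pictured as a linear rescaling of physical positions. The sketch below assumes the simplest possible conversion, cM = (bp / 10^6) × recomrate; whether formmagicgeno applies exactly this formula (for example, relative to the first marker of each chromosome) is not stated here, so treat it as an assumption.

```julia
# Assumed linear conversion: cM = (bp / 1e6) * recomrate; the package may differ in detail.
physposbp = [150_000, 1_200_000, 3_500_000]   # made-up physical positions in bp
recomrate = 1.0                                # cM per Mbp, the documented default
poscm = physposbp ./ 1e6 .* recomrate          # genetic positions in cM
```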
MagicBase.savemagicgeno
— Function
savemagicgeno(outfile,magicgeno,workdir=pwd(),delim=',')
save genotypic data of magicgeno::MagicGeno into outfile.
Positional arguments
outfile::AbstractString
: filename for saving results.
magicgeno::MagicGeno
: a struct returned by formmagicgeno.
Keyword arguments
missingstring::AbstractString = "NA"
: string for missing values.
workdir::AbstractString = pwd()
: directory for writing outfile.
delim::AbstractChar=','
: delimiter character.
MagicBase.savegenodata
— Function
savegenodata(outfile,magicgeno; missingstring="NA",workdir=pwd(),delim=',')
save genotypic data of magicgeno::MagicGeno into outfile.
Positional arguments
outfile::AbstractString
: output genofile for saving genotypic data.
magicgeno::MagicGeno
: a struct returned by formmagicgeno.
Keyword arguments
workdir::AbstractString = pwd()
: directory for writing outfile.
delim::AbstractChar=','
: delimiter character.
MagicBase.MagicAncestry
— Type
MagicAncestry
mutable struct that stores the results of haplotype reconstruction by magicreconstruct.
MagicAncestry(magicped,markermap,foundergeno,statespace,
viterbipath, diploprob,genoprob,haploprob,loglike,misc)
inner constructor. See also readmagicancestry.
Fields
magicped::MagicPed
: breeding pedigree information. See also MagicPed.
markermap::Vector{DataFrame}
: marker map for each chromosome. markermap[c] gives the marker map of chromosome c.
foundergeno::Vector{Matrix}
: haplotypes in founders. foundergeno[c][m,f] gives the haplotype of founder f at marker m of chromosome c.
statespace::AbstractDict
: definition of ancestral haplotype states corresponding to haploprob.
viterbipath::Union{Nothing,Vector{Matrix}}
: optimal state path obtained by the Viterbi algorithm. viterbipath[c][m,o] gives the ancestral diplotype state of offspring o at marker m of chromosome c.
diploprob::Union{Nothing,Vector{Vector{SparseArrays.SparseMatrixCSC}}}
: marginal posterior probabilities for diplotypes. diploprob[c][o][m,s] gives the probability of offspring o being in ancestral diplotype state s at marker m of chromosome c.
genoprob::Union{Nothing,Vector{Vector{SparseArrays.SparseMatrixCSC}}}
: marginal posterior probabilities for genotypes. genoprob[c][o][m,s] gives the probability of offspring o being in ancestral genotype state s at marker m of chromosome c.
haploprob::Union{Nothing,Vector{Vector{SparseArrays.SparseMatrixCSC}}}
: marginal posterior probabilities for haplotypes. haploprob[c][o][m,s] gives the probability of offspring o being in ancestral haplotype state s at marker m of chromosome c.
inbredcoef::Union{Nothing,Vector{Matrix}}
: realized inbreeding coefficients. inbredcoef[c][m,o] gives the inbreeding coefficient of offspring o at marker m of chromosome c.
loglike::AbstractMatrix
: loglike[c,o] gives the log likelihood for chromosome c of offspring o.
misc::Dict{String, DataFrame}
: contains information such as (1) "deletion" dataframe for markers that were removed from markermap, (2) "correction" dataframe for parental error correction.
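The nested layout of the probability fields (chromosome, then offspring, then a sparse markers × states matrix) can be illustrated with a toy array built directly with SparseArrays. The numbers are invented and the array is not produced by magicreconstruct; the sketch only shows how the documented indexing works and how a most probable state per marker could be read off.

```julia
using SparseArrays

# Toy posterior for one offspring on one chromosome: 3 markers x 4 ancestral states.
prob = sparse([1, 1, 2, 3], [1, 2, 2, 4], [0.9, 0.1, 1.0, 1.0], 3, 4)
haploprob = [[prob]]                   # haploprob[c][o] is a markers x states matrix

c, o = 1, 1
p = haploprob[c][o][1, 2]              # probability of state 2 at marker 1
# Most probable ancestral state at each marker.
beststate = [argmax(haploprob[c][o][m, :]) for m in 1:3]
```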
MagicBase.readmagicancestry
— Function
readmagicancestry(ancestryfile, workdir=pwd(),tempdirectory=tempdir())
read ancestryfile in the directory workdir and return magicancestry::MagicAncestry. See MagicAncestry.
Positional argument
ancestryfile
: file storing magicancestry that is generated by savemagicancestry. It results from magicreconstruct.
Keyword arguments
workdir::AbstractString = pwd()
: directory for reading ancestryfile.
tempdirectory::AbstractString=tempdir()
: temporary directory.
MagicBase.savemagicancestry
— Function
savemagicancestry(outputfile,magicancestry,workdir=pwd())
save magicancestry into outputfile.
Positional arguments
outputfile::AbstractString
: file for saving magicancestry.
magicancestry::MagicAncestry
: a struct returned by magicreconstruct.
Keyword arguments
workdir::AbstractString = pwd()
: directory for writing outputfile.
MagicBase.parsedesign
— Function
parsedesign(designcode; kwargs...)
parse string designcode into DesignInfo.
Keyword arguments
founders::Union{Nothing,AbstractVector}=nothing
: a list of founders.
popid="pop"
: population id.
fixed_nself::Integer=20
: number of selfing generations for a designcode in the form pedcode=>FIXED.
MagicBase.parsebreedped
— Function
parsebreedped(pedfile; fixed_nself=10, outfile, commentstring="##",workdir=pwd())
convert a breedped pedfile into the magicped outfile. The first 3 columns of breedpedfile must be sample, pedcode, nself.
Keyword arguments
fixed_nself::Integer = 20
: interpret "FIXED" in the nself column as 20.
delim::AbstractChar=','
: text delimiter of the input pedfile.
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored in pedfile.
workdir::AbstractString=pwd()
: directory for reading pedfile.
MagicBase.resetmap
— Function
resetmap(vcffile,mapfile;
missingstring, commentstring, outstem, workdir)
export a new vcf file with the marker map replaced by the map in mapfile.
Positional arguments
vcffile::AbstractString
: genotypic data file with extension ".vcf" or ".vcf.gz".
mapfile::AbstractString
: file for the marker map, either in VCF format or in CSV format. In CSV format, it must contain at least five columns: marker, linkagegroup, poscm, physchrom, physposbp. Missing values are represented by missingstring.
Keyword arguments
missingstring::AbstractString="NA"
: string representing missing value.
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored in vcffile or mapfile.
outstem::Union{Nothing,AbstractString}="outstem"
: stem of output filenames.
workdir::AbstractString=pwd()
: directory for reading vcffile and mapfile.
MagicBase.rabbitped_mma2jl
— Function
rabbitped_mma2jl(mmapedfile; kwargs...)
convert input pedfile of Mathematica-version RABBIT into pedfile for Julia-version RABBIT.
Keyword arguments
ishomozygous::Bool = false
: if true, offspring is homozygous.
isfglexch::Bool = false
: if true, offspring is produced with random parent ordering.
workdir::AbstractString=pwd()
: working directory for input and output files.
outfile::AbstractString = "outstem_ped.csv"
: output filename.
MagicBase.rabbitgeno_mma2jl
— Function
rabbitgeno_mma2jl(mmagenofile; kwargs...)
convert input genofile of Mathematica-version RABBIT into genofile for Julia-version RABBIT.
Keyword arguments
workdir::AbstractString=pwd()
: working directory for input and output files.
outfile::AbstractString = "outstem_geno.vcf.gz"
: output filename.
MagicBase.arrayfile2vcf
— Function
arrayfile2vcf(arrayfile; keyargs...)
convert a SNP array genofile into a vcf genofile.
Positional arguments
arrayfile::AbstractString
: SNP array genofile.
Keyword arguments
delmultiallelic::Bool = true
: if true, delete markers with #alleles >= 3.
delim= ","
: text delimiter.
outstem::AbstractString="outstem"
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
logfile::Union{Nothing,AbstractString,IO}= outstem*"_arrayfile2vcf.log"
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
MagicBase.hapmap2vcf
— Function
hapmap2vcf(hapmapfile; keyargs...)
convert from a hapmap genofile into a vcf genofile.
Positional arguments
hapmapfile::AbstractString
: hapmap genofile.
Keyword arguments
delmultiallelic::Bool = true
: if true, delete markers with #alleles >= 3.
delim= ","
: text delimiter.
missingallele::AbstractString="-"
: string for missing allele
outstem::AbstractString="outstem"
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
logfile::Union{Nothing,AbstractString,IO}= outstem*"_arrayfile2vcf.log"
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
MagicBase.merge_vcffiles
— Function
merge_vcffiles(vcffiles; outstem, workdir)
merge vcffiles into a single vcf genofile.
Positional arguments
vcffiles::AbstractVector
: a list of vcf genofiles.
Keyword arguments
outstem::AbstractString="outstem"
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
MagicBase.merge_arrayfiles
— Function
merge_arrayfiles(arrayfiles; outstem, workdir)
merge arrayfiles into a single SNP array genofile.
Positional arguments
arrayfiles::AbstractVector
: a list of SNP array genofiles.
Keyword arguments
missingallele::AbstractString = "-"
: string for missing allele.
outstem::AbstractString="outstem"
: stem of output filename.
outext::AbstractString=".csv.gz"
: extension of output file.
workdir::AbstractString=pwd()
: directory for reading and saving files.
logfile::Union{Nothing,AbstractString,IO}= outstem*"_merge_arrayfiles.log"
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
MagicBase.merge_pedfiles
— Function
merge_pedfiles(pedfiles; keyargs...)
merge pedfiles into a single pedfile.
Positional arguments
pedfiles::AbstractString
: a list of pedfiles.
Keyword arguments
isped::Bool
: if true, designinfo in pedfiles is in form of pedigree, and otherwise designcode.
isfounderinbred::Bool=true
: if true, founders are inbred.
outstem::AbstractString=popid
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
MagicBase.array_extract_pedfile
— Function
array_extract_pedfile(arrayfile; keyargs...)
extract pedfile from arrayfile. Work only for a non-subdivided population.
Positional arguments
arrayfile::AbstractString
: SNP array genofile.
Keyword arguments
designcode::AbstractString
: designcode
popid::AbstractString
: population id.
delim= ","
: text delimiter.
outstem::AbstractString=popid
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
MagicBase.vcf_pad_samples
— Function
vcf_pad_samples(vcffile; keyargs...)
pad samples into vcffile, with all of their genotypes set to missing.
Positional arguments
vcffile::AbstractString
: vcf genofile.
Keyword arguments
padsamples::AbstractVector
: a list of samples to be padded
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored.
outstem::AbstractString=popid
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
MagicBase.vcf_extract_pedfile
— Function
vcf_extract_pedfile(vcffile; keyargs...)
extract pedfile from vcffile. Work only for a non-subdivided population.
Positional arguments
vcffile::AbstractString
: vcf genofile.
Keyword arguments
designcode::AbstractString
: designcode
ishomozygous::Bool=false
: specify if offspring are completely homozygous.
isfglexch::Bool=true
: specify if founders are exchangeable.
popid::AbstractString="pop"
: population id.
outstem::AbstractString="outstem"
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
MagicBase.vcffilter
— Function
vcffilter(vcffile; keyargs...)
filter markers line by line.
Positional arguments
vcffile::AbstractString
: vcf genofile.
Keyword arguments
setmarkerid::Union{Nothing,Bool}=nothing
: if true, set markerid. If it is nothing, setmarkerid = true only if markerid is missing.
delsamples::Union{Nothing,AbstractVector}=nothing
: list of sample IDs to be deleted. If it is nothing, no samples are deleted.
deldupe::Bool=false
: if true, delete successive markers that have exactly duplicated genotypes in GT format.
delmultiallelic::Bool=true
: if true, delete markers with >2 alleles.
delmonomorphic::Bool=true
: if true, delete markers with a single allele.
seqstretch::Integer=0
: delete non-initial markers in a sequence stretch of length <= seqstretch (in bp), assuming markers are ordered by physical position. If it is not positive, short stretches are not filtered.
snp_maxmiss::Real = 0.99
: delete markers with missing fraction > snp_maxmiss.
snp_minmaf::Real = 0.01
: delete markers with minor allele frequency < snp_minmaf.
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored.
outstem::AbstractString=popid
: stem of output filename.
workdir::AbstractString=pwd()
: directory for reading and saving files.
logfile::AbstractString = outstem*"_vcffilter.log"
: log filename.
verbose::Bool=true
: if true, print details on the stdout.
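The snp_maxmiss and snp_minmaf thresholds can be illustrated on a toy matrix of GT calls. This is a sketch of the two rules as worded above, not the code used by vcffilter: `keepmarker` is a hypothetical helper, only biallelic unphased calls with "./." for missing are handled, and the thresholds are chosen for the toy data rather than taken from the defaults.

```julia
# Toy GT calls, one row per marker and one column per sample; "./." is a missing call.
gt = ["0/0" "0/1" "./." "1/1";
      "0/0" "0/0" "0/0" "0/0";
      "./." "./." "./." "0/1"]

snp_maxmiss, snp_minmaf = 0.5, 0.01    # thresholds chosen for this toy example only

# Hypothetical helper: true if the marker passes both filters.
function keepmarker(calls; maxmiss, minmaf)
    alleles = [a for c in calls for a in split(c, '/') if a != "."]
    missfrac = count(==("./."), calls) / length(calls)
    missfrac > maxmiss && return false          # too many missing genotypes
    isempty(alleles) && return false
    altfreq = count(==("1"), alleles) / length(alleles)
    return min(altfreq, 1 - altfreq) >= minmaf  # minor allele frequency check
end

keep = [keepmarker(gt[m, :]; maxmiss=snp_maxmiss, minmaf=snp_minmaf) for m in 1:size(gt, 1)]
```

The second marker is dropped because it is monomorphic (minor allele frequency 0), and the third because three of its four calls are missing.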
MagicBase.pedfile_designcode2ped
— Function
pedfile_designcode2ped(pedfile; commentstring='#',workdir=pwd())
convert a pedfile whose designinfo is a designcode into a pedfile whose designinfo is a pedigree.
Keyword arguments
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise outbred.
commentstring::AbstractString="##"
: the lines beginning with commentstring are ignored in pedfile.
workdir::AbstractString=pwd()
: directory for reading pedfile.
MagicBase.plotcondprob
— Function
plotcondprob(magicancestry,offspring=nothing,probtype="haploprob",
colorgradient = cgrad([:white,:blue,:red]),
boundaryline = (1.5, :gray),
truemarker=(:star, 5, 0.5,:gray,stroke(:gray)),
truefgl=nothing,
outfile::Union{Nothing, AbstractString}=nothing,
plotkeyargs...)
plot heatmap for conditional probability.
Positional arguments
magicancestry::MagicAncestry
: magicancestry returned from magicreconstruct.
Keyword arguments
probtype::AbstractString="haploprob"
: specify type of condprob.
offspring::Union{Nothing,Integer}=nothing
: offspring index. By default, a random offspring index.
colorgradient::ColorGradient=cgrad([:white,:blue,:red])
: color gradient for heatmap.
boundaryline=(1.5,:gray)
: vertical lines for chromosome boundaries.
truemarker=(:star, 5, 0.5,:gray,stroke(:gray))
: scatter markers for true ancestral states.
truefgl::Union{Nothing,MagicGeno}=nothing
: provides true ancestral origins.
outfile::Union{Nothing, AbstractString}=nothing
: if nothing, the plot is not saved; otherwise it is saved to outfile.
plotkeyargs...
: other Plots.plot keyword arguments.
MagicBase.animcondprob
— Function
animcondprob(magicancestry,fps=1,outstem="",kwargs...)
animation for plots of conditional probability.
Positional arguments
magicancestry::MagicAncestry
: magicancestry returned from magicreconstruct.
Keyword arguments
fps::Real=1
: number of frames per second.
outfile::AbstractString="condprob.gif"
: output file for saving animation.
See plotcondprob for other keyword arguments.
MagicBase.plotmarkermap
— Function
plotmarkermap(mapx, mapy;
boundaryline = (1.5,:dot,:black),
markersize = 1.5,
isannotate= true,
maplabels=["mapx(cM)", "mapy(cM)"],
isphysmap = [false,false],
plotkeyargs...
)
plot positions of mapx vs those of mapy.
Positional arguments
mapx::Vector{DataFrame}
: marker map for all chromosomes.
mapy::Vector{DataFrame}
: comparing map for all chromosomes.
Keyword arguments
boundaryline = (1.0,:dot,:gray)
: vertical lines for chromosome boundaries.
markersize::Real= size(mapx,1) <= 1000 ? 3.0 : 1.5
: size of scatter markers.
isannotate::Bool=true
: if true, annotate chromosome ID and Kendall correlation.
maplabels::Union{Nothing,AbstractString}=nothing
: labels of comparing marker maps.
isphysmap::AbstractVector = falses(2)
: specify if mapx and/or mapy are physical maps.
plotkeyargs...
: other Plots.plot keyword arguments.
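The Kendall correlation used in the annotation measures how well two maps agree on marker order. A plain O(n²) version (tau-a, without tie correction) is sketched below for illustration; `kendalltau` is a hypothetical helper, the positions are invented, and plotmarkermap may compute a different variant.

```julia
# Kendall tau-a over all marker pairs; a plain O(n^2) version for illustration.
function kendalltau(x::AbstractVector, y::AbstractVector)
    n = length(x)
    s = 0.0
    for i in 1:n-1, j in i+1:n
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
        s += sign(x[i] - x[j]) * sign(y[i] - y[j])
    end
    return s / (n * (n - 1) / 2)
end

mapx = [0.0, 1.5, 3.0, 4.2, 6.0]     # positions (cM) of five markers in one map
mapy = [0.0, 2.0, 1.0, 5.0, 7.5]     # the same markers in a second map, one pair swapped
tau = kendalltau(mapx, mapy)
```

With one discordant pair out of ten, tau is (9 - 1) / 10 = 0.8.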
MagicBase.plotrecombreak
— Function
plotrecombreak(magicancestry,chr=1,
colorgradient = ColorGradient([:yellow,:blue,:red]),
truefgl=nothing)
plot recombination breakpoints.
Positional arguments
magicancestry::MagicAncestry
: magicancestry returned from magicreconstruct.
Keyword arguments
chr::Integer=1
: chromosome index.
colorgradient::ColorGradient=cgrad([:white,:blue,:red])
: color gradient for heatmap
truefgl::Union{Nothing,MagicGeno}
: provides true ancestral origins.
MagicSimulate
— Module
MagicSimulate
a package for simulating genotyping data in multiparental populations. See also magicsimulate.
MagicSimulate.simfhaplo
— Function
simfhaplo(; kwargs...)
simulate founder haplotypes.
Keyword arguments
nsnp
: number of markers.
nparent
: number of parents.
missingstring="NA"
: string representing missing value.
chrlen::AbstractVector = 100 * ones(5)
: specify genetic length of each chromosome.
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise outbred.
outfile = "sim_fhaplo.vcf.gz"
: output filename.
workdir::AbstractString = pwd()
specifies the working directory.
MagicSimulate.magicsimulate
— Function
magicsimulate(fhaplofile, pedinfo; kwargs...)
simulate genotypic data from founder haplotypes in fhaplofile and pedigree information in pedinfo.
Positional arguments
fhaplofile::AbstractString
specifies file for founder haplotypes including marker map. The fhaplofile extension must be ".vcf" or ".csv".
pedinfo::Union{AbstractString,MateScheme}
specifies pedigree information via a matescheme (e.g. MateScheme(8,["Pairing","Selfing"],[3,6])), or via a pedigree file (if pedinfo[end-3:end]==".csv"), or via a string designcode (e.g. "8ril-self6").
Keyword arguments
isfounderinbred::Bool=true
if true, the founders are inbred. For inbred founders, the heterozygous genotypes are set to missing. For outbred founders, genotypes must be phased.
popsize::Union{Nothing,Integer}=200
specifies the population size, valid only if the pedinfo
is specified via a designcode or matescheme.
error_randallele::Union{Nothing,Real}=0.0
specifies the genotyping error model. The error model follows the uniform allele model with probability error_randallele, and it follows the uniform genotype model with probability 1-error_randallele. If it is nothing, error_randallele is given by the internally estimated non-ibd probability.
foundererror::Distribution=Beta(1,199)
specifies the probability distribution of the genotyping error rate at a marker in founders. At a given marker, founders have the same error rate.
offspringerror::Distribution=Beta(1,199)
specifies the probability distribution of the genotyping error rate at a marker in offspring. At a given marker, offspring have the same error rate.
foundermiss::Distribution=Beta(1,9)
specifies the probability distribution of the fraction of missing genotypes at a marker in founders.
offspringmiss::Distribution=Beta(1,9)
specifies the probability distribution of the fraction of missing genotypes at a marker in offspring.
seqfrac::Real=0.0
specifies the fraction of markers genotyped by sequencing; the remaining markers are genotyped by SNP array.
seqerror::Distribution = Beta(1,199)
specifies the probability distribution of the sequencing error rate among markers.
allelebalancemean::Distribution = Beta(10,10)
specifies the probability distribution of the mean sequencing allelic balance among markers.
allelebalancedisperse::Distribution = Exponential(0.05)
specifies overdispersion parameter for the probability distribution of allelebalancemean among individuals at a marker.
seqdepth::Distribution = Gamma(2, 5)
specifies the probability distribution of the mean read depth at a marker.
seqdepth_overdispersion::Distribution = Gamma(1,1)
specifies the probability distribution of the over-dispersion of read depths among individuals at a marker. Given the mean depth lam at a marker, the read depth of an individual follows a NegativeBinomial(r,p) such that the mean is lam = r(1-p)/p and the variance is r(1-p)/p^2 = lam(1+lam/r) = lam(1+seqdepth_overdispersion), where seqdepth_overdispersion = lam/r. seqdepth_overdispersion = 0 denotes no over-dispersion.
isobligate::Bool=false
specifies whether there must be at least one crossover event
interference::Integer=0
specifies the strength of chiasma interference. By default, no chiasma interference. The recombination breakpoints are obtained by taking every (1+interference)-th point among points that follow a Poisson process along the chromosome.
ispheno::Bool=false
specifies whether to simulate phenotypes.
pheno_nqtl::Integer=1
specifies the number of QTLs in simulating phenotypes.
pheno_h2::Real= 0.5
specifies heritability in simulating phenotypes.
select_nqtl::Integer=1
specifies the number of QTLs in simulating the trait for artificial selection.
select_prop::Real = 1.0
specifies the proportion of zygotes selected in artificial selection. By default, no artificial selection.
outstem::Union{Nothing,AbstractString}="outstem"
specifies the stem of output filenames.
workdir::AbstractString = pwd()
specifies the working directory.
verbose::Bool=true
: if true, print details on the stdout.
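The negative binomial parameterization given for seqdepth_overdispersion can be checked numerically. The sketch below uses made-up values for the mean depth and the over-dispersion, derives r and p from them, and evaluates the standard mean and variance of NegativeBinomial(r,p), for which the variance equals lam(1 + lam/r). It uses plain arithmetic rather than Distributions.jl.

```julia
lam = 10.0                 # mean read depth at a marker (made-up value)
overdispersion = 0.5       # plays the role of seqdepth_overdispersion = lam / r

r = lam / overdispersion   # number-of-successes parameter
p = r / (r + lam)          # solves lam = r * (1 - p) / p

meandepth = r * (1 - p) / p        # recovers lam
vardepth = r * (1 - p) / p^2       # equals lam * (1 + overdispersion)
```

As the over-dispersion goes to zero, r grows without bound and the variance approaches the mean, which is the Poisson limit.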
Examples
julia> magicsimulate("fhaplo.vcf.gz","ped.csv")
julia> magicsimulate("fhaplo.vcf.gz","8ril-self6"; popsize=800)
magicsimulate(pedinfo; kwargs...)
simulates ancestral blocks from pedinfo.
Positional arguments
pedinfo::Union{AbstractString,MateScheme}
specifies pedigree information via a matescheme (e.g. MateScheme(8,["Pairing","Selfing"],[3,6])), or via a pedigree file (if pedinfo[end-3:end]==".csv"), or via a string designcode (e.g. "8ril-self6").
Keyword arguments
isfounderinbred::Bool=true
if true, the founders are inbred. For inbred founders, the heterozygous genotypes are set to missing. For outbred founders, genotypes must be phased.
popsize::Union{Nothing,Integer}=200
specifies the population size, valid only if the pedinfo
is specified via a designcode or matescheme.
chrlen::AbstractVector= 100*ones(5)
specifies lengths (cM) for each chromosome.
isobligate::Bool=false
specifies whether there must be at least one crossover event
interference::Integer=0
specifies the strength of chiasma interference. By default, no chiasma interference. The recombination breakpoints are obtained by taking every (1+interference)-th point among points that follow a Poisson process along the chromosome.
outstem::Union{Nothing,AbstractString}="outstem"
specifies the stem of output filenames.
workdir::AbstractString = pwd()
specifies the working directory.
verbose::Bool=true
: if true, print details on the stdout.
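The thinning rule for interference can be sketched in a few lines: points are laid down as a Poisson process along the chromosome and every (1+interference)-th point is kept as a breakpoint. The function `breakpoints` below is a hypothetical illustration, not the simulator's code. It assumes that the point intensity is scaled by (1+interference) so that the kept breakpoints still average one per Morgan, and it always starts counting from the first point rather than from a random offset.

```julia
using Random

# Hypothetical illustration of the thinning rule; intensity scaling is an assumption.
function breakpoints(rng, chrlen_cm, interference)
    step = 1 + interference
    points = Float64[]
    pos = 0.0
    while true
        pos += -log(1 - rand(rng)) * 100 / step     # exponential gap between points, in cM
        pos > chrlen_cm && break
        push!(points, pos)
    end
    return points[step:step:end]                    # keep every (1 + interference)-th point
end

rng = MersenneTwister(2024)
bp0 = breakpoints(rng, 100.0, 0)    # no interference: every point is a breakpoint
bp2 = breakpoints(rng, 100.0, 2)    # interference = 2: keep every third point
```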
Outputs
Output file | Description |
---|---|
outstem_ped.csv | simulated pedigree file |
outstem_truecontfgl.csv | true values of continuous origin-genotypes |
Here fgl denotes founder genome labels, and origin-genotypes denote genotypes with each fgl being regarded as a distinct allele.
Examples
julia> magicsimulate("8ril-self6"; popsize=800)
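The interference keyword above describes breakpoints as every (1+interference)-th point of a Poisson process. The following is a minimal sketch of that thinning idea only. The helper thinned_breakpoints and its rate parameter are hypothetical and are not part of RABBIT; how RABBIT scales the underlying rate or chooses the starting point is not stated in this documentation.

```julia
using Random

# Hypothetical helper (not part of RABBIT): simulate points on one chromosome
# and keep every (1 + interference)-th point of a Poisson process.
function thinned_breakpoints(rng::AbstractRNG, chrlen::Real, interference::Integer;
                             rate::Real=0.05)
    points = Float64[]
    pos = randexp(rng) / rate            # exponential gaps generate a Poisson process
    while pos <= chrlen
        push!(points, pos)
        pos += randexp(rng) / rate
    end
    step = 1 + interference
    return points[step:step:end]         # interference = 0 keeps every point
end

# Same seed twice, so both calls see the same underlying Poisson points.
all_points  = thinned_breakpoints(MersenneTwister(2024), 100.0, 0)
kept_points = thinned_breakpoints(MersenneTwister(2024), 100.0, 1)
println(length(all_points), " points before thinning, ", length(kept_points), " kept")
```

With interference = 1 every second point is kept, so neighbouring breakpoints tend to be spaced more evenly than under a plain Poisson process.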
MagicFilter
— ModuleMagicFilter
a package for filtering markers and individuals in connected multiparental populations. Exports one function: magicfilter.
MagicFilter.magicfilter
— Functionmagicfilter(genofile, pedinfo;
formatpriority, isphysmap, recomrate, commentstring,kwargs...)
filters markers and founders/offspring from genofile and pedinfo.
Positional arguments
genofile::AbstractString
genotypic data file.
pedinfo::Union{MagicBase.JuncDist,AbstractString}
specifies pedigree information via a pedigree file, a string designcode, or a struct juncdist::JuncDist.
Keyword arguments
See formmagicgeno for the arguments (formatpriority, isphysmap, recomrate, commentstring) that are used for forming magicgeno. Note that formatpriority=["AD","GT"] by default.
See magicfilter! for kwargs.
Examples
julia> magicfilter("geno.vcf.gz","4ril_self3")
MagicFilter.magicfilter!
— Functionmagicfilter!(magicgeno::MagicGeno; kwargs...)
removes bad markers and founders/offspring from magicgeno.
Keyword arguments
model::AbstractString="jointmodel"
: prior dependence of ancestral prior process along the two homologous chromosomes within an offspring. It must be "depmodel", "indepmodel", or "jointmodel".
likeparameters::LikeParameters=LikeParameters()
: specifies default genotyping error rates.
isfounderinbred::Bool=true
: specifies whether founders are inbred.
chrsubset::Union{Nothing,AbstractRange,AbstractVector}=nothing
: subset of chromosome indices. nothing
denotes all chromosomes. Delete chromosome indices that are out of range.
snpsubset::Union{Nothing,AbstractRange,AbstractVector}=nothing
: subset of marker indices within each chromosome. nothing
denotes all markers. Marker indices that are larger than the number of markers within the chromosome are deleted.
threshcall::Real = model == "depmodel" ? 0.95 : 0.9
: threshold for genotype calling. The filtering is based on called genotypes.
min_subpop::Integer = 1
: delete subpopulations with size < min_subpop.
min_nprogeny::Integer = 1
: delete founder and their progeny if the number of progeny < min_nprogeny.
snp_monosubpop::Integer = 20
: a subpopulation is tested for being monomorphic at a marker only if the number of observed genotypes >= snp_monosubpop.
snp_mono2miss::Union{Nothing,Bool} = true
: if true, all offspring genotypes in a monomorphic subpopulation are set to missing, and otherwise only inconsistent offspring genotypes are corrected. And if nothing, offspring genotypes are not changed.
del_inconsistent::Bool = false
: if true, delete markers with inconsistent changes of founder genotypes.
snp_minmaf::Real = 0.05
: keep only markers with maf >= snp_minmaf; maf denotes minor allele frequency.
snp_missfilter::Function=(fmiss,omiss)-> omiss <= 1.0 || fmiss < 0.0
: keep only markers if snp_missfilter(fmiss, omiss); fmiss denotes missing fraction in founders, and omiss for offspring.
offspring_maxmiss::Real = 0.99
: delete an offspring if its missing fraction > offspring_maxmiss.
isfilterdupe::Bool=false
: if true, remove duplicated offspring by their correlations.
offspring_maxcorr::Real = 0.99
: two offspring are duplicated if their correlation >= offspring_maxcorr.
offspring_cutcorr::Real = 0.4
: pairwise offspring correlations that are < offspring_cutcorr are set to zero.
isparallel::Bool=true
: if true, perform parallel multicore computing.
workdir::AbstractString=pwd()
: working directory for input and output files.
outstem::Union{Nothing,AbstractString}="outstem"
specifies the stem of output files.
logfile::Union{Nothing,AbstractString,IO}= (isnothing(outstem) ? nothing : string(outstem,"_magicreconstruct.log"))
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
Examples
julia> magicgeno = formmagicgeno("geno.vcf.gz","ped.csv")
julia> magicfilter!(magicgeno)
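The snp_missfilter keyword takes a two-argument predicate on the missing fractions in founders (fmiss) and offspring (omiss). The sketch below evaluates the documented default next to a stricter, purely illustrative alternative; the marker values and the strict thresholds are made up. Since both fractions lie in [0, 1], the default predicate keeps every marker.

```julia
# Default predicate quoted in the keyword list: true for any fractions in [0, 1].
default_missfilter = (fmiss, omiss) -> omiss <= 1.0 || fmiss < 0.0

# Illustrative stricter predicate (thresholds are arbitrary choices).
strict_missfilter = (fmiss, omiss) -> omiss <= 0.5 && fmiss <= 0.25

# (fmiss, omiss) for three made-up markers
markers = [(0.0, 0.1), (0.5, 0.2), (0.0, 0.9)]

keep_default = [default_missfilter(f, o) for (f, o) in markers]
keep_strict  = [strict_missfilter(f, o) for (f, o) in markers]
println("default keeps: ", keep_default)
println("strict keeps:  ", keep_strict)
```

A predicate of this form would be passed as magicfilter!(magicgeno; snp_missfilter=strict_missfilter).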
MagicCall
— ModuleMagicCall
a package for single-site genotype calling from sequence data in connected multiparental populations. Exports one function: magiccall.
MagicCall.magiccall
— Functionmagiccall(genofile, pedinfo; kwargs...)
single-marker genotype calling from genofile and pedinfo.
Positional arguments
genofile::AbstractString
genotypic data file.
pedinfo::Union{MagicBase.JuncDist,AbstractString}
specifies pedigree information via a pedigree file, a string designcode, or a struct juncdist::JuncDist.
Keyword arguments
model::Union{AbstractString,AbstractVector}="jointmodel"
: prior dependence of ancestral prior process along the two homologous chromosomes within an offspring. It must be "depmodel", "indepmodel", or "jointmodel".
likeparameters::LikeParameters=LikeParameters(peroffspringerror=0.0)
: parameters for genotypic data model. If isinfererror = true, parameters with values being nothing will be inferred.
threshlikeparameters::ThreshLikeParameters=ThreshLikeParameters()
: markers with inferred likeparameters values > threshlikeparameters values will be deleted.
priorlikeparameters::PriorLikeParameters=PriorLikeParameters(offspringerror=Beta(1.05,9),seqerror=Beta(1.05,9))
: priors for likelihood parameters
israndallele::Bool=true
: if true, genotyping error model follows the random allelic model, and otherwise the random genotypic model.
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise they are outbred.
threshcall::Real = 0.9
: offspring genotypes are called if the maximum posterior probability > threshcall.
delmultiallelic::Bool=true
: if true, delete markers with >=3 alleles.
delmonomorphic::Bool=true
: if true, delete monomorphic markers.
snp_minmaf::Real = 0.05
: delete markers with minor allele frequency (MAF) < snp_minmaf.
snp_maxmiss::Real = 0.99
: delete markers with genotype missing frequency > snp_maxmiss.
israwcall::Bool= false
: if true, perform raw genotype calling.
isinfererror::Bool = !israwcall
: if true, infer marker specific likelihood parameters that have values of nothing in likeparameters.
isparallel::Bool=true
: if true, parallel multicore computing over chromosomes.
outstem::Union{Nothing,AbstractString}="outstem"
: stem of output filenames.
outext::AbstractString=".vcf.gz"
: extension of output file for imputed geno.
logfile::Union{Nothing, AbstractString,IO} = outstem*"_magiccall.log"
: log file or IO for writing log. If it is nothing, no log file.
workdir::AbstractString=pwd()
: working directory for input and output files.
verbose::Bool=true
: if true, print details on the stdout.
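The threshcall rule above (call a genotype only if its maximum posterior probability exceeds the threshold) can be sketched as follows. The helper call_genotype and the posterior values are hypothetical illustrations, not RABBIT functions or outputs.

```julia
# Hypothetical helper (not part of RABBIT): return the most probable genotype
# if its posterior probability exceeds threshcall, and missing otherwise.
function call_genotype(genotypes::Vector{String}, posterior::Vector{Float64}, threshcall::Real)
    pmax, imax = findmax(posterior)
    return pmax > threshcall ? genotypes[imax] : missing
end

genotypes = ["0/0", "0/1", "1/1"]
confident = call_genotype(genotypes, [0.02, 0.03, 0.95], 0.9)   # maximum 0.95 > 0.9
ambiguous = call_genotype(genotypes, [0.45, 0.50, 0.05], 0.9)   # maximum 0.50 <= 0.9
println(confident, " ", ambiguous)
```

Raising threshcall trades fewer called genotypes for fewer wrong calls.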
MagicMap
— ModuleMagicMap
a package for genetic map construction in connected multiparental populations. Exports one function: magicmap.
MagicMap.magicmap
— Functionmagicmap(genofile, pedinfo; kwargs...)
genetic map construction from genofile and pedinfo.
Positional arguments
genofile::AbstractString
genotypic data file.
pedinfo::Union{MagicBase.JuncDist,AbstractString}
specifies pedigree information via a pedigree file, a string designcode, or a struct juncdist::JuncDist.
Keyword arguments
formatpriority::AbstractVector=["GT","AD"]
: the priority of genotype formats when parsing the input vcf genofile.
model::AbstractString="jointmodel"
: prior dependence of ancestral prior process along the two homologous chromosomes within an offspring. It must be "depmodel", "indepmodel", or "jointmodel".
likeparameters::LikeParameters=LikeParameters()
: specifies default genotyping error rates.
threshcall::Real = model == "depmodel" ? 0.95 : 0.9
: threshold for genotype calling. The filtering is based on called genotypes.
israndallele::Bool=true
: if true, genotyping error model follows the random allelic model, and otherwise the random genotypic model.
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise they are outbred.
snpthin::Integer = 1
: take every snpthin-th markers.
ispermmarker::Bool=true
: if true, permute the input marker ordering.
isdupebinning::Union{Nothing,Bool}=false
: if true, bin duplicate markers.
binshare::Real=0.5
: min fraction of shared genotypes between the representative marker and each of the other markers in a bin.
minlodsave::Union{Nothing, Real}=nothing
: results of pairwise analyses are saved only if the LOD score for LD or linkage > minlodsave. If it is nothing, minlodsave increases with number of markers.
minldsave::Union{Nothing, Real}=nothing
: results of pairwise LD analyses are saved only if the LD score (squared allelic correlation) > minldsave. If it is nothing, minldsave increases with number of markers.
ncluster::Union{Nothing, Integer}=nothing
: number of linkage groups. If it is nothing, ncluster will be inferred in the range of [minncluster, maxncluster]
minncluster::Integer = isnothing(ncluster) ? 1 : ncluster
: min number of linkage groups. If it is nothing, minncluster is set to 1 if ncluster = nothing and otherwise it is set to ncluster
maxncluster::Integer = isnothing(ncluster) ? 30 : ncluster
: max number of linkage groups. If it is nothing, maxncluster is set to 30 if ncluster = nothing and otherwise it is set to ncluster
clusteralg::Union{Nothing,AbstractString}=nothing
: clustering algorithm after spectral embedding.
minsilhouette::Real=0.0
: delete markers with silhouette scores < minsilhouette.
minlodcluster::Union{Nothing, Real} = nothing
: minimum linkage LOD threshold. If it is nothing, estimated internally as the minimum lod keeping the resulting graph connected, ignoring the connected components of size < mincomponentsize.
mincomponentsize::Union{Nothing,Integer} = nothing
: the markers in the graph connected components of size < mincomponentsize are removed. If it is nothing, it is internally set.
maxrf::Union{Nothing,Real} = nothing
: keep pairwise linkage analyses only if recombination fraction <= maxrf.
binrf::Union{Nothing,Real}=nothing
: if binrf >= 0, perform linkage-based marker binning such that the recombination fraction for two markers in a bin is always <= binrf; otherwise, do not perform linkage-based binning. If it is nothing and ncluster is not nothing, binrf is set to 0.001 if #markers > ncluster*2000 and to -1 otherwise. If it is nothing and ncluster is nothing, binrf is set to 0.001 if #markers > maxncluster*2000 and to -1 otherwise.
alwayskeep::Real=0.99
: a neighbor is always kept if its recombination fraction >= alwayskeep, regardless of knncluster or knnorder.
minlodcluster::Union{Nothing,Real} = nothing
: min LOD score for clustering. If it is nothing, it is internally set.
minlodorder::Union{Nothing,Real} = nothing
: min LOD score for ordering. If it is nothing, it is internally set.
maxminlodcluster::Union{Nothing,Real} = nothing
: if minlodcluster = nothing, minlodcluster is internally estimated with upper bound maxminlodcluster.
maxminlodorder::Union{Nothing,Real} = nothing
: if minlodorder = nothing, minlodorder is internally estimated with upper bound maxminlodorder.
knncluster::Union{Nothing,Function} = nothing
: an anonymous function knncluster(x) of #markers x. It returns #nearest neighbors for clustering. If it is nothing, knncluster = x->0.1*x.
knnorder::Union{Nothing,Function} = nothing
: an anonymous function knnorder(x) of #markers x in a linkage group. It returns #nearest neighbors for ordering. If it is nothing, knnorder = x->sqrt(x).
isparallel::Bool=true
: if true, multicore computing over chromosomes.
workdir::AbstractString=pwd()
: working directory for input and output files.
commentstring::AbstractString="##"
: rows that begin with commentstring will be ignored.
outstem::Union{Nothing,AbstractString}="outstem"
: stem of output filenames.
outext::AbstractString=".vcf.gz"
: extension of output file for imputed geno.
logfile::Union{Nothing,AbstractString,IO}= (isnothing(outstem) ? nothing : string(outstem,"_magicimpute.log"))
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
Example
julia> magicmap(genofile,pedinfo; ncluster=12)
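The knncluster and knnorder keywords accept anonymous functions of the number of markers. The sketch below evaluates the documented defaults for a few marker counts and defines one alternative. The alternative custom_knn is an arbitrary illustration, and how RABBIT rounds or bounds the returned values is not stated here, so raw values are printed.

```julia
# Defaults quoted in the keyword list above.
knncluster = x -> 0.1 * x      # x = total number of markers
knnorder   = x -> sqrt(x)      # x = number of markers in one linkage group

for nmarker in (100, 1_000, 10_000)
    println(nmarker, " markers: knncluster = ", knncluster(nmarker),
            ", knnorder = ", knnorder(nmarker))
end

# Illustrative user-supplied alternative: cap the neighbourhood at 50 markers.
custom_knn = x -> min(50, ceil(Int, 0.1 * x))
println(custom_knn(100), " ", custom_knn(10_000))
```

Such a function would be passed as magicmap(genofile, pedinfo; knncluster=custom_knn).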
MagicImpute
— ModuleMagicImpute
a package for genotype imputation in multiparental populations. Exports two functions: magicimpute and magicimpute!.
MagicImpute.magicimpute
— Functionmagicimpute(genofile, pedinfo;
formatpriority, isphysmap, recomrate, commentstring,kwargs...)
genotype imputation from genofile and pedinfo.
Positional arguments
genofile::AbstractString
genotypic data file.
pedinfo::Union{MagicBase.JuncDist,AbstractString}
specifies pedigree information via a pedigree file, a string designcode, or a struct juncdist::JuncDist.
Keyword arguments
See formmagicgeno for the arguments (formatpriority, isphysmap, recomrate, commentstring) that are used for forming magicgeno. Note that formatpriority=["AD","GT"] by default.
mapfile::Union{Nothing, AbstractString}=nothing
: if it is nothing, use the marker map in the input genofile; otherwise, reset the genetic marker map to that in mapfile. The mapfile can be in either VCF or CSV format. For VCF format, the genetic map is provided in the "INFO" column using keywords "LINKAGEGROUP" and "POSCM". For CSV format, it must contain at least five columns: "marker", "linkagegroup", "poscm", "physchrom", and "physposbp", where missing values are represented by "NA". If there exist columns "binno" and "represent", markers with the same "binno" are binned, with the representative being the marker whose "represent" value is non-zero. All other columns are ignored.
See magicimpute! for the other arguments.
Examples
julia> magicimpute("geno.vcf.gz","4ril_self3")
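The CSV layout described for mapfile can be illustrated by writing a minimal file with the five required columns and "NA" for missing values. The marker names, positions, and file location below are made up for illustration.

```julia
# Rows of a minimal CSV map file; the first row is the header.
rows = [
    ("marker", "linkagegroup", "poscm", "physchrom", "physposbp"),
    ("snp1", "1", "0.0",  "chr1", "10500"),
    ("snp2", "1", "1.25", "chr1", "NA"),      # unknown physical position
    ("snp3", "2", "0.0",  "NA",   "NA"),      # unknown physical chromosome and position
]

mapfile = joinpath(mktempdir(), "map.csv")
open(mapfile, "w") do io
    for row in rows
        println(io, join(row, ","))
    end
end

# Check that the header contains all required column names.
header = split(readline(mapfile), ",")
required = ["marker", "linkagegroup", "poscm", "physchrom", "physposbp"]
println(all(in(header), required))
```

The resulting path would then be supplied through the mapfile keyword, e.g. magicimpute("geno.vcf.gz","4ril_self3"; mapfile=mapfile).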
MagicImpute.magicimpute!
— Functionmagicimpute!(magicgeno::MagicGeno; kwargs...)
genotype imputation from magicgeno.
Keyword arguments
model::Union{AbstractString,AbstractVector}="jointmodel"
: prior dependence of ancestral prior process along the two homologous chromosomes within an offspring. If model is a string, it must be "depmodel", "indepmodel", or "jointmodel". If model is a vector, the first element specifies the model for founder imputation and the last element for offspring imputation.
likeparameters::LikeParameters=LikeParameters()
: parameters for genotypic data model. If isinfererror = true, parameters with values being nothing will be inferred.
threshlikeparameters::ThreshLikeParameters=ThreshLikeParameters()
: markers with inferred likeparameters values > threshlikeparameters values will be deleted.
priorlikeparameters::PriorLikeParameters=PriorLikeParameters()
: priors for likelihood parameters
israndallele::Bool=true
: if true, genotyping error model follows the random allelic model, and otherwise the random genotypic model.
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise they are outbred.
chrsubset::Union{Nothing,AbstractRange,AbstractVector}=nothing
: subset of chromosome indices. nothing
denotes all chromosomes. Delete chromosome indices that are out of range.
snpsubset::Union{Nothing,AbstractRange,AbstractVector}=nothing
: subset of marker indices within each chromosome. nothing
denotes all markers. Marker indices that are larger than the number of markers within the chromosome are deleted.
target::AbstractString = "all"
: target of imputation. target="all" imputes founders and offspring. target must be "all", "founder", or "offspring".
threshimpute::Real=0.9
: offspring genotypes are imputed if the maximum posterior probability > threshimpute.
byfounder::Integer=0
: alternately impute each block of founders. The founders are partitioned such that the size of each block <= byfounder. If byfounder=-1, impute all founders simultaneously. If byfounder=0, reset to the maximum subpopulation size, and the partition is based on the founders of each sub-population.
inputneighbor::Union{Nothing,AbstractDict}=nothing
: nearest neighbors for each marker, which is required for neighbor-based marker order refinement. If it is nothing and isordermarker = true, perform only random marker order refinement.
inputbinning::Union{Nothing,AbstractDict}=nothing
: a partition of markers into bins. If it is not nothing, first impute founders for representative markers of each bin and then impute founders for all markers.
isinfererror::Bool = true
: if true, infer marker specific likelihood parameters that have values of nothing in likeparameters.
tukeyfence::Real=3.0
: Tukey fence for detecting outlier error rates (including foundererror, offspringerror, seqerror, and allelebalancemean).
minoutlier::Real=0.05
: markers with outlier error rates are removed only if their error rates > minoutlier.
iscorrectfounder::Union{Nothing, Bool} = nothing
: if true, perform parental error correction.
phasealg::AbstractString="unphase"
: if phasealg=forwardbackward, the output diplotype probabilities (in format GP), corresponding to the phased genotypes 0|0, 0|1, 1|0, and 1|1, are calculated based on the forward-backward algorithm, and the output phased offspring genotypes (in format GT) are given by those with the largest diplotype probabilities if they are greater than threshcall. If phasealg=viterbi, the output diplotype probabilities (GP) are set to those of phasealg=forwardbackward, and the output phased genotypes (GT) are calculated based on the Viterbi algorithm. If phasealg=unphase, the output genotype probabilities (GP), corresponding to the unphased genotypes 0/0, 0/1, and 1/1, are calculated based on the forward-backward algorithm, and the output unphased genotypes (GT) are given by those with the largest genotype probabilities if they are greater than threshcall.
isdelmarker::Bool=true
: if true, perform marker deletion.
delsiglevel::Real = 0.01
: significance level for marker deletion
isordermarker::Bool=false
: if true, refine local marker ordering.
isspacemarker::Bool=false
: if true, estimate inter-marker distances.
trimcm::Real=20
: remove markers of each segment with distances to the flanking markers > trimcm. The number of markers of each segment must be less than 5% total number of markers.
skeletonsize::Union{Nothing,Integer} = nothing
: number of skeleton markers for piecewise re-scaling of inter-marker distances. If it is nothing, skeletonsize is set to the number of distinct positions in the genetic map before re-scaling.
slidewin_neighbor::Union{Nothing,Integer} = 200
: max sliding window size for neighbor-based marker order refinement.
slidewin::Union{Nothing,Integer} = nothing
: max sliding window size for random marker order refinement
binriffle::Union{Nothing,Integer} = nothing
: valid only in the case of marker binning. Skip magicimpute_founder after replacing representatives with binned markers if binriffle < 0. Keep magicimpute_founder for binned markers but without refining ordering if 0 <= binriffle <= 1. Keep magicimpute_founder for binned markers if binriffle >= 2, and if isordermarker = true, set random order refinement with slidewin = binriffle and ignore neighbor-based order refinement.
orderactions::AbstractVector = ["inverse","inverse00"]
: update actions for random marker order refinement. It must be a subset of ["inverse","inverse00", "inverse01","inverse10"].
orderactions_neighbor::AbstractVector = ["inverse","inverse01"]
: update actions for neighbor-based marker order refinement. It must be a subset of ["inverse","inverse00", "inverse01","inverse10"].
inittemperature::Real= isordermarker ? 2.0 : 0.0
: initial temperature of annealing algorithm for marker ordering.
coolrate::Real=0.7
: temperature is multiplied by coolrate after each iteration of the annealing algorithm.
minaccept::Real=0.15
: minimum accept rate for controlling the window size of ordering update.
isparallel::Bool=true
: if true, parallel multicore computing over chromosomes.
workdir::AbstractString=pwd()
: working directory for input and output files.
tempdirectory::AbstractString = tempdir()
: temporary directory for intermediate results.
outstem::Union{Nothing,AbstractString}="outstem"
: stem of output filenames.
outext::AbstractString=".vcf.gz"
: extension of output file for imputed geno.
logfile::Union{Nothing,AbstractString,IO}= (isnothing(outstem) ? nothing : string(outstem,"_magicimpute.log"))
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
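The tukeyfence and minoutlier keywords above combine two conditions for removing a marker. A minimal sketch of one plausible reading follows: a marker is flagged when its error rate lies above the upper Tukey fence Q3 + tukeyfence*IQR and also exceeds minoutlier. The helper outlier_markers and the error rates are hypothetical, and whether RABBIT applies the fence on this scale is an assumption.

```julia
using Statistics

# Hypothetical helper (not part of RABBIT): indices of markers whose error rate
# is above the upper Tukey fence and also above minoutlier.
function outlier_markers(errorrates::Vector{Float64}; tukeyfence::Real=3.0, minoutlier::Real=0.05)
    q1, q3 = quantile(errorrates, [0.25, 0.75])
    upper = q3 + tukeyfence * (q3 - q1)
    return findall(r -> r > upper && r > minoutlier, errorrates)
end

# Made-up per-marker error rates: marker 6 is extreme; marker 9 is above the
# fence but below minoutlier, so it is not flagged.
errorrates = [0.004, 0.006, 0.005, 0.007, 0.005, 0.30, 0.006, 0.004, 0.02]
flagged = outlier_markers(errorrates)
println(flagged)
```

The minoutlier floor prevents markers from being removed merely because all other error rates are very small.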
Examples
julia> magicgeno = formmagicgeno("geno.vcf.gz","ped.csv")
julia> magicimpute!(magicgeno)
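The inittemperature and coolrate keywords describe a geometric cooling schedule for marker ordering. The sketch below lists the first few temperatures for the documented defaults under isordermarker = true. The acceptance function is the standard Metropolis rule and is included as an assumption about how a temperature is typically used; it is not taken from RABBIT.

```julia
# Geometric cooling: the temperature is multiplied by coolrate after each iteration.
function temperatures(inittemperature::Real, coolrate::Real, niter::Integer)
    return [inittemperature * coolrate^k for k in 0:niter-1]
end

schedule = temperatures(2.0, 0.7, 5)
println(schedule)

# Standard Metropolis acceptance probability (assumed, not RABBIT's documented rule)
# for a proposal that changes the log-likelihood by dlogl at temperature temp.
accept_prob(dlogl::Real, temp::Real) =
    dlogl >= 0 ? 1.0 : (temp > 0 ? exp(dlogl / temp) : 0.0)

println(accept_prob(-1.0, schedule[1]), " ", accept_prob(-1.0, schedule[end]))
```

At temperature 0 (the default when isordermarker = false) only proposals that do not decrease the log-likelihood are accepted under this rule.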
MagicReconstruct
— ModuleMagicReconstruct
a package for haplotype reconstruction in connected multiparental populations. Exports two functions: magicreconstruct and magicreconstruct!.
MagicReconstruct.magicreconstruct
— Functionmagicreconstruct(genofile, pedinfo;
formatpriority, isphysmap, recomrate, commentstring,kwargs...)
haplotype reconstruction from genofile and pedinfo.
Positional arguments
genofile::AbstractString
genotypic data file.
pedinfo::Union{MagicBase.JuncDist,AbstractString}
specifies pedigree information via a pedigree file, a string designcode, or a struct juncdist::JuncDist.
Keyword arguments
See formmagicgeno for the arguments (formatpriority, isphysmap, recomrate, commentstring) that are used for forming magicgeno, except that formatpriority=["GP", "AD", "GT"] by default.
See magicreconstruct! for the other arguments.
Examples
julia> magicreconstruct("geno.vcf.gz","4ril_self3")
MagicReconstruct.magicreconstruct!
— Functionmagicreconstruct!(magicgeno::MagicGeno; kwargs...)
haplotype reconstruction from magicgeno.
Keyword arguments
model::AbstractString="jointmodel"
: prior dependence of ancestral prior process along the two homologous chromosomes within an offspring. It must be "depmodel", "indepmodel", or "jointmodel".
israndallele::Bool=true
: if true, genotyping error model follows the random allelic model, and otherwise the random genotypic model.
isfounderinbred::Bool=true
: if true, founders are inbred, and otherwise they are outbred.
chrsubset::Union{Nothing,AbstractRange,AbstractVector}=nothing
: subset of chromosome indices. nothing
denotes all chromosomes. Delete chromosome indices that are out of range.
snpsubset::Union{Nothing,AbstractRange,AbstractVector}=nothing
: subset of marker indices within each chromosome. nothing
denotes all markers. Marker indices that are larger than the number of markers within the chromosome are deleted.
hmmalg::AbstractString="forwardbackward"
: HMM algorithm for haplotype reconstruction, and it must be either "forwardbackward" or "viterbi".
isignorephase::Bool=false
: if true, the phases of offspring genotypes are ignored.
isMMA::Bool=true
: if true, the Mathematica version of RABBIT.
nplot_subpop::Integer=10
: plots for up to nplot_subpop offspring in each subpopulation.
isparallel::Bool=true
: if true, multicore computing over chromosomes.
workdir::AbstractString=pwd()
: working directory for input and output files.
tempdirectory::AbstractString = tempdir()
: temporary directory for intermediate results.
outstem::Union{Nothing,AbstractString}="outstem"
: stem of output filenames.
outext::AbstractString=".csv.gz"
: extension of output file for imputed geno.
logfile::Union{Nothing,AbstractString,IO}= (isnothing(outstem) ? nothing : string(outstem,"_magicimpute.log"))
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
Examples
julia> magicgeno = formmagicgeno("geno.vcf.gz","ped.csv")
julia> magicreconstruct!(magicgeno; model="jointmodel")
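For readers unfamiliar with the "forwardbackward" option of hmmalg, the following toy example computes posterior state probabilities in a two-state hidden Markov model. It is a generic textbook sketch: the function, the transition matrix, and the emission values are illustrative assumptions and do not reproduce RABBIT's ancestral-origin model.

```julia
# Toy forward-backward algorithm.
# init[s]     = prior probability of state s at the first site
# trans[i, j] = P(next state = j | current state = i)
# emis[s, t]  = P(observation at site t | state s)
function forwardbackward(init::Vector{Float64}, trans::Matrix{Float64}, emis::Matrix{Float64})
    nstate, nsite = size(emis)
    fwd = zeros(nstate, nsite)
    bwd = ones(nstate, nsite)
    fwd[:, 1] = init .* emis[:, 1]
    fwd[:, 1] ./= sum(fwd[:, 1])
    for t in 2:nsite
        fwd[:, t] = (trans' * fwd[:, t-1]) .* emis[:, t]
        fwd[:, t] ./= sum(fwd[:, t])        # rescale each column to avoid underflow
    end
    for t in nsite-1:-1:1
        bwd[:, t] = trans * (emis[:, t+1] .* bwd[:, t+1])
        bwd[:, t] ./= sum(bwd[:, t])
    end
    post = fwd .* bwd
    return post ./ sum(post, dims=1)        # posterior probabilities; columns sum to 1
end

init  = [0.5, 0.5]
trans = [0.9 0.1; 0.1 0.9]
emis  = [0.9 0.9 0.1; 0.1 0.1 0.9]          # three sites; the last favours state 2
post = forwardbackward(init, trans, emis)
println(round.(post, digits=3))
```

The Viterbi alternative would instead return the single most probable state path rather than per-site posterior probabilities.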
MagicScan
— ModuleMagicScan
a package for genomic QTL scan in connected multiparental populations. Exports one function: magicscan.
MagicScan.magicscan
— Functionmagicscan(ancestryfile, phenofile; commentstring="##",kwargs...)
performs a genomic scan for QTL in multiparental populations. See magicreconstruct for generating ancestryfile by haplotype reconstruction.
Positional arguments
ancestryfile::AbstractString
specifies the ancestry file resulting from magicreconstruct.
phenofile::AbstractString
specifies the phenotypic data file.
Keyword arguments
equation::Union{Nothing,FormulaTerm} = nothing
specifies the linear model equation. By default, StatsModels.@formula(y ~ 1), where y is the last column name in phenofile.
thresholds::Union{Nothing,AbstractVector} = nothing
: list of thresholds used in plotting scanning profile.
nperm::Integer=200
: number of permutations of phenotypes for calculating thresholds that are not specified.
siglevels::AbstractVector = [0.05]
: significance levels for calculating thresholds by permutations.
islog10p::Bool=true
: if true, the profile refers to -log10P, and LOD otherwise.
missingstring=["NA","missing"]
: strings that denote a missing phenotypic value.
commentstring::AbstractString
specifies that lines beginning with commentstring are ignored in genofile or pedfile given by pedinfo.
workdir::AbstractString=pwd()
: working directory for input and output files.
outstem::Union{Nothing,AbstractString}="outstem"
specifies the stem of filename saving magicancestry. See MagicBase.savemagicancestry
for the description of outputfile "outstem_magicancestry.csv.gz".
logfile::Union{Nothing,AbstractString,IO}= (isnothing(outstem) ? nothing : string(outstem,"_magicreconstruct.log"))
: log file or IO for writing log. If it is nothing, no log file.
verbose::Bool=true
: if true, print details on the stdout.
Examples
julia> magicscan("magicancestry.csv.gz","pheno.csv")
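The nperm and siglevels keywords refer to permutation-based thresholds. A minimal sketch of the general technique follows: the phenotype is shuffled nperm times, the genome-wide maximum of a scan statistic is recorded for each shuffle, and the threshold at significance level a is the (1 - a) quantile of those maxima. The statistic used here (largest absolute correlation) and the simulated data are stand-ins chosen for brevity; they are assumptions and are not RABBIT's -log10P or LOD profile.

```julia
using Random, Statistics

# Hypothetical scan statistic (not RABBIT's): largest absolute correlation
# between the phenotype and any column of a predictor matrix.
scanmax(y, x) = maximum(abs(cor(y, col)) for col in eachcol(x))

function permutation_thresholds(rng::AbstractRNG, y::Vector{Float64}, x::Matrix{Float64};
                                nperm::Integer=200, siglevels::AbstractVector=[0.05])
    # Null distribution of the genome-wide maximum under shuffled phenotypes.
    nullmax = [scanmax(shuffle(rng, y), x) for _ in 1:nperm]
    return [quantile(nullmax, 1 - level) for level in siglevels]
end

rng = MersenneTwister(1)
x = randn(rng, 100, 20)           # 100 offspring, 20 loci (simulated predictors)
y = randn(rng, 100)               # phenotype simulated without any QTL
thresholds = permutation_thresholds(rng, y, x; siglevels=[0.05, 0.01])
println(thresholds)
```

A smaller significance level gives a threshold at least as high, and a larger nperm gives a more stable estimate at the cost of more computation.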