Prepare genofile
The first required input for RABBIT is a genofile, which must be a vcf file with extension ".vcf" or ".vcf.gz". The version of vcf file format must be v4.1 or above.
Include genetic map
A vcf file contains the physical map in the first three columns: #CHROM, POS, and ID. We can provide the genetic map by adding the keywords LINKAGEGROUP and POSCM in the INFO column, assuming the linkage group is #CHROM.
##fileformat=VCFv4.3
##INFO=<ID=LINKAGEGROUP,Number=1,Type=String,Description="Linkage group in genetic map">
##INFO=<ID=POSCM,Number=1,Type=Float,Description="Genetic marker position in centiMorgan">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT P1 P2 Offspring1
chr1 34522 snp1 N . . . LINKAGEGROUP=LG1;POSCM=0.0 GT 0/0 1/1 ./1
chr1 54347 snp2 N . . . LINKAGEGROUP=LG1;POSCM=1.3 GT 1/1 0/0 0/1
chr1 66710 snp3 N . . . LINKAGEGROUP=LG1;POSCM=1.5 GT 0/0 ./. 0/0
See MagicMap.magicmap for constructing the genetic map. See MagicBase.resetmap for resetting the marker map of a vcf file.
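As a rough illustration of the INFO layout described above, the following Python sketch appends LINKAGEGROUP and POSCM to the INFO column of a single data line. It is not part of RABBIT: the helper name `annotate_info` is hypothetical, and the sketch assumes a tab-separated data line with INFO in the eighth column, as in standard vcf files.

```python
def annotate_info(vcf_line: str, linkage_group: str, pos_cm: float) -> str:
    """Return the data line with LINKAGEGROUP and POSCM added to INFO (column 8)."""
    fields = vcf_line.rstrip("\n").split("\t")
    extra = f"LINKAGEGROUP={linkage_group};POSCM={pos_cm}"
    info = fields[7]
    # An empty INFO column is written as "." in vcf; replace it instead of appending.
    fields[7] = extra if info in (".", "") else f"{info};{extra}"
    return "\t".join(fields)


# Hypothetical data line for snp2 with an empty INFO column.
line = "\t".join(
    ["chr1", "54347", "snp2", "N", ".", ".", ".", ".", "GT", "1/1", "0/0", "0/1"]
)
print(annotate_info(line, "LG1", 1.3))
```

For whole files, MagicBase.resetmap remains the documented route; the sketch only shows what the resulting INFO entry looks like.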
Inferred error rates
Marker-specific error rates can be inferred during genotype imputation. These estimates are saved in the INFO column using additional keywords. Denote by sequence allelic balance the expected ratio of the number of reads for the reference allele to the number of reads for the alternative allele. We only consider the heterozygous allelic balance, which is 1/2 if there is no allelic balance bias. For example, the INFO for snp2 could be "LINKAGEGROUP=LG1;POSCM=1.3;FOUNDERERROR=0.005;OFFSPRINGERROR=0.012;SEQERROR=0.0008;ALLELEBALANCEMEAN=0.48;ALLELEBALANCEDISPERSE=0.1;ALLELEDROPOUT=0.0".
FOUNDERERROR: allelic error rate in founders.
OFFSPRINGERROR: allelic error rate in offspring.
SEQERROR: sequence-based error rate.
ALLELEBALANCEMEAN: mean sequence allelic balance among offspring at a marker.
ALLELEBALANCEDISPERSE: overdispersion for the distribution of sequence allelic balance among offspring.
ALLELEDROPOUT: 0- and 1-inflation for the distribution of sequence allelic balance among offspring.
These keywords are also explained in the beginning comment lines of the vcf file.
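To inspect these estimates outside RABBIT, the INFO string can be split into key/value pairs. The Python sketch below is only an illustration under that assumption: `parse_info` is a hypothetical helper, not a RABBIT function, and it simply converts values to floats where possible.

```python
def parse_info(info: str) -> dict:
    """Split a vcf INFO string into a dict; numeric values become floats."""
    result = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue  # skip empty entries and the "." placeholder
        key, sep, value = item.partition("=")
        if not sep:
            result[key] = True  # flag-style keyword without a value
            continue
        try:
            result[key] = float(value)
        except ValueError:
            result[key] = value  # keep non-numeric values such as "LG1" as strings
    return result


# INFO string for snp2, taken from the example in the text above.
info = (
    "LINKAGEGROUP=LG1;POSCM=1.3;FOUNDERERROR=0.005;OFFSPRINGERROR=0.012;"
    "SEQERROR=0.0008;ALLELEBALANCEMEAN=0.48;ALLELEBALANCEDISPERSE=0.1;"
    "ALLELEDROPOUT=0.0"
)
parsed = parse_info(info)
print(parsed["FOUNDERERROR"], parsed["OFFSPRINGERROR"])
```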