S4 MagicMap

MagicMap

magicmap performs the following steps in sequence:

  1. Binning duplicate markers. By default, isbinning = true only if the number of markers > 10^4. magicmap will skip this step if the outfile outstem*"_magicmap_binning.csv.gz" already exists.
  2. Pairwise linkage disequilibrium analysis. magicmap will skip this step if the outfile outstem*"_magicmap_magicld.csv.gz" already exists.
  3. Pairwise linkage analysis. magicmap will skip this step if the outfile outstem*"_magicmap_magiclinkage.csv.gz" already exists.
  4. Construct genetic map.
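
The skip rule for the first three steps can be previewed before a run. The following is a minimal sketch, not part of MagicMap: `cached_steps` is a hypothetical helper that only checks for the three outfile names listed above, assuming they are written to `workdir`.

```julia
# Hypothetical helper (not part of MagicMap): list which of the first three
# steps already have an outfile in `workdir`, and so would be skipped.
function cached_steps(outstem::AbstractString; workdir::AbstractString = pwd())
    suffixes = [
        "binning" => "_magicmap_binning.csv.gz",
        "magicld" => "_magicmap_magicld.csv.gz",
        "magiclinkage" => "_magicmap_magiclinkage.csv.gz",
    ]
    # Keep the step names whose outfile exists
    return [step for (step, suffix) in suffixes if isfile(joinpath(workdir, outstem * suffix))]
end

# Example: cached_steps("example") returns the names of the steps that would be skipped.
```
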
Note

It is recommended to rerun magicmap with different option values for map construction, such as the clustering threshold minlodcluster. If a rerun uses the same workdir and outstem, magicmap automatically skips each of the time-consuming first three steps whose outfile already exists. To force a step to be recomputed, manually delete its output file.
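
Deleting the cached outfiles can also be scripted. The sketch below uses only Julia's Base file functions; `clear_cached_steps` is a hypothetical helper name, and it assumes the outfiles sit in `workdir` under the names listed above.

```julia
# Hypothetical helper (not part of MagicMap): delete cached outfiles so that
# the chosen steps are recomputed on the next magicmap run.
function clear_cached_steps(outstem::AbstractString;
        workdir::AbstractString = pwd(),
        steps = ("binning", "magicld", "magiclinkage"))
    removed = String[]
    for step in steps
        file = joinpath(workdir, outstem * "_magicmap_" * step * ".csv.gz")
        if isfile(file)
            rm(file)
            push!(removed, file)
        end
    end
    return removed   # paths of the files that were deleted
end

# Example: force only the pairwise linkage analysis to be redone.
# clear_cached_steps("example"; steps = ("magiclinkage",))
```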

# code for Julia
using MagicMap
cd(@__DIR__)
# outstem must be defined before it is used to build the file names
outstem = "example"
genofile = outstem*"_magiccall_geno.vcf.gz"
pedfile = outstem*"_magicfilter_ped.csv"
magicmap(genofile,pedfile;
    minncluster = 2,
    maxncluster = 10,
    outstem
)
# code for Linux shell.
# For Windows CMD, replace the line-continuation character \ by ^, and the comment character # by ::
julia rabbit_magicmap.jl -g example_magiccall_geno.vcf.gz \
    -p example_magicfilter_ped.csv \
    --minncluster 2 \
    --maxncluster 10 \
    --nworker 5 \
    -o example

Output files

outfile                                             Description
outstem*"_magicmap.log"                             log file
outstem*"_magicmap_binning.csv.gz"                  results of binning duplicate markers
outstem*"_magicmap_magicld.log"                     log file for pairwise LD analysis
outstem*"_magicmap_magicld.csv.gz"                  results of pairwise LD analysis
outstem*"_magicmap_magiclinkage.log"                log file for pairwise linkage analysis
outstem*"_magicmap_magiclinkage.csv.gz"             results of pairwise linkage analysis
outstem*"_magicmap_construct_eigen.png"             plot of eigenvalues from spectral clustering
outstem*"_magicmap_construct_silhouette.csv"        silhouette scores for each ncluster
outstem*"_magicmap_construct_silhouette.png"        plot of silhouette scores for marker grouping
outstem*"_magicmap_construct_LD_heatmap.png"        heatmap of pairwise LD
outstem*"_magicmap_construct_linkage_heatmap.png"   heatmap of pairwise linkage
outstem*"_magicmap_construct_map.csv.gz"            constructed mapfile for downstream analysis
outstem*"_magicmap_construct_compare_inputmap.png"  comparison with the input map in genofile (if it exists)

Output: LD & linkage heatmap

outstem*"_magicmap_construct_LD_heatmap.png" gives the heatmap for the pairwise LD analyses, where LD is measured as the squared allelic correlation.
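
As a worked illustration of the squared allelic correlation, the sketch below computes it for two biallelic markers. It assumes each marker is given as a vector of 0/1 allele codes, one entry per haplotype; how MagicMap derives the correlation from the genotype data of a MAGIC population is not described here, and `ld_r2` is a hypothetical name.

```julia
using Statistics

# Squared allelic correlation (r^2) between two biallelic markers.
# Assumption: each vector holds 0/1 allele codes, one entry per haplotype.
ld_r2(a::AbstractVector{<:Real}, b::AbstractVector{<:Real}) = cor(a, b)^2

# Made-up example data: the two markers agree on 6 of 8 haplotypes.
marker1 = [0, 0, 1, 1, 0, 1, 1, 0]
marker2 = [0, 0, 1, 1, 0, 1, 0, 1]
r2 = ld_r2(marker1, marker2)
```

For these made-up vectors the allelic correlation is 0.5, so the squared correlation is about 0.25; identical markers give 1.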

outstem*"_magicmap_construct_linkage_heatmap.png" gives the heatmap for the pairwise linkage analyses, where each matrix element is 1 minus the scaled recombination fraction.
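
The mapping from recombination fraction to heatmap value might look like the sketch below. The scaling is not specified here, so the code assumes that "scaled" means dividing the recombination fraction by 0.5, its value for unlinked markers; `linkage_similarity` is a hypothetical name and the matrix is made up.

```julia
# ASSUMPTION (not confirmed by the source): the recombination fraction r is
# scaled by dividing by 0.5, its value for unlinked markers.
linkage_similarity(r::Real) = 1 - clamp(r / 0.5, 0, 1)

# Made-up pairwise recombination fractions for three markers
recomfrac = [0.0 0.1 0.5;
             0.1 0.0 0.4;
             0.5 0.4 0.0]
similarity = linkage_similarity.(recomfrac)
```

Under this assumption, completely linked markers map to 1 and unlinked markers map to 0.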

Output: clustering

outstem*"_magicmap_construct_eigen.png" plots the eigenvalues from spectral clustering, sorted in increasing order. The number of calculated eigenvalues is ncluster + 1 if ncluster is not nothing, and maxncluster + 1 otherwise. A large gap between the ncluster-th and (ncluster + 1)-th eigenvalues indicates a potentially good clustering.
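
To make the reading of the eigenvalue plot concrete, here is a small sketch of the eigengap heuristic. It is an illustration only and not necessarily how MagicMap selects the number of linkage groups; `neigen` and `largest_eigengap` are hypothetical names and the eigenvalues are made up.

```julia
# Number of eigenvalues calculated, following the rule stated above.
neigen(ncluster, maxncluster) = (isnothing(ncluster) ? maxncluster : ncluster) + 1

# Eigengap heuristic: among the candidate k, return the one with the largest
# gap between the k-th and (k + 1)-th eigenvalues.
function largest_eigengap(eigenvalues::AbstractVector{<:Real},
        minncluster::Integer, maxncluster::Integer)
    @assert issorted(eigenvalues) "eigenvalues must be in increasing order"
    @assert 1 <= minncluster <= maxncluster < length(eigenvalues)
    candidates = minncluster:maxncluster
    gaps = [eigenvalues[k + 1] - eigenvalues[k] for k in candidates]
    return candidates[argmax(gaps)]
end

eigenvalues = [0.0, 0.01, 0.02, 0.03, 0.45, 0.5]   # made-up values
best = largest_eigengap(eigenvalues, 2, 5)
```

In this example the largest gap lies between the 4th and 5th eigenvalues, so the heuristic suggests 4 clusters.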

outstem*"_magicmap_construct_silhouette.png" plots silhouette scores. The silhouette score measures the closeness between a marker and its linkage group. It ranges from -1 to 1, where a high value indicates that the marker is well matched to its linkage group.
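
The standard silhouette definition can be sketched as follows. The distance MagicMap uses between markers is not stated here, so the function takes a generic symmetric distance matrix; `silhouette` is a hypothetical name and the data are made up.

```julia
using Statistics

# Silhouette score of item i, given a symmetric distance matrix and group labels:
#   a = mean distance from i to the other members of its own group,
#   b = smallest mean distance from i to the members of any other group,
#   score = (b - a) / max(a, b), which lies between -1 and 1.
function silhouette(i::Integer, dist::AbstractMatrix{<:Real}, labels::AbstractVector{<:Integer})
    own = [j for j in eachindex(labels) if labels[j] == labels[i] && j != i]
    isempty(own) && return 0.0   # common convention for a single-member group
    others = unique(l for l in labels if l != labels[i])
    isempty(others) && throw(ArgumentError("at least two groups are required"))
    a = mean(dist[i, own])
    b = minimum(mean(dist[i, labels .== l]) for l in others)
    return (b - a) / max(a, b)
end

# Made-up distances: markers 1-2 are close, and markers 3-4 are close.
dist = [0.0 0.1 0.9 0.8;
        0.1 0.0 0.8 0.9;
        0.9 0.8 0.0 0.1;
        0.8 0.9 0.1 0.0]
good = silhouette(1, dist, [1, 1, 2, 2])   # marker 1 grouped with its neighbour
bad  = silhouette(2, dist, [1, 2, 2, 2])   # marker 2 grouped away from marker 1
```

With these distances the well-grouped marker scores close to 1, while the misassigned marker scores close to -1.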

Output: map comparison

outstem*"_magicmap_construct_compare_inputmap.png" compares the constructed genetic map with the marker map in the input genofile, if one is present.