Skip to contents

Forward selection of traits or environmental variables using dc-CA.

Usage

# S3 method for class 'dcca'
FS(
  mod,
  ...,
  select = c("traits", "env"),
  consider,
  permutations = 999,
  n_axes = "all",
  initial_model = "1",
  factor2categories = TRUE,
  test = TRUE,
  threshold_P = 0.1,
  PvalAdjustMethod = "holm",
  max_step = 10,
  verbose = FALSE
)

Arguments

mod

dc-CA model.

...

unused.

select

Character, the default "traits" selects trait variables based on the environmental model of mod. Set select = "e" to select environmental variables based on the trait model of mod.

consider

character vector of names considered for addition, either trait names in mod$data$dataTraits or environmental variable names in in mod$data$dataEnv. The names may include transformations of predictor variables, such as log(.), if consider does not include factors or if factor2categories=FALSE. If consider includes factors, such transformations give in a error in the default setting (factor2categories=TRUE).

permutations

a list of control values for the permutations as returned by the function how, or the number of permutations required (default 999), or a permutation matrix where each row gives the permuted indices.

n_axes

number of eigenvalues to select upon. The sum of n_axes eigenvalues is taken as criterion. Default "full" for selection without dimension reduction to n_axes. If n_axes =1, selection is on the first eigenvalue for selection of variables that form an optimal one-dimensional model.

initial_model

character specifying what should be inside Condition(). Default: "1" (nothing, the intercept only). Examples: With selection of environmental varialbles, "region" for a within-region analysis or "A*B" for a within analysis specified by the interaction of factors A and B, with region, A, B in the data. Analogously, a trait or trait factor can be the initial model.

factor2categories

logical, default TRUE, to convert factors to their categories (set(s) of indicator values). If FALSE, the selection uses, the fit of a factor divided by its number of categories minus 1.

test

logical; default: TRUE.

threshold_P

significance level, after adjustment for testing multiplicity, for addition of a variable to the model.

PvalAdjustMethod

method for correction for multiple testing in p.adjust, default "holm", which is an improved version Bonferroni.

max_step

maximal number of variables selected.

verbose

show progress, default: TRUE.

Value

list with three elements: final... with selected variables, model_final, and process with account of the selection process

Details

The selection is on the basis of the additional fit (inertia) of a variable given the variables already in the model.

The function FS does not implement the max test and may thus be liberal. It is recommended to test the final model (element model_ final of the returned object) by anova.

See also

Examples

data("dune_trait_env")

# rownames are carried forward in results
rownames(dune_trait_env$comm) <- dune_trait_env$comm$Sites
abun <- dune_trait_env$comm[, -1] # must delete "Sites"

mod <- dc_CA(formulaEnv = abun ~ Moist + Mag,
             formulaTraits = ~ F + R + N + L,
             dataEnv = dune_trait_env$envir,
             dataTraits = dune_trait_env$traits,
             verbose = FALSE)

# selection of traits with environmental model of mod (~ Moist+Mag)
out1 <- FS(mod, consider = c("F", "R", "N", "L"), 
           select = "traits", verbose = FALSE) 

names(out1)
#> [1] "finalWithOneExtra" "model_final"       "formula"          
#> [4] "process"          
out1$finalWithOneExtra
#>   df   variance  pval pvaladj
#> F  1 0.28633635 0.001   0.004
#> N  1 0.24243384 0.001   0.003
#> L  1 0.03114819 0.126   0.252
out1$model_final
#> Step 1: the CCA ordination of the transposed matrix with trait constraints,
#>         useful in itself and also yielding CWMs of the orthonormalized traits for step 2.
#> 
#> Call: cca0(formula = formulaTraits, response = tY, data =
#> out0$data$dataTraits)
#> 
#>               Inertia Proportion Rank
#> Total          2.3490     1.0000     
#> Constrained    0.7657     0.3260    2
#> Unconstrained  1.5833     0.6740   19
#> 
#> Inertia is scaled Chi-square
#> 
#> Eigenvalues for constrained axes:
#>   CCA1   CCA2 
#> 0.4576 0.3081 
#> 
#> Eigenvalues for unconstrained axes:
#> CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8 
#>  NA  NA  NA  NA  NA  NA  NA  NA 
#> (Showing 8 of 19 unconstrained eigenvalues)
#> 
#> mean, sd, VIF and canonical coefficients with their optimistic [!] t-values:
#>      Avg    SDS    VIF   Regr1   tval1
#> F 6.1126 1.6685 1.0575  0.6413  8.7440
#> N 5.1674 1.6303 1.0575 -0.1125 -1.5334
#> 
#> Step 2: the RDA ordination of CWMs of the orthonormalized traits 
#>         of step 1 with environmental constraints:
#> 
#> Call: rda(formula = out1$CWMs_orthonormal_traits ~ Moist + Mag, data =
#> out1$data$dataEnv)
#> 
#>               Inertia Proportion Rank
#> Total          0.7657     1.0000     
#> Constrained    0.5288     0.6906    2
#> Unconstrained  0.2370     0.3094    2
#> 
#> Inertia is variance
#> 
#> Eigenvalues for constrained axes:
#>    RDA1    RDA2 
#> 0.31310 0.21567 
#> 
#> Eigenvalues for unconstrained axes:
#>     PC1     PC2 
#> 0.17114 0.06581 
#> 
#> mean, sd, VIF and canonical coefficients with their optimistic [!] t-values:
#>        Avg    SDS    VIF  Regr1  tval1
#> Moist 2.90 1.7578 1.3001 0.4769 4.6192
#> MagBF 0.15 0.3571 1.4553 0.0075 0.0684
#> MagHF 0.25 0.4330 1.4214 0.0527 0.4879
#> MagNM 0.30 0.4583 1.4393 0.1948 1.7933
#>      Avg    SDS    VIF  Regr1   tval1
#> F 6.1126 1.6685 1.0575  0.726  4.9779
#> N 5.1674 1.6303 1.0575 -0.539 -3.6956
#> 
#>                weighted variance
#> total                      2.349
#> traits_explain             0.766
#> env_explain                0.967
#> constraintsTE              0.529
#> attr(,"meaning")
#>                meaning                                                                
#> total          "total inertia (= weighted variation)"                                 
#> traits_explain "trait-constrained variation"                                          
#> env_explain    "environment-constrained variation"                                    
#> constraintsTE  "trait-constrained variation explained by the predictors in formulaEnv"

# selection of environmental variables with trait model of mod (~ F + R + N + L)
out2 <- FS(mod, consider =  c("A1", "Moist", "Mag", "Use", "Manure"), 
           select= "env", verbose = FALSE) 

names(out2)
#> [1] "finalWithOneExtra" "model_final"       "formula"          
#> [4] "process"          
out2$finalWithOneExtra
#>        df   variance  pval pvaladj
#> Moist   1 0.33061402 0.001   0.008
#> Manure  1 0.20050625 0.001   0.007
#> A1      1 0.04503044 0.089   0.534
out2$model_final
#> Step 1: the CCA ordination of the transposed matrix with trait constraints,
#>         useful in itself and also yielding CWMs of the orthonormalized traits for step 2.
#> 
#> Call: cca0(formula = formulaTraits, response = tY, data =
#> out0$data$dataTraits)
#> 
#>               Inertia Proportion Rank
#> Total          2.3490     1.0000     
#> Constrained    0.8701     0.3704    4
#> Unconstrained  1.4789     0.6296   19
#> 
#> Inertia is scaled Chi-square
#> 
#> Eigenvalues for constrained axes:
#>   CCA1   CCA2   CCA3   CCA4 
#> 0.4771 0.3122 0.0580 0.0228 
#> 
#> Eigenvalues for unconstrained axes:
#> CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8 
#>  NA  NA  NA  NA  NA  NA  NA  NA 
#> (Showing 8 of 19 unconstrained eigenvalues)
#> 
#> mean, sd, VIF and canonical coefficients with their optimistic [!] t-values:
#>      Avg    SDS    VIF   Regr1   tval1
#> F 6.1126 1.6685 1.1323  0.6048  8.2493
#> R 5.8061 1.1894 1.7564  0.0170  0.1865
#> N 5.1674 1.6303 1.7822 -0.1358 -1.4764
#> L 7.3839 0.8774 1.1378  0.1424  1.9382
#> 
#> Step 2: the RDA ordination of CWMs of the orthonormalized traits 
#>         of step 1 with environmental constraints:
#> 
#> Call: rda(formula = out1$CWMs_orthonormal_traits ~ Moist + Manure +
#> Condition(1), data = out1$data$dataEnv)
#> 
#>               Inertia Proportion Rank
#> Total          0.8701     1.0000     
#> Constrained    0.5311     0.6104    2
#> Unconstrained  0.3389     0.3896    4
#> 
#> Inertia is variance
#> 
#> Eigenvalues for constrained axes:
#>   RDA1   RDA2 
#> 0.3331 0.1980 
#> 
#> Eigenvalues for unconstrained axes:
#>     PC1     PC2     PC3     PC4 
#> 0.18449 0.10693 0.02887 0.01864 
#> 
#> mean, sd, VIF and canonical coefficients with their optimistic [!] t-values:
#>         Avg    SDS    VIF   Regr1   tval1
#> Moist  2.90 1.7578 1.0645  0.5519  6.2344
#> Manure 1.75 1.4448 1.0645 -0.0809 -0.9142
#>      Avg    SDS    VIF   Regr1   tval1
#> F 6.1126 1.6685 1.1323  0.7661  5.4523
#> R 5.8061 1.1894 1.7564 -0.1975 -1.1287
#> N 5.1674 1.6303 1.7822 -0.2553 -1.4483
#> L 7.3839 0.8774 1.1378  0.3167  2.2482
#> 
#>                weighted variance
#> total                      2.349
#> traits_explain             0.870
#> env_explain                0.751
#> constraintsTE              0.531
#> attr(,"meaning")
#>                meaning                                                                
#> total          "total inertia (= weighted variation)"                                 
#> traits_explain "trait-constrained variation"                                          
#> env_explain    "environment-constrained variation"                                    
#> constraintsTE  "trait-constrained variation explained by the predictors in formulaEnv"

# selection of environmental variables without a trait model 
# i.e. with a single constraint
mod3 <- cca0(mod$data$Y ~ Moist, data = mod$data$dataEnv)
out3 <- FS(mod3, consider = c("A1", "Moist", "Mag", "Use", "Manure"), 
           threshold_P = 0.05)

out3$finalWithOneExtra
#>       df  variance  pval pvaladj
#> Moist  1 0.4672020 0.001   0.008
#> MagNM  1 0.3088722 0.001   0.007
#> A1     1 0.1841344 0.029   0.174
out3$model_final
#> 
#> Call: FUN(formula = formula2, data = mydata, traceonly = FALSE, cca_object
#> = mod)
#> 
#>               Inertia Proportion Rank
#> Total          2.3490     1.0000     
#> Constrained    0.7761     0.3304    2
#> Unconstrained  1.5729     0.6696   19
#> 
#> Inertia is scaled Chi-square
#> 
#> Eigenvalues for constrained axes:
#>   CCA1   CCA2 
#> 0.4803 0.2958 
#> 
#> Eigenvalues for unconstrained axes:
#> CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8 
#>  NA  NA  NA  NA  NA  NA  NA  NA 
#> (Showing 8 of 19 unconstrained eigenvalues)
#> 
#> mean, sd, VIF and canonical coefficients with their optimistic [!] t-values:
#>       Avg    SDS    VIF   Regr1   tval1
#> Moist 2.9 0.3931 1.1374 -0.5995 -9.0496
#> MagNM 0.3 0.1025 1.1374 -0.1970 -2.9746
#> 

# selection of traits without an environmental model 
#                         i.e. with a single constraint
tY <- t(mod$data$Y)

mod4 <- cca0(tY ~ L, data = mod$data$dataTraits)

names(mod$data$dataTraits)
#>  [1] "Species"      "Species_abbr" "SLA"          "Height"       "LDMC"        
#>  [6] "Seedmass"     "Lifespan"     "F"            "R"            "N"           
#> [11] "L"           
out4 <- FS(mod4, 
           consider =  c("SLA", "Height", "LDMC", "Seedmass", "Lifespan", 
                         "F", "R", "N", "L"))

out4$finalWithOneExtra
#>          df  variance  pval pvaladj
#> F         1 0.4536777 0.001   0.010
#> N         1 0.3120442 0.001   0.009
#> Seedmass  1 0.1060672 0.060   0.480
out4$model_final
#> 
#> Call: FUN(formula = formula2, data = mydata, traceonly = FALSE, cca_object
#> = mod)
#> 
#>               Inertia Proportion Rank
#> Total          2.3490     1.0000     
#> Constrained    0.7657     0.3260    2
#> Unconstrained  1.5833     0.6740   19
#> 
#> Inertia is scaled Chi-square
#> 
#> Eigenvalues for constrained axes:
#>   CCA1   CCA2 
#> 0.4576 0.3081 
#> 
#> Eigenvalues for unconstrained axes:
#> CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8 
#>  NA  NA  NA  NA  NA  NA  NA  NA 
#> (Showing 8 of 19 unconstrained eigenvalues)
#> 
#> mean, sd, VIF and canonical coefficients with their optimistic [!] t-values:
#>      Avg    SDS    VIF   Regr1   tval1
#> F 6.1126 1.6685 1.0575  0.6413  8.7440
#> N 5.1674 1.6303 1.0575 -0.1125 -1.5334
#>