April 13, 2017 at 21:41 UTC #67
I would like to test whether including a random effect in my model accounts significant variance in my data. What is the best way to test this? Does afex have a built-in function for comparing models with and without a random effect?
April 13, 2017 at 22:26 UTC #68
This is not a simple question, but the short answer is that
afexdoes not have a function for automatically testing whether or not a certain random effects term accounts for a significant portion of the variance. The reason for the absence of this functionality is that following Barr, Levy, Scheepers, and Tily (2013) and Schielzeth and Forstmeier (2009) including all possible random effects (i.e., ‘the maximal random effects structure justified by the design’) is the conservative thing to do. So random effects that are part of the design should in principle be included (and there are rumors that reviewers require this).
A slightly different perspective on this question comes from
lme4author Doug Bates, Harald Baayen, Reinhold Kliegl and others (Bates, Kliegl, Vasishth, & Baayen, 2015; Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2017) who highlight that an overparameterized model (i.e., models that contain more random effects than can be estimated given the current data) may be prone to failing to converge and exhibit reduced statistical power. Note that these authors nevertheless agree that removing random effects for which there is actually random variation is potentially dangerous (in terms of increasing the Type 1 error rate). So the question then becomes what is the threshold for which we want to say that a random effects term is large enough or ‘significant’. In Matuschek et al., they suggest a downward procedure (i.e., beginning with the maximal model and beginning with the largest effect) with alpha = .2 (instead of the more common alpha = .05). Again, the idea is to avoid situations in which a random effects term is reduced that actually shows random variability.
To gather information about the size of a random effect (irrespective of significance), one can simply call
summaryon mixed objects:
require(afex) data("sk2011.2") sk2_aff <- droplevels(sk2011.2[sk2011.2$what == "affirmation",]) sk_m1 <- mixed(response ~ instruction*inference*type+(inference*type|id), sk2_aff) summary(sk_m1)
Linear mixed model fit by REML ['merModLmerTest'] Formula: response ~ instruction * inference * type + (inference * type | id) Data: sk2_aff REML criterion at convergence: 10393.7 Scaled residuals: Min 1Q Median 3Q Max -3.9241 -0.3314 0.0654 0.3796 3.7675 Random effects: Groups Name Variance Std.Dev. Corr id (Intercept) 141.84 11.910 inference1 158.83 12.603 -0.71 type1 39.81 6.310 -0.24 -0.09 type2 10.43 3.230 0.43 0.23 -0.41 inference1:type1 54.38 7.374 -0.05 -0.33 -0.55 -0.42 inference1:type2 4.33 2.081 -0.39 0.50 0.16 -0.31 -0.31 Residual 443.39 21.057 Number of obs: 1134, groups: id, 63 [...]
The size of the random effects is given by the variance or standard deviation of the parameters (and the variances can be related to the residual variance also given). Note that
lme4function and no
afexfunction so the output is given in terms of the model parameters and not the factors. So for
type, which is a factor with three levels, we have two parameters where neither can be easily mapped to the three factor levels and should be considered jointly (the same is true for the
We can now fit another model for which we remove the highest order random effect, the
inference:typeinteraction. Then we can compare the model fit of the two models using the
sk_m2 <- mixed(response ~ instruction*inference*type+(inference+type|id), sk2_aff) anova(sk_m1, sk_m2)
Data: sk2_aff Models: sk_m2: response ~ instruction * inference * type + (inference + type | sk_m2: id) sk_m1: response ~ instruction * inference * type + (inference * type | sk_m1: id) Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq) sk_m2 23 10496 10612 -5225.1 10450 sk_m1 34 10462 10633 -5196.9 10394 56.568 11 3.997e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Here the likelihood ratio tests performed by
anovaindicates that removing the highest order effect leads to a significant reduction in model fit with chi^2 (11) = 56.57, p < .0001.
Note that for
afexversions prior to 0.17 (which will arrive on CRAN shortly), there was an error in the model labels when calling
anovain which case the following produces the same output:
anova(sk_m1$full_model, sk_m2$full_model, refit = FALSE)
– Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious Mixed Models. arXiv:1506.04967 [Stat]. Retrieved from http://arxiv.org/abs/1506.04967
– Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001
– Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001
– Schielzeth, H., & Forstmeier, W. (2009). Conclusions beyond support: overconfident estimates in mixed models. Behavioral Ecology, 20(2), 416–420. https://doi.org/10.1093/beheco/arn145
You must be logged in to reply to this topic.