Forum Replies Created

AuthorPosts

henrikKeymaster
I had to remove my previous response because most things I said there were kind of incorrect. But I think I this response should be correct now.
In your design you have realized all 16 combinations of the factors
Correction
andNarrative
. However, not withinsubjects, but only across subjects. This means you can estimate fixed effects for their interaction, but no random slopes for it and no withinsubjects ANOVA. In fact, the absence of any replicates for any factor on the individual level, precludes basically any calculation of randomslopes. However, following the logic that in your situation the random variability is confounded with the residual variance, the following mixed model should in principle be appropriate (along the lines discussed here):
mixed(dv ~ Correction*Narrative + Order + (1participants), data)
To make sure that the model is in fact not grossly anticonservative one could run a simulation study from generated data without effect to see if the alphalevels are within nominal ranges, but this might be outside what could be expected from a masters thesis (but would be pretty cool, see the previous link for some hints how to do it).
If
Order
had more than 4 levels, one could probably also treat it like a randomeffects grouping factor (e.g., see here). But with 4 levels that seems not really advisable. You can only hope it does not really effect the results.I hope that helps. I would probably go with it.
henrikKeymasterWhere is
Correction
in your design? Are these the levels withinOrder
? Does this entail that you had only four specific pairings ofCorrection
andNarrative
?To know how to set up a mixed model it is important to know if you have replicates for each cell of the design and participant. In other words, did each participant only provide one response per
Narrative
level or several. For mixed models one usually wants/needs several responses per unit of observation and cell of the design (i.e., replicates) to estimate the full random effects structure.I think to get more help the best would be to either post a mockup of your data (as in this example) or your data.
henrikKeymasterIn general the afex ANOVA functions are for fully factorial designs. If your design is not fully factorial, I guess it is not possible to use the afex ANOVA functions.
For analyzing completely replicated latin square designs in R see: https://stat.ethz.ch/pipermail/rhelp/2003April/031783.html
henrikKeymasterI am not sure I fully understand the nature of your latin square design without actually seeing the design figure/plot (i.e., I do not understand the meaning of treatment factor, column blocking factor, and row blocking factor). It may help to provide it, especially if there are replications (i.e., given that 4 x 4 = 16 and you have 37 participants, that is quite likely). Example here.
As long as a single unit of observation (i.e., probably participant) provides data for more than one condition, setting up a mixed model is probably doable. For this question it is important to know if one of the factors (
Correction
orNarrative
) is a withinsubjects (i.e., repeatedmeasures) factor or not.Even if none of your factors is a repeatedmeasures factor, setting up a latin square design with mixed models is possible. An example is given here.
Note that in any case, the presence of interactions makes an analysis using a latin square approach difficult as it assumes no interaction. An interaction in the data thus invalidates the results and suggests the necessity of collecting new data allowing for an interaction, potentially with a reduced design. But I think it would be helpful to first described exactly what each participant saw to evaluate whether or not a simple mixed model analysis is appropriate and one can have a look at the interaction.
henrikKeymasterGreat question as I was honestly surprised
afex
does not do that automatically. I guess I always remove myNA
s from the data before running analyses and hadn’t encountered that before.Fortunately, the solution is super simple. We can simply pass
na.rm=TRUE
to the call of the ANOVA function and this will be passed on tomean()
which does the automatic aggregation. Then the mean for each participant and cell are calculated excludingNA
s:aov_ez("id", "response", sk2011.1[ sk2011.1$what == "affirmation",], within = c("inference", "type"), between = "instruction", na.rm=TRUE) # Contrasts set to contr.sum for the following variables: instruction # Anova Table (Type 3 tests) # # Response: response # Effect df MSE F ges p.value # 1 instruction 1, 38 1072.56 0.12 .001 .73 # 2 inference 1, 38 995.53 12.68 ** .11 .001 # 3 instruction:inference 1, 38 995.53 13.07 *** .11 .0009 # 4 type 1, 38 188.61 0.06 .0001 .81 # 5 instruction:type 1, 38 188.61 3.04 + .005 .09 # 6 inference:type 1, 38 499.82 30.59 *** .13 <.0001 # 7 instruction:inference:type 1, 38 499.82 11.33 ** .05 .002 #  # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1 # Warning message: # More than one observation per cell, aggregating the data using mean (i.e, fun_aggregate = mean)!
henrikKeymasterAfter contacting
pbkrtest
author Søren Højsgaard he told me that it was actually a bug inlme4::refit
. So I looked into it and wrote a bug fix that was just merged intolme4
: https://github.com/lme4/lme4/pull/421So either you wait until the new version of
lme4
appears on CRAN (and I have no idea when this might be) or you install the development version:library("devtools"); install_github("lme4/lme4",dependencies=TRUE)
Note that installing the development version might require other dependencies depending on your OS (e.g.,RTools
on Windows orXCode
on Mac).henrikKeymasterThe problem appears to be in the function calculating the parametric bootstrap
pbkrtest::PBmodcomp
:require(lme4) require(optimx) require(pbkrtest) ## Linear mixed effects model: data(beets, package="pbkrtest") sug < lmer(sugpct ~ block + sow + harvest + (1block:harvest), data=beets, REML=FALSE, control =lmerControl(optimizer = "optimx",optCtrl = list(method = "nlminb", starttests = FALSE, kkt = FALSE))) sug.h < update(sug, .~. harvest) PBmodcomp(sug, sug.h, nsim=50) # Error in optwrap(object@optinfo$optimizer, ff, x0, lower = lower, control = control$optCtrl, : # must specify 'method' explicitly for optimx
I will contact the author and let you know what comes out of it. I can do little at the time.
henrikKeymasterDoes it help to load
optimx
at the clusters before running the code?
clusterEvalQ(my.cluster, library(optimx))
If not, please let me know and I will investigate this further.
henrikKeymasterIn principle mixed models have the same constraint as all statistical models that you cannot estimate parameters for cells or conditions where you have no data at all (i.e., structurally missing). For example, when your control condition does not interact with one other factor (e.g., for your factor
A
, the control condition only exists for levela1
and not for levela2
) then you cannot have an interaction ofA
withcontrol
.As far as I understand your design you have 5 betweensubjects or independentsamples groups. For each unit of observation in this group you furthermore have pretest and posttest observations. This would allow to model the data as a 5 x 2 design with factors
group
andtest_time
:~ group*test_time + (test_timeid)
(note that the random effects structure assumes that you have replicates for each unit of observation andtest_time
condition, which you probably should have)The problem with this approach is obviously that it flattens out the 2 x 2 design underlying 4 of your 5 groups. I do not see a way how to incorporate this structure with the control in one model. I think there are two ways to address the 2 x 2 structure subsequently, once you have run the initial model and have determined how the control group relates to the other 4 groups. You could either test the 2 x 2 design from the initial model using
lsmeans
(using a combination oflsmeans
,contrast
, andtest
) or run a second model on the reduced data that has a 2 x 2 x 2 design (i.e., your 4 groups times the repeatedmeasures factor). The second approach seems somewhat more straight forward to implement. This reply was modified 6 years ago by henrik. Reason: clarified ranom effects structure
henrikKeymasterThis is not a simple question, but the short answer is that
afex
does not have a function for automatically testing whether or not a certain random effects term accounts for a significant portion of the variance. The reason for the absence of this functionality is that following Barr, Levy, Scheepers, and Tily (2013) and Schielzeth and Forstmeier (2009) including all possible random effects (i.e., ‘the maximal random effects structure justified by the design’) is the conservative thing to do. So random effects that are part of the design should in principle be included (and there are rumors that reviewers require this).A slightly different perspective on this question comes from
lme4
author Doug Bates, Harald Baayen, Reinhold Kliegl and others (Bates, Kliegl, Vasishth, & Baayen, 2015; Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2017) who highlight that an overparameterized model (i.e., models that contain more random effects than can be estimated given the current data) may be prone to failing to converge and exhibit reduced statistical power. Note that these authors nevertheless agree that removing random effects for which there is actually random variation is potentially dangerous (in terms of increasing the Type 1 error rate). So the question then becomes what is the threshold for which we want to say that a random effects term is large enough or ‘significant’. In Matuschek et al., they suggest a downward procedure (i.e., beginning with the maximal model and beginning with the largest effect) with alpha = .2 (instead of the more common alpha = .05). Again, the idea is to avoid situations in which a random effects term is reduced that actually shows random variability.To gather information about the size of a random effect (irrespective of significance), one can simply call
summary
on mixed objects:require(afex) data("sk2011.2") sk2_aff < droplevels(sk2011.2[sk2011.2$what == "affirmation",]) sk_m1 < mixed(response ~ instruction*inference*type+(inference*typeid), sk2_aff) summary(sk_m1)
Which gives:
Linear mixed model fit by REML ['merModLmerTest'] Formula: response ~ instruction * inference * type + (inference * type  id) Data: sk2_aff REML criterion at convergence: 10393.7 Scaled residuals: Min 1Q Median 3Q Max 3.9241 0.3314 0.0654 0.3796 3.7675 Random effects: Groups Name Variance Std.Dev. Corr id (Intercept) 141.84 11.910 inference1 158.83 12.603 0.71 type1 39.81 6.310 0.24 0.09 type2 10.43 3.230 0.43 0.23 0.41 inference1:type1 54.38 7.374 0.05 0.33 0.55 0.42 inference1:type2 4.33 2.081 0.39 0.50 0.16 0.31 0.31 Residual 443.39 21.057 Number of obs: 1134, groups: id, 63 [...]
The size of the random effects is given by the variance or standard deviation of the parameters (and the variances can be related to the residual variance also given). Note that
summary
is anlme4
function and noafex
function so the output is given in terms of the model parameters and not the factors. So fortype
, which is a factor with three levels, we have two parameters where neither can be easily mapped to the three factor levels and should be considered jointly (the same is true for theinference:type
interaction).
We can now fit another model for which we remove the highest order random effect, theinference:type
interaction. Then we can compare the model fit of the two models using theanova
function:sk_m2 < mixed(response ~ instruction*inference*type+(inference+typeid), sk2_aff) anova(sk_m1, sk_m2)
This gives:
Data: sk2_aff Models: sk_m2: response ~ instruction * inference * type + (inference + type  sk_m2: id) sk_m1: response ~ instruction * inference * type + (inference * type  sk_m1: id) Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq) sk_m2 23 10496 10612 5225.1 10450 sk_m1 34 10462 10633 5196.9 10394 56.568 11 3.997e08 ***  Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Here the likelihood ratio tests performed by
anova
indicates that removing the highest order effect leads to a significant reduction in model fit with chi^2 (11) = 56.57, p < .0001.Note that for
afex
versions prior to 0.17 (which will arrive on CRAN shortly), there was an error in the model labels when callinganova
in which case the following produces the same output:
anova(sk_m1$full_model, sk_m2$full_model, refit = FALSE)
References:
– Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious Mixed Models. arXiv:1506.04967 [Stat]. Retrieved from http://arxiv.org/abs/1506.04967
– Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001
– Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001
– Schielzeth, H., & Forstmeier, W. (2009). Conclusions beyond support: overconfident estimates in mixed models. Behavioral Ecology, 20(2), 416–420. https://doi.org/10.1093/beheco/arn145henrikKeymasterThe suggested citation at the time of writing is:
Henrik Singmann, Ben Bolker, Jake Westfall and Frederik Aust (2016). afex: Analysis of Factorial Experiments. R package version 0.161. https://CRAN.Rproject.org/package=afexHowever, this changes with any update of
afex
and the easiest way to find out the current suggestion is to call:
citation("afex")
This produces:
To cite package ‘afex’ in publications use: Henrik Singmann, Ben Bolker, Jake Westfall and Frederik Aust (2016). afex: Analysis of Factorial Experiments. R package version 0.161. https://CRAN.Rproject.org/package=afex A BibTeX entry for LaTeX users is @Manual{, title = {afex: Analysis of Factorial Experiments}, author = {Henrik Singmann and Ben Bolker and Jake Westfall and Frederik Aust}, year = {2016}, note = {R package version 0.161}, url = {https://CRAN.Rproject.org/package=afex}, }
Btw,
citation("package")
works for anypackage
and also without any package to learn how to citeR
. 
AuthorPosts