Forum Replies Created

Viewing 25 posts - 26 through 50 (of 86 total)
    in reply to: Compute effect sizes for mixed() objects #306
    henrik
    Keymaster

    Good question, and honestly, I am not sure. Pangea seems to require some d-type measure of standardized effect size. I think you could use one of two approaches:
    1. Use some reasonable default value (e.g., a small effect size) and explain that this gives something of a lower bound on power, because you do not expect a smaller effect, but likely a larger one.
    2. Alternatively, standardize the observed mean difference from a previous study by a reasonable measure of the standard deviation for that specific difference; in the mixed-model case, maybe the by-participant random-slope standard deviation (see the sketch below). How to do this really depends on the specific case, but if one makes a reasonable argument for why this is acceptable for the sake of a power analysis, that should be fine.
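
    For the second approach, a minimal sketch of what I have in mind (all object and variable names are hypothetical; prev stands for the data of the earlier study, with a two-level factor condition and a participant identifier id):

    library("lme4")
    fit <- lmer(rt ~ condition + (condition | id), data = prev)
    sds <- attr(VarCorr(fit)$id, "stddev")  # SDs of the random intercept and the random slope
    d_approx <- fixef(fit)[2] / sds[2]      # observed difference standardized by the by-participant slope SD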

    As is perhaps clear from this paragraph, I do not often use power analysis myself. For highly parameterized models such as mixed models, power analyses require so many assumptions that it is really unclear how valuable they are. If possible, I would avoid them and make other arguments for why I decided to collect a specific number of participants (e.g., prior sample sizes, money or time restrictions).

    in reply to: Afex documentation follow up contrasts. #305
    henrik
    Keymaster

    Thanks for reporting this. It was indeed an error. It is now fixed on GitHub and will soon be on CRAN. See:
    https://stats.stackexchange.com/a/361436/442

    in reply to: Model equation for correlated and uncorrelated random slopes #298
    henrik
    Keymaster

    From ?afex::mixed:

    Expand Random Effects

    expand_re = TRUE allows to expand the random effects structure before passing it to lmer. This allows to disable estimation of correlation among random effects for random effects term containing factors using the || notation which may aid in achieving model convergence (see Bates et al., 2015). This is achieved by first creating a model matrix for each random effects term individually, rename and append the so created columns to the data that will be fitted, replace the actual random effects term with the so created variables (concatenated with +), and then fit the model. The variables are renamed by prepending all variables with rei (where i is the number of the random effects term) and replacing ":" with "_by_".

    Hence, try: mixed(Y ~ A*B*C + (A*B*C || Subj), data, expand_re = TRUE)

    in reply to: Compute effect sizes for mixed() objects #295
    henrik
    Keymaster

    Unfortunately, this is currently not possible. The problem is that the effect on the response scale needs to be normalized by some estimate of variability (e.g., a standard deviation), and it is not really clear which estimate to take in the case of a mixed model, as there are usually several. This is also one of the reasons why there is no easy way to calculate R^2 in LMMs: http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#how-do-i-compute-a-coefficient-of-determination-r2-or-an-analogue-for-glmms

    I believe that most of these problems are also discussed in a recent Psych Methods paper which can be found here:
    Rights, J. D., & Sterba, S. K. (2018). Quantifying explained variance in multilevel models: An integrative framework for defining R-squared measures. Psychological Methods. Advance online publication. http://dx.doi.org/10.1037/met0000184

    The fact that calculating a global measure of model fit (such as R^2) is already riddled with complications, and that no simple single number can be found, should be a hint that doing so for a subset of the model parameters (i.e., main effects or interactions) is even more difficult. Given this, I would not recommend trying to find a measure of standardized effect size for mixed models.

    It is also important to note that the APA in fact recommends unstandardized over standardized effect sizes. This is even mentioned in the first paragraph of the Wikipedia article on effect sizes: https://en.wikipedia.org/wiki/Effect_size

    I believe a similar message about reporting unstandardized effect sizes is conveyed in a different recent Psych Methods paper:
    Pek, J., & Flora, D. B. (2018). Reporting effect sizes in original psychological research: A discussion and tutorial. Psychological Methods, 23(2), 208-225. http://dx.doi.org/10.1037/met0000126

    I know that you still need to somehow handle the reviewer. My first suggestion is to report unstandardized effect sizes and cite the corresponding APA recommendation (we did this e.g., here, Table 2). Alternatively, you could try to follow some of the recommendations in the Rights and Sterba paper. Finally, if this also does not help you, you might tell the reviewer something like:

    Unfortunately, due to the way that variance is partitioned in linear mixed models (e.g., Rights & Sterba, in press, Psych Methods), there does not exist an agreed-upon way to calculate standardized effect sizes for individual model terms such as main effects or interactions. We nevertheless decided to primarily employ mixed models in our analysis, because mixed models are vastly superior to alternative approaches in controlling Type I errors, and consequently results from mixed models are more likely to generalize to new observations (e.g., Barr, Levy, Scheepers, & Tily, 2013; Judd, Westfall, & Kenny, 2012). Whenever possible, we report unstandardized effect sizes, which is in line with general recommendations on how to report effect sizes (e.g., Pek & Flora, 2018).

    References:
    Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001
    Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103(1), 54–69. https://doi.org/10.1037/a0028347
    Pek, J., & Flora, D. B. (2018). Reporting effect sizes in original psychological research: A discussion and tutorial. Psychological Methods, 23, 208–225. https://doi.org/10.1037/met0000126
    Rights, J. D., & Sterba, S. K. (in press). Quantifying explained variance in multilevel models: An integrative framework for defining R-squared measures. Psychological Methods. https://doi.org/10.1037/met0000184

    in reply to: Mixed model specification and centering of predictor #292
    henrik
    Keymaster

    1. The warning by afex is just there to remind you that this is a non-trivial issue. I do not recommend mean centering as a general solution. If you have thought about a good value at which to center (maybe at 30 in your case), then you can ignore the warning. It is just there to force you to think about your problem.

    2. That is the problem with continuous variables. You need to make sure that the story your results tell is not contingent on some arbitrary value you choose to center at. Maybe it makes sense to explore the possible range of centering values and report the results throughout (in the sense of a multiverse analysis). You should give the reader the chance to fully understand the consequences of your more or less arbitrary choices for your results.

    Finally, you say:

    Is this really appropriate? IST is a person-specific measure like IQ, measured by a pencil and paper test before the EEG session. Therefore, channels do not contribute to random variation of IST (which is really constant within each person).

    Anyway, even if I try to run the extended model as you suggested, I get convergence errors. I guess this means that even if this model was justified by the design, I would have to reduce it to get a stable model (which would be the model I proposed initially).

    Yes, it is of course appropriate. The (ist | chan) part allows the effect of ist to differ idiosyncratically per channel. Maybe the channels right above the area responsible for ist react more strongly to the corresponding signal. You really have to think about the effect of ist independently of the effect of id here.

    And if it shows convergence warnings, try to suppress the estimation of correlations via || and expand_re = TRUE.
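
    For example, a minimal sketch of the model from above with the correlations suppressed (re-using your variable names; m2 is just a placeholder name):

    m2 <- mixed(erds ~ difficulty * ist + (difficulty | id) + (difficulty * ist || chan),
                erds, method = "S", expand_re = TRUE)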

    henrik
    Keymaster

    Not really. (1|face_id/face_emotion) is just a shorthand for (1|face_id) + (1|face_id:face_emotion), see: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#model-specification
    This holds analogously for the random slopes, of course.
    Please see also: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#nested-or-crossed
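
    To make the shorthand concrete, a minimal sketch (y, x, and data are placeholders; both calls specify the same random-intercept structure):

    library("lme4")
    m_nested  <- lmer(y ~ x + (1 | face_id / face_emotion), data)
    m_crossed <- lmer(y ~ x + (1 | face_id) + (1 | face_id:face_emotion), data)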

    I do not see why the idea of treating it as nested would make sense in your case. Nesting is really only important when you have a truly hierarchical structure, for example students in classrooms, where each student occurs in exactly one classroom. In your case, however, the lower-level units (i.e., the emotions) occur in different faces. So it does not really apply.

    henrik
    Keymaster

    In your case, the source of random variability most likely comes from the identity of the face. So this would be the natural random-effects grouping factor (i.e., face_id). The important question now is whether each emotion exists for each face_id. If this is the case, then it should be added to the model as well, together with the other random slopes you are missing. That is, a reasonable model could be:
    lmm_model <- mixed(RT ~ session * training * face_emotion + (session * face_emotion | p#) + (session * training * face_emotion | face_id), data)

    The important thing to keep in mind in this kind of design is that the two random-effects grouping factors are crossed. This means that the question of multiple observations needs to be determined individually for each combination of random-effects grouping factor and fixed effect. So, for the face question you need to ask yourself: Do I have multiple observations per face_id for each level of session (or emotion, or training), across participants? That is, whether or not these multiple observations come from the same participant does not play any role. And I guess that across your experiment, you have multiple observations for each face_id for each of your factors. Hence the random-effects structure I suggested above.

    Said more explicitly: For the by-participant random-effects structure you have to ask yourself which factors vary within participants, ignoring the face_id factor. Conversely, for the by-face_id random-effects structure you have to ask yourself which factors vary within face_id, ignoring the participant factor.

    Hope that helps!

    in reply to: Mixed model specification and centering of predictor #285
    henrik
    Keymaster

    (1) Is the model correctly specified? I’m not sure because you could also say that chan is nested in id since we recorded the same 64 EEG channels for each subject.

    If you measured the same 64 channels for each subject, these two variables are crossed and not nested. Nesting means that specific levels of one factor (e.g., EEG channel) only appear within specific levels of another factor (e.g., ID). For example, nesting would exist if each participant had an idiosyncratic set of EEG channels, which of course seems quite unlikely.

    However, one thing to consider is that ist is also a within-channel variable. So maybe
    m <- mixed(erds ~ difficulty * ist + (difficulty | id) + (difficulty * ist | chan), erds, method = "S")
    might be more appropriate (i.e., reflect the maximal random-effects structure justified by the design).

    (2) I get different results whether or not I scale the continuous predictor ist. […]
    Why does centering/scaling the ist predictor affect the estimates, and more noticeably the p-values of the difficulty levels? Whereas in the first case, both difficulty1 and difficulty2 are highly significant, these factors are not significant in the latter case anymore. What is going on? What is the correct way to deal with this situation?

    A few things to consider here.

    First, afex tries to discourage you from inspecting the parameter estimates via summary, as you do. These are very often not very helpful, especially in cases such as yours, where you have factors with more than two levels. I would highly suggest using the print method (i.e., just m), nice(), or anova() when your interest is in the fixed effects. You can then use the interplay with emmeans to look at specific effects.

    And now, to your specific question: yes, centering can have a dramatic effect on the interpretation of your model. This is, for example, discussed on CrossValidated: https://stats.stackexchange.com/q/65898/442
    However, there are also numerous papers discussing this, some specifically in the context of mixed models. For example, two papers that I know of are:

    Dalal, D. K., & Zickar, M. J. (2012). Some Common Myths About Centering Predictor Variables in Moderated Multiple Regression and Polynomial Regression. Organizational Research Methods, 15(3), 339–362. https://doi.org/10.1177/1094428111430540

    Wang, L., & Maxwell, S. E. (2015). On disaggregating between-person and within-person effects with longitudinal data using multilevel models. Psychological Methods, 20(1), 63–83. https://doi.org/10.1037/met0000030

    The thing to keep in mind is that each variable is tested with the other variables set to 0. So the 0 value should be meaningful. Often centering makes it meaningful, because 0 then corresponds to the mean. But other choices of 0 can be meaningful as well (e.g., the midpoint of a scale).
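
    As an illustration, a minimal sketch of centering the covariate at a chosen value before fitting (re-using your variable names; the value 30 is just an example):

    erds$ist_c <- erds$ist - 30   # 0 now corresponds to ist = 30 rather than ist = 0
    m_c <- mixed(erds ~ difficulty * ist_c + (difficulty | id) + (difficulty * ist_c | chan),
                 erds, method = "S")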

    in reply to: precise estimates from emmeans across lm and aov_ez #272
    henrik
    Keymaster

    A new version of afex, version 0.21-2, that fixes this bug, has just appeared on CRAN: https://cran.r-project.org/package=afex

    I apologize for the delay!

    henrik
    Keymaster

    I suggest you take an intensive look at our chapter and the further literature. This is probably more efficient than waiting for my responses. For example, the chapter contains the main references on whether or not it is appropriate to reduce the random-effects structure based on the data. One camp, basically Barr, Levy, Scheepers, and Tily (2013), suggests that this is almost never a good idea. The other camp, Bates et al. and Matuschek et al. (see references below), suggests that it is reasonable in case of convergence problems. I think both sides have good arguments. But if your model does not converge, what else can you do other than reduce the random-effects structure? My preferred first step is to remove the correlations among the random effects (also discussed in the chapter).

    I am not sure how relevant it is to try to build the equivalent model to a repeated-measures ANOVA. Mixed models are more adequate for your data. So try to use them in the best way possible.
    Perhaps one concrete response: including a fixed and random effect for trialno is essentially equivalent to using the residuals. You “control” for this effect (you can search CrossValidated; there are plenty of relevant questions with answers on controlling in a regression framework).

    The old lme syntax did not allow random slopes in the same way lme4 does. People who still use this older approach therefore have to use the nesting syntax. As discussed in detail in, for example, Barr et al., the recommended approach nowadays uses random slopes. Some more details can again be found on CrossValidated. For example:
    https://stats.stackexchange.com/a/131011/442
    https://stats.stackexchange.com/q/325316/442
    https://stats.stackexchange.com/q/14088/442
    https://stats.stackexchange.com/q/117660/442

    Good luck!

    Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious Mixed Models. arXiv:1506.04967 [stat]. Retrieved from http://arxiv.org/abs/1506.04967
    Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001

    henrik
    Keymaster

    There are many questions here. In general, I would advise you to read our introductory chapter that covers quite a few of them: http://singmann.org/download/publications/singmann_kellen-introduction-mixed-models.pdf

    Can I assume that this is a mixed model “equivalent” of a “normal” repeated measure anova (as done with SPSS) computed on aggregated data (wide format with one mean logRT for each combination of A * B * C) ?

    This is of course a different model that will produce different results, but it is probably the recommended model (if you ignore response and trialno). For the within-subject design this is the “maximal model justified by the design”, which is recommended following Barr, Levy, Scheepers, and Tily (2013, JML).

    How can I extend the above syntax to account for trialnumber (a tendency to have a negative correlation between trialno and logRT)?
    What would be considered “best practice” if I wanted to add response as a separate factor making it a A * B * C * response design and considering that adding response breaks the balance of A*B*C? I’m not sure how to approach both fixed and random parts for this

    Let us first consider response. As each participant should have (in principle) both yes and no responses for each cell of the design, the natural way to extend this model would be to add response as both fixed and random:
    mixed(logRT ~ A * B * C * response + (A * B * C * response | Subject), df)
    I do not immediately see how the issue of imbalance is a huge problem. Of course, balance would be better, but the model should be able to deal with it.
    The question of trialno is more difficult. Of course, you could simply add this as a fixed and random effect as well, for example:
    mixed(logRT ~ A * B * C * response + trialno + (A * B * C * response + trialno | Subject), df)
    However, this would only allow a linear effect; maybe some other functional form fits better. If this is important, another type of model, such as a GAMM, may be preferable (a sketch follows the reference below). See:
    Baayen, H., Vasishth, S., Kliegl, R., & Bates, D. (2017). The cave of shadows: Addressing the human factor with generalized additive mixed models. Journal of Memory and Language, 94, 206–234. https://doi.org/10.1016/j.jml.2016.11.006
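
    A minimal sketch of such a GAMM using mgcv rather than afex (variable names follow the model above; the smooth and random-effect terms are just one possible specification):

    library("mgcv")
    df$Subject <- factor(df$Subject)  # random-effect smooths in mgcv require a factor
    gamm1 <- bam(logRT ~ A * B * C * response + s(trialno) + s(Subject, bs = "re"),
                 data = df)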

    Should I consider any non-default covariance structures for both random and fixed effects? I’ve been advised to use compound symmetry for the fixed part – but to be honest I don’t know any reason for it, nor can I explain the differences between the various structures (available for example in the SPSS procedure MIXED)

    Unfortunately, this is at the moment not really possible with lme4 (which mixed uses).

    One final thought: your model seems to have quite a lot of data; in this case method = "S" (Satterthwaite) is probably indicated in the call to mixed(). It is faster and gives very similar results.

    Hope that helps.

    in reply to: Ges missing from output #251
    henrik
    Keymaster

    The summary method provides the standard ANOVA output from package car (which performs the ANOVA calculation), and this output does not include effect sizes.

    Please try RM2 (i.e., just calling the object), nice(RM2), or anova(RM2). See examples below:

    library("afex")
    data(md_12.1)
    a1 <- aov_ez("id", "rt", md_12.1, within = c("angle", "noise"))
    
    a1 ## nice(a1)
    # Anova Table (Type 3 tests)
    # 
    # Response: rt
    #        Effect          df     MSE         F ges p.value
    # 1       angle 1.92, 17.31 3702.02 40.72 *** .39  <.0001
    # 2       noise        1, 9 8460.00 33.77 *** .39   .0003
    # 3 angle:noise 1.81, 16.27 1283.22 45.31 *** .19  <.0001
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1
    # 
    # Sphericity correction method: GG 
    
    anova(a1)
    # Anova Table (Type 3 tests)
    # 
    # Response: rt
    #             num Df den Df    MSE      F     ges    Pr(>F)    
    # angle       1.9233 17.309 3702.0 40.719 0.39012 3.402e-07 ***
    # noise       1.0000  9.000 8460.0 33.766 0.38660  0.000256 ***
    # angle:noise 1.8080 16.272 1283.2 45.310 0.18827 3.454e-07 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    summary(a1)
    # Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
    # 
    #               Sum Sq num Df Error SS den Df F value    Pr(>F)    
    # (Intercept) 19425660      1   292140      9 598.449 1.527e-09 ***
    # angle         289920      2    64080     18  40.719 2.087e-07 ***
    # noise         285660      1    76140      9  33.766  0.000256 ***
    # angle:noise   105120      2    20880     18  45.310 9.424e-08 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # [...]

    Hope that helps!

    in reply to: Visualizing Random Effects #246
    henrik
    Keymaster

    The problem is that afex::mixed returns an object of class "mixed", in which the original lme4 object is stored in the slot full_model:

    library("afex")
    data("Machines", package = "MEMSS") 
    m1 <- mixed(score ~ Machine + (Machine|Worker), data=Machines)
    
    str(m1, 1)
    # List of 5
    #  $ anova_table      :Classes ‘anova’ and 'data.frame':	1 obs. of  4 variables:
    #   ..- attr(*, "heading")= chr [1:3] "Mixed Model Anova Table (Type 3 tests, KR-method)\n" "Model: score ~ Machine + (Machine | Worker)" "Data: Machines"
    #   ..- attr(*, "sig_symbols")= chr [1:4] " +" " *" " **" " ***"
    #  $ full_model       :Formal class 'merModLmerTest' [package "lmerTest"] with 13 slots
    #  $ restricted_models: NULL
    #  $ tests            : NULL
    #  $ data             :'data.frame':	54 obs. of  3 variables:
    #  - attr(*, "class")= chr "mixed"
    #  - attr(*, "type")= num 3
    #  - attr(*, "method")= chr "KR"

    So one just needs to pass this slot to the corresponding function and not the whole object. This works for example via $:

    library("lattice")
    dotplot(ranef(m1$full_model, condVar = TRUE))

    Hope that helps!

    in reply to: aov_ez error #243
    henrik
    Keymaster

    Wonderful that this issue has solved itself. Thanks for the update.

    in reply to: precise estimates from emmeans across lm and aov_ez #239
    henrik
    Keymaster

    The development version of afex from github now returns the correct results. I will update the post when I submit this version to CRAN.

    devtools::install_github("singmann/afex")
    library(afex)
    packageVersion("afex")
    # [1] ‘0.21.0’
    slp <- read.csv("https://pastebin.com/raw/qTajz9ak")
    
    lm.1 <- lm(sleep ~ cond.f + anxiety.m, data = slp)
    emmeans(lm.1, ~ cond.f + anxiety.m)
    #  cond.f             anxiety.m   emmean        SE  df lower.CL upper.CL
    #  group-based     4.746059e-17 69.27959 0.7289347 596 67.84800 70.71119
    #  group + partner 4.746059e-17 76.35960 0.7279758 596 74.92989 77.78931
    #  self help       4.746059e-17 60.99081 0.7293691 596 59.55836 62.42325
    # 
    # Confidence level used: 0.95 
    
    anvmod <- aov_ez(id = "id", dv = "sleep", data = slp, between = c("cond.f"), 
                     covariate = c("anxiety.m"), factorize = FALSE)
    emmeans(anvmod, ~ cond.f + anxiety.m)
    #  cond.f             anxiety.m   emmean        SE  df lower.CL upper.CL
    #  group-based     4.746059e-17 69.27959 0.7289347 596 67.84800 70.71119
    #  group + partner 4.746059e-17 76.35960 0.7279758 596 74.92989 77.78931
    #  self help       4.746059e-17 60.99081 0.7293691 596 59.55836 62.42325
    # 
    # Confidence level used: 0.95 
    
    emmeans(anvmod, ~ cond.f + anxiety.m, model = "multivariate")
    #  cond.f             anxiety.m   emmean        SE  df lower.CL upper.CL
    #  group-based     4.746059e-17 69.27959 0.7289347 596 67.84800 70.71119
    #  group + partner 4.746059e-17 76.35960 0.7279758 596 74.92989 77.78931
    #  self help       4.746059e-17 60.99081 0.7293691 596 59.55836 62.42325
    # 
    # Confidence level used: 0.95 

    Thanks again for reporting this.

    in reply to: Why contr.sum for random effects grouping factors? #238
    henrik
    Keymaster

    Exactly. mixed transforms all categorical covariates that are part of the formula to contr.sum. I thought it might lead to bugs if I did this only selectively (e.g., by trying to detect which variables are the grouping variables).

    in reply to: Why contr.sum for random effects grouping factors? #236
    henrik
    Keymaster

    Hmm, I do not see a situation where it would matter. I repeat, they will always be encoded with one parameter per level (i.e., one-hot encoding).

    in reply to: Why contr.sum for random effects grouping factors? #234
    henrik
    Keymaster

    What you observe and describe is of course the case (you provide the evidence), but it is not directly related to afex; it concerns lme4 and R in general. Let me explain.

    First, what I have said in my last response holds for the random-effects grouping factors. They will always be encoded with one parameter for each level. Thus, in the example the coding for Worker is irrelevant as long as you estimate random intercepts for it. lme4 will always estimate one idiosyncratic random intercept for each level of Worker. Hence the equivalence of m1 and m2.

    Second, suppressing the intercept for categorical covariates in R does something weird. It then estimates one parameter per level, but does not actually reduce the number of estimated parameters. See:

    data("Machines", package = "MEMSS")          # same example data as in the other posts
    ncol(model.matrix(~ Machine, Machines))      # intercept + 2 contrast columns
    # [1] 3
    ncol(model.matrix(~ 0 + Machine, Machines))  # one indicator column per level
    # [1] 3

    This is different from the case with numerical covariates (note that the next code is statistically nonsensical):

    ncol(model.matrix(~as.numeric(Machine), Machines))
    # [1] 2
    ncol(model.matrix(~0+as.numeric(Machine), Machines))
    # [1] 1

    So when you use (0 + Machine|Worker) the coding scheme is again irrelevant because, again, one parameter is estimated per level.
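
    To illustrate with the Machines data from above, a minimal sketch (m_a and m_b are placeholder names): the two parameterizations describe the same model, so the fits should agree up to numerical precision.

    library("lme4")
    m_a <- lmer(score ~ Machine + (Machine | Worker), Machines)      # intercept + 2 contrast-coded slopes per Worker
    m_b <- lmer(score ~ Machine + (0 + Machine | Worker), Machines)  # one term per Machine level per Worker
    logLik(m_a)
    logLik(m_b)  # should be (numerically) identical; only the parameterization differs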

    Hope that clears everything up.

    in reply to: precise estimates from emmeans across lm and aov_ez #233
    henrik
    Keymaster

    This is indeed a bug in afex with ANCOVAs. It has to do with the way the aov object is used internally, and I am working on a fix. Until then, one can circumvent the use of the aov object and simply use the lm object. Somewhat non-intuitively, this is done by setting model = "multivariate" in the call to emmeans:

    library(afex)
    slp <- read.csv("https://pastebin.com/raw/qTajz9ak")
    
    lm.1 <- lm(sleep ~ cond.f + anxiety.m, data = slp)
    emmeans(lm.1, ~ cond.f + anxiety.m)
    #  cond.f             anxiety.m   emmean        SE  df lower.CL upper.CL
    #  group-based     4.746059e-17 69.27959 0.7289347 596 67.84800 70.71119
    #  group + partner 4.746059e-17 76.35960 0.7279758 596 74.92989 77.78931
    #  self help       4.746059e-17 60.99081 0.7293691 596 59.55836 62.42325
    # 
    # Confidence level used: 0.95 
    
    anvmod <- aov_ez(id = "id", dv = "sleep", data = slp, between = c("cond.f"), 
                     covariate = c("anxiety.m"), factorize = FALSE)
    emmeans(anvmod, ~ cond.f + anxiety.m, model = "multivariate")
    #  cond.f             anxiety.m   lsmean        SE  df lower.CL upper.CL
    #  group-based     4.746059e-17 69.27959 0.7289347 596 67.84800 70.71119
    #  group + partner 4.746059e-17 76.35960 0.7279758 596 74.92989 77.78931
    #  self help       4.746059e-17 60.99081 0.7293691 596 59.55836 62.42325
    # 
    # Confidence level used: 0.95

    I will post here again, when I have fixed the bug in afex. Thanks a lot for reporting it!

    in reply to: Why contr.sum for random effects grouping factors? #228
    henrik
    Keymaster

    No, I do not agree with that. To be honest, I do not fully understand where this purported equivalence should come from.

    I agree that random slopes with a sum-to-zero coding can lead to somewhat awkward parameterizations. Why should the deviations from the grand mean be normally distributed across participants? However, other coding schemes do not necessarily have better properties. So I am not sure what can be gained by using different coding schemes for the fixed effects and the random slopes.

    But maybe there is also a misunderstanding here. The random-effects grouping factors use dummy coding, or what seems to be called one-hot encoding in machine learning: each level of the grouping factor gets its own column that is 1 for this level and 0 for all others. Thus, there is no intercept column, and random intercepts are simply estimated for each level individually.
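
    A minimal sketch of what this looks like in lme4 (using the Machines data from MEMSS as an example):

    library("lme4")
    data("Machines", package = "MEMSS")
    m <- lmer(score ~ Machine + (1 | Worker), Machines)
    dim(getME(m, "Z"))        # random-effects model matrix: one column per level of Worker
    nlevels(Machines$Worker)  # matches the number of columns of Z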

    Is this what you were after?

    in reply to: Why contr.sum for random effects grouping factors? #225
    henrik
    Keymaster

    mixed simply uses contr.sum for all categorical covariates by default. I agree that for the random effects this can lead to an awkward parameterization. However, it is not immediately clear to me how one could program it differently. So the reason is simply convenience and the lack of an apparent alternative. If you have a specific alternative in mind, you can set check_contrasts = FALSE and set the contrasts in the desired way yourself. I honestly do not see the benefit of offering anything else via the mixed interface, but feel free to convince me otherwise.

    in reply to: warning: missing cells for some factors #222
    henrik
    Keymaster

    There are several different ways to test prespecified contrasts. This can be done in afex in the way described here. As you correctly notice, the p-values of the different methods are identical, and the warnings appear to be inconsequential.

    You could also fit the model with normal (i.e., sum-to-zero) contrasts and then set up the contrasts later via emmeans. There are surely further possibilities (e.g., fit the model with lmerTest::lmer and use summary). All of those should give you the same results (as long as you use the same method for calculating the df).
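
    For the emmeans route, a minimal sketch (m, the factor name f, and the contrast weights are hypothetical placeholders for a model with a four-level factor):

    em <- emmeans(m, "f")                              # estimated marginal means for factor f
    contrast(em, list(my_contrast = c(1, -1, 0, 0)))   # test a prespecified contrast on its levels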

    in reply to: Post-Hoc Test with lsmeans error with uncorrelated random effects #217
    henrik
    Keymaster

    One surprising solution (surprising in the sense that I have only known about it for a few days) is to set set_data_arg = FALSE. In the current CRAN version (but not in the current development version and the next CRAN version), this throws quite a few errors, which can be safely ignored.

    library("afex")
    data("Machines", package = "MEMSS") 
    m1 <- mixed(score ~ Machine + (Machine||Worker), data=Machines, 
                expand_re = TRUE, set_data_arg = FALSE)
    emm_options(lmer.df = "Satterthwaite")
    emmeans(m1, "Machine")
    #  Machine   emmean       SE    df lower.CL upper.CL
    #  A       52.35556 2.397324  7.35 46.74094 57.97017
    #  B       60.32222 2.633283  9.04 54.36898 66.27546
    #  C       66.27222 2.836796 11.56 60.06544 72.47901
    # 
    # Degrees-of-freedom method: satterthwaite 
    # Confidence level used: 0.95 

    Alternatively, a different method for calculating the df can also be chosen:

    m2 <- mixed(score ~ Machine + (Machine||Worker), data=Machines, expand_re = TRUE)
    emm_options(lmer.df = "Satterthwaite")
    emmeans(m2, "Machine")
    # Error in eval(predvars, data, env) : object 're1.Machine1' not found
    
    emm_options(lmer.df = "asymptotic")
    emmeans(m2, "Machine")
    #  Machine   emmean       SE  df asymp.LCL asymp.UCL
    #  A       52.35556 2.397324 Inf  47.65689  57.05422
    #  B       60.32222 2.633283 Inf  55.16108  65.48336
    #  C       66.27222 2.836796 Inf  60.71220  71.83224
    # 
    # Degrees-of-freedom method: asymptotic 
    # Confidence level used: 0.95 
    
    emm_options(lmer.df = "kenward-roger")
    emmeans(m2, "Machine")
    #  Machine   emmean       SE    df lower.CL upper.CL
    #  A       52.35556 2.397324  7.35 46.74087 57.97024
    #  B       60.32222 2.633283  9.04 54.36897 66.27548
    #  C       66.27222 2.836796 11.60 60.06746 72.47698
    # 
    # Degrees-of-freedom method: kenward-roger 
    # Confidence level used: 0.95 
    

    I am currently deliberating whether to use set_data_arg = FALSE as the new default.

    in reply to: warning: missing cells for some factors #216
    henrik
    Keymaster

    The question here really is what you want to achieve. The main goal of mixed is to provide tests of effects, such as main effects or interactions (also called model terms). So splitting the variable into its parts is somewhat orthogonal to the intended goal. The reason this is not directly clear from your call is the use of summary: summary gives you output based on the parameters and not based on the terms, unlike the other functions that deal with mixed objects, such as nice or anova (or even print). Invoking one of those also makes the difference between the two calls apparent:

    library(afex)
    set_sum_contrasts()
    data(sk2011.1)
    d <- aggregate(response ~ id + inference, data = sk2011.1, FUN = mean)
    contrast_mat <- contr.sum(4)
    d$c1 <- contrast_mat[, 1][d$inference]
    d$c2 <- contrast_mat[, 2][d$inference]
    d$c3 <- contrast_mat[, 3][d$inference]
    m1 <- mixed(response ~ c1 + c2 + c3 + (1|id), d)
    nice(m1)
    # Mixed Model Anova Table (Type 3 tests, KR-method)
    # 
    # Model: response ~ c1 + c2 + c3 + (1 | id)
    # Data: d
    #   Effect     df        F p.value
    # 1     c1 1, 117  7.79 **    .006
    # 2     c2 1, 117     0.67     .41
    # 3     c3 1, 117 10.52 **    .002
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1
    
    m2 <- mixed(response ~ inference + (1|id), d)
    nice(m2)
    # Mixed Model Anova Table (Type 3 tests, KR-method)
    # 
    # Model: response ~ inference + (1 | id)
    # Data: d
    #      Effect     df       F p.value
    # 1 inference 3, 117 5.15 **    .002
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1

    If, for some reason, you are actually interested in p-values for the individual parameters (which seems quite questionable for factors with more than two levels as in the example), you can get this also via the lmerTest summary method:

    summary(m2$full_model)  # uses the summary method provided by lmerTest
    # [...]
    # Fixed effects:
    #             Estimate Std. Error      df t value Pr(>|t|)    
    # (Intercept)   79.141      2.495  39.000  31.725  < 2e-16 ***
    # inference1     8.372      3.000 117.000   2.791  0.00614 ** 
    # inference2    -2.459      3.000 117.000  -0.820  0.41395    
    # inference3    -9.728      3.000 117.000  -3.243  0.00154 ** 
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    

    Note that for this to work you need to invoke set_sum_contrasts() earlier. Alternatively, set set_data_arg = FALSE. One final comment: You might also want to take a look at the per_parameter argument.
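
    A minimal sketch of the per_parameter route, using the example from above (m3 is just a placeholder name):

    m3 <- mixed(response ~ inference + (1 | id), d, per_parameter = "inference")
    nice(m3)  # now also lists tests for the individual parameters of inference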

    in reply to: MANOVA? #212
    henrik
    Keymaster

    In principle yes, but potentially not in the way you want it.

    Let me explain: to calculate the ANOVA, afex uses car::Anova, which by default calculates both univariate (i.e., repeated-measures ANOVA) and multivariate tests for the repeated-measures factors. By default, only the univariate tests are reported, but the multivariate tests can easily be obtained via summary(model$Anova). For example:

    
    library("afex")
    data(md_12.1)
    a1 <- aov_ez("id", "rt", md_12.1, within = c("angle", "noise"))
    
    summary(a1$Anova)
    # Type III Repeated Measures MANOVA Tests:
    # 
    # ------------------------------------------
    #  
    # Term: (Intercept) 
    # 
    #  Response transformation matrix:
    #            (Intercept)
    # X0_absent            1
    # X0_present           1
    # X4_absent            1
    # X4_present           1
    # X8_absent            1
    # X8_present           1
    # 
    # Sum of squares and products for the hypothesis:
    #             (Intercept)
    # (Intercept)   116553960
    # 
    # Multivariate Tests: (Intercept)
    #                  Df test stat approx F num Df den Df     Pr(>F)    
    # Pillai            1   0.98518 598.4492      1      9 1.5266e-09 ***
    # Wilks             1   0.01482 598.4492      1      9 1.5266e-09 ***
    # Hotelling-Lawley  1  66.49435 598.4492      1      9 1.5266e-09 ***
    # Roy               1  66.49435 598.4492      1      9 1.5266e-09 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    # ------------------------------------------
    #  
    # Term: angle 
    # 
    #  Response transformation matrix:
    #            angle1 angle2
    # X0_absent       1      0
    # X0_present      1      0
    # X4_absent       0      1
    # X4_present      0      1
    # X8_absent      -1     -1
    # X8_present     -1     -1
    # 
    # Sum of squares and products for the hypothesis:
    #         angle1 angle2
    # angle1 1128960 403200
    # angle2  403200 144000
    # 
    # Multivariate Tests: angle
    #                  Df test stat approx F num Df den Df     Pr(>F)    
    # Pillai            1  0.887597 31.58624      2      8 0.00015963 ***
    # Wilks             1  0.112403 31.58624      2      8 0.00015963 ***
    # Hotelling-Lawley  1  7.896559 31.58624      2      8 0.00015963 ***
    # Roy               1  7.896559 31.58624      2      8 0.00015963 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    # ------------------------------------------
    #  
    # Term: noise 
    # 
    #  Response transformation matrix:
    #            noise1
    # X0_absent       1
    # X0_present     -1
    # X4_absent       1
    # X4_present     -1
    # X8_absent       1
    # X8_present     -1
    # 
    # Sum of squares and products for the hypothesis:
    #         noise1
    # noise1 1713960
    # 
    # Multivariate Tests: noise
    #                  Df test stat approx F num Df den Df     Pr(>F)    
    # Pillai            1  0.789552 33.76596      1      9 0.00025597 ***
    # Wilks             1  0.210448 33.76596      1      9 0.00025597 ***
    # Hotelling-Lawley  1  3.751773 33.76596      1      9 0.00025597 ***
    # Roy               1  3.751773 33.76596      1      9 0.00025597 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    # ------------------------------------------
    #  
    # Term: angle:noise 
    # 
    #  Response transformation matrix:
    #            angle1:noise1 angle2:noise1
    # X0_absent              1             0
    # X0_present            -1             0
    # X4_absent              0             1
    # X4_present             0            -1
    # X8_absent             -1            -1
    # X8_present             1             1
    # 
    # Sum of squares and products for the hypothesis:
    #               angle1:noise1 angle2:noise1
    # angle1:noise1        416160        171360
    # angle2:noise1        171360         70560
    # 
    # Multivariate Tests: angle:noise
    #                  Df test stat approx F num Df den Df     Pr(>F)    
    # Pillai            1  0.918223 44.91353      2      8 4.4722e-05 ***
    # Wilks             1  0.081777 44.91353      2      8 4.4722e-05 ***
    # Hotelling-Lawley  1 11.228381 44.91353      2      8 4.4722e-05 ***
    # Roy               1 11.228381 44.91353      2      8 4.4722e-05 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    # Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
    # 
    #                   SS num Df Error SS den Df       F    Pr(>F)    
    # (Intercept) 19425660      1   292140      9 598.449 1.527e-09 ***
    # angle         289920      2    64080     18  40.719 2.087e-07 ***
    # noise         285660      1    76140      9  33.766  0.000256 ***
    # angle:noise   105120      2    20880     18  45.310 9.424e-08 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    # 
    # Mauchly Tests for Sphericity
    # 
    #             Test statistic p-value
    # angle              0.96011 0.84972
    # angle:noise        0.89378 0.63814
    # 
    # 
    # Greenhouse-Geisser and Huynh-Feldt Corrections
    #  for Departure from Sphericity
    # 
    #              GG eps Pr(>F[GG])    
    # angle       0.96164  3.402e-07 ***
    # angle:noise 0.90398  3.454e-07 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    # 
    #               HF eps   Pr(>F[HF])
    # angle       1.217564 2.086763e-07
    # angle:noise 1.117870 9.424093e-08
    # Warning message:
    # In summary.Anova.mlm(a1$Anova, multivariate = TRUE) :
    #   HF eps > 1 treated as 1

    Hope that helps!
