 This topic has 2 replies, 2 voices, and was last updated 1 year, 12 months ago by statmerkur.

AuthorPosts


December 10, 2017 at 21:20 GMT+0000 #163statmerkurParticipant
I followed your ANOVA vignette and passed the
afex_aov
object tolsmeans
to obtain the marginal means of my withinsubjects factor. The emmeans are consistent with the SPSS emmeans but the standard errors and thus the pairwise comparisons don’t match:R
factor_within lsmean SE df lower.CL upper.CL w1 273.8519 6.325405 28.92 260.9134 286.7903 w2 263.5556 6.325405 28.92 250.6171 276.4940 w3 266.4074 6.325405 28.92 253.4689 279.3459 contrast estimate SE df t.ratio p.value w1  w2 10.296296 3.310553 48 3.110 0.0094 w1  w3 7.444444 3.310553 48 2.249 0.0875 w2  w3 2.851852 3.310553 48 0.861 1.0000
SPSS
factor_within lsmean SE lower.CL upper.CL w1 273.852 6.779 259.860 287.842 w2 263.556 5.639 251.917 275.194 w3 266.407 6.502 252.988 279.827 contrast estimate SE p.value w1  w2 10.296 3.064 0.008 w1  w3 7.444 3.836 0.192 w2  w3 2.852 2.962 1.000
This is the R code I used:
# library("haven") # library("reshape2") library("afex") library("lsmeans") d < data.frame(id = gl(27, 3), factor_between = gl(3, 27, labels = c("b1", "b2", "b3")), factor_within = gl(3, 1, 81, labels = c("w1", "w2", "w3")), value = c(296, 270, 281, 208, 207, 199, 313, 292, 278, 323, 296, 331, 227, 224, 226, 290, 327, 323, 224, 233, 221, 296, 282, 286, 304, 298, 281, 305, 273, 278, 287, 279, 271, 282, 272, 285, 251, 235, 236, 243, 250, 285, 231, 224, 223, 295, 278, 269, 313, 266, 280, 243, 249, 246, 267, 262, 247, 311, 291, 304, 262, 253, 248, 272, 265, 270, 239, 229, 249, 234, 233, 220, 327, 302, 291, 270, 253, 260, 281, 273, 305)) # export to SPSS # d_spss < dcast(d, id + factor_between ~ factor_within, value.var = "value") # write_sav(d_spss, "d_spss.sav") set_sum_contrasts() a1 < aov_ez(id = "id", dv = "value", data = d, between = "factor_between", within = "factor_within") nice(a1, es = "pes") # matches with SPSS print(m1 < lsmeans(a1, ~ factor_within)) # SEs don't match the SPSS SEs ... pairs(m1, adjust = "bonferroni") # ... therefore pairwise comparisons don't match
And this is the SPSS syntax I used:
DATASET NAME DataSet1 WINDOW=FRONT. GLM w1 w2 w3 BY factor_between /WSFACTOR=factor_within 3 Polynomial /MEASURE=value /METHOD=SSTYPE(3) /EMMEANS=TABLES(factor_within) COMPARE ADJ(BONFERRONI) /PRINT=ETASQ /CRITERIA=ALPHA(.05) /WSDESIGN=factor_within /DESIGN=factor_between.
Is there something wrong with my R code or why do I get these diverging results?

December 12, 2017 at 13:10 GMT+0000 #171henrikKeymaster
As can be seen from the output, the difference is whether or not one wants to assume equal variances for the followup tests (as does
lsmeans
) or nor (as doesSPSS
).One can almost replicate the results from
SPSS
inR
, by running independent analyses for each difference. I demonstrate this here for thew1  w2
contrast:library("tidyr") dw < spread(d, factor_within, value) dw$d1 < dw$w1  dw$w2 summary(lm(d1 ~ 1, dw)) # Call: # lm(formula = d1 ~ 1, data = dw) # # Residuals: # Min 1Q Median 3Q Max # 47.296 6.296 1.296 8.204 36.704 # # Coefficients: # Estimate Std. Error t value Pr(>t) # (Intercept) 10.296 3.016 3.414 0.00211 ** #  # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # # Residual standard error: 15.67 on 26 degrees of freedom
Note that to get the standard errors, I rely on the fact that a withinsubject contrast of two means is simply the test on whether the difference differs from zero. And this can be tested with a linear model with an Intercept only. I do not know how to get the standard error with
t.test
. As can be seen, the estimate is again identical and now the standard error is pretty close to the one reported (this also holds for the other two contrasts). I am not sure why the standard error is not identical, because I did a similar analysis some time ago in which it was. Maybe this has something to do with your specificSPSS
call (e.g.,Polynomial
). In any case, the pvalue then still needs to be corrected.What should be preferred is not a simple question. One can make arguments for both decisions.
Independent tests (ttests or ANOVAs depending on the followup test) is the approach recommended in the Maxwell and Delaney stats book (which is a great book). IIRC, their main argument is basically that the equal variance assumption is likely false in real psychological situations.
I do not think this is a very strong argument (why?). In contrast, I think that in cases with limited data the positive effect of pooling on precision of the variance estimate seems to be a stronger argument for pooling. That is why I usually use
lsmeans
on the ANOVA result. In addition, it makes everything quite easy because one can program all tests directly on the ANOVA output.As many time in statistics, this is one of the situations where you as the user has to make an informed decision.

December 16, 2017 at 10:40 GMT+0000 #172statmerkurParticipant
This makes perfect sense and I can think of both situations where using
lsmeans
would be useful and where it wouldn’t. So in case of unequal cell sizes, should the dependent ttests still match theSPSS
sEMMEANS
or how can one do an equivalent analysis in R then?


AuthorPosts
 You must be logged in to reply to this topic.