Why contr.sum for random effects grouping factors?

  • #223

    statmerkur
    Participant

    It’s clear to me why one should use orthogonal contrasts for categorical predictors in mixed models when one wants to estimate main effects (and interactions between them). But why does mixed() use sum coding (i.e., contr.sum) for the random-effects grouping factors?

  • #225

    henrik
    Keymaster

    mixed simply uses contr.sum for all categorical covariates by default. I agree that for the random effects this can lead to an awkward parameterization. However, it is not immediately clear to me how one could program it differently. So the reason is simply convenience and the lack of an apparent alternative. If you have a specific alternative in mind, you can set check_contrasts = FALSE and set the contrasts in the desired way yourself. I honestly do not see the benefit of offering anything else via the mixed interface. But feel free to convince me otherwise.
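
    For instance (a minimal sketch; contr.helmert is just one arbitrary alternative):

    library(afex)
    data("Machines", package = "MEMSS")
    # set whatever coding you prefer ...
    contrasts(Machines$Machine) <- contr.helmert(levels(Machines$Machine))
    # ... and keep mixed() from overriding it
    m_custom <- mixed(score ~ Machine + (Machine | Worker), Machines,
                      check_contrasts = FALSE)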

  • #227

    statmerkur
    Participant

    I was just curious whether there was a specific reason for that. So, would you agree that using orthogonal contrasts for categorical covariates and, say, treatment coding for the random-effects grouping factors is equivalent to using orthogonal contrasts for both the categorical covariates and the random-effects grouping factors?

  • #228

    henrik
    Keymaster

    No, I do not agree with that. To be honest, I do not fully understand where this purported equivalence should come from.

    I agree that random slopes with a sum-to-zero coding can lead to somewhat awkward parameterizations. Why should the deviations from the grand mean be normally distributed across participants? However, other coding schemes do not necessarily have better properties. So I am not sure what could be gained by using different coding schemes for the fixed effects and the random slopes.

    But maybe there is also a misunderstanding here. The random-effects grouping factors are coded with indicator (dummy) coding, or what is called one-hot encoding in machine learning: each level of the grouping factor gets its own parameter that is 1 for this level and 0 for all others. Thus, there is no intercept, and a random intercept is simply estimated for each level individually.
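
    You can see this directly by inspecting the random-effects model matrix without fitting anything (a quick sketch using lme4's lFormula and the Machines data from MEMSS):

    library(lme4)
    data("Machines", package = "MEMSS")
    # reTrms$Zt is the transposed random-effects model matrix
    lf <- lFormula(score ~ Machine + (1 | Worker), Machines)
    dim(lf$reTrms$Zt)
    # should be 6 x 54: one parameter (row) per Worker level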

    Is this what you were after?

  • #232

    statmerkur
    Participant

    There seems to be no difference between models with different coding schemes for the random-effects grouping factors, i.e., m1 = m2 and m3a = m4a. Hence I don’t understand why afex sets contr.sum for the random-effects grouping factors (Worker in the example below).

    Besides that, AFAIU, m3b and m4b are models whose random slopes are coded differently (treatment coding vs. sum coding), yet they seem to estimate the same random effects, which in turn are the same as the random effects estimated by m5 (which also suppresses the fixed intercept).

    Why is that?

    library(afex)
    data("Machines", package = "MEMSS")

    m1 <- mixed(score ~ Machine + (Machine | Worker), Machines)
    contrasts(Machines$Machine) <- contr.sum(length(levels(Machines$Machine)))
    m2 <- mixed(score ~ Machine + (Machine | Worker), Machines, check_contrasts = FALSE)
    m1$full_model # Machine sum coded + Worker sum coded
    m2$full_model # Machine sum coded + Worker treatment coded

    contrasts(Machines$Machine) <- contr.treatment(length(levels(Machines$Machine)))
    m3a <- mixed(score ~ Machine + (Machine | Worker), Machines, check_contrasts = FALSE)
    m3b <- mixed(score ~ Machine + (0 + Machine | Worker), Machines, check_contrasts = FALSE)
    contrasts(Machines$Worker) <- contr.sum(length(levels(Machines$Worker)))
    m4a <- mixed(score ~ Machine + (Machine | Worker), Machines, check_contrasts = FALSE)
    m4b <- mixed(score ~ Machine + (0 + Machine | Worker), Machines)
    m5  <- mixed(score ~ 0 + Machine + (0 + Machine | Worker), Machines, check_contrasts = FALSE)
    m3a$full_model # Machine treatment coded + Worker treatment coded
    m4a$full_model # Machine treatment coded + Worker sum coded

    m3b$full_model # Machine treatment coded + Worker treatment coded + random intercept suppressed
    m4b$full_model # Machine sum coded + Worker sum coded + random intercept suppressed
    m5$full_model  # Machine treatment coded + Worker sum coded + fixed and random intercept suppressed
  • #234

    henrik
    Keymaster

    What you observe and describe is indeed the case (your code provides the evidence), but it is not directly related to afex; it follows from how lme4 and R work in general. Let me explain.

    First, what I said in my last response holds for the random-effects grouping factors: they will always be encoded with one parameter for each level. Thus, in the example the coding for Worker is irrelevant as long as you estimate random intercepts for it; lme4 will always estimate one idiosyncratic random intercept for each level of Worker. Hence the equivalence of m1 and m2.
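
    You can verify this on your own objects (a quick check; both comparisons should return TRUE):

    # identical fits despite the different coding of Worker
    all.equal(logLik(m1$full_model), logLik(m2$full_model))
    all.equal(ranef(m1$full_model)$Worker, ranef(m2$full_model)$Worker)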

    Second, suppressing the intercept for categorical covariates in R does something perhaps unexpected: it then estimates one parameter per level, but does not actually reduce the number of estimated parameters. See:

    ncol(model.matrix(~Machine, Machines))
    # [1] 3
    ncol(model.matrix(~0+Machine, Machines))
    # [1] 3
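
    The column names make this concrete (assuming the default treatment coding in a fresh session; the names differ if other contrasts are set):

    colnames(model.matrix(~Machine, Machines))
    # e.g. "(Intercept)" "MachineB" "MachineC"
    colnames(model.matrix(~0+Machine, Machines))
    # "MachineA" "MachineB" "MachineC" -- one indicator per level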

    This is different from the case with numerical covariates (note that the following code is statistically nonsensical):

    ncol(model.matrix(~as.numeric(Machine), Machines))
    # [1] 2
    ncol(model.matrix(~0+as.numeric(Machine), Machines))
    # [1] 1

    So when you use (0 + Machine|Worker), the coding scheme is again irrelevant because, again, one parameter is estimated per level.
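
    Again, this can be checked without fitting (a sketch via lme4's lFormula; reTrms$cnms lists the random-effect columns per grouping factor):

    library(lme4)
    lf0 <- lFormula(score ~ Machine + (0 + Machine | Worker), Machines)
    lf0$reTrms$cnms$Worker
    # "MachineA" "MachineB" "MachineC" -- one parameter per level,
    # whatever contrasts are set for Machine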

    Hope that clears everything up.

  • #235

    statmerkur
    Participant

    Thanks, that cleared things up for me.
    What I still don’t understand is in which case the coding of the random-effects grouping factors does make a difference. Could you please give an (R code) example of such a situation?

  • #236

    henrik
    Keymaster

    Hmm, I do not see a situation where it would matter. As I said, the grouping factors will always be encoded with one parameter per level (i.e., one-hot encoding).

  • #237

    statmerkur
    Participant

    OK, so mixed converts treatment-coded random-effects grouping factors to sum-coded factors (via contr.sum) just by convention?

  • #238

    henrik
    Keymaster

    Exactly. mixed transforms all categorical covariates that are part of the formula to contr.sum. I thought it might lead to bugs if I did this only selectively (e.g., if I tried to detect which variables are the grouping factors).
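
    For example (a sketch; the exact wording of the message may differ across afex versions):

    data("Machines", package = "MEMSS")
    contrasts(Machines$Machine) <- contr.treatment(levels(Machines$Machine))
    m <- mixed(score ~ Machine + (Machine | Worker), Machines)
    # mixed() emits a message that contrasts were reset, along the lines of:
    # "Contrasts set to contr.sum for the following variables: Machine, Worker"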
