Home › forums › Mixed Models › PB pvalues for a poisson glmm with an offset?
This topic contains 4 replies, has 2 voices, and was last updated by CMac 1 year, 2 months ago.

AuthorPosts

December 7, 2017 at 06:03 UTC #152
Hello! I would like to use the mixed function to calculate pvalues using the parametric bootstrap (PB) method on a GLMM, specifically a loglink Poisson model with an offset. Is this possible? I have included an offset argument in the mixed() function as one would when using glmer(), but I don’t think the mixed function likes it; I get an ‘object of type builtin is not subsettable’ error.
This is the script I am using:
mod1<mixed((index*abundance)~factor1*factor2+(1subject), data=exp, family=poisson(link=”log”), offset=(log(abundance), method=”PB”)It works fine as:
mod1<glmer((index*abundance)~factor1*factor2+(1subject), data=exp, family=poisson(link=”log”), offset=(log(abundance), REML=F)… but it would be great to have some pvalues.
Please advise if I should be using a different formula to use an offset with mixed(). Many thanks in advance! 
December 7, 2017 at 10:57 UTC #153
Can you provide some example data? Without being able to reproduce the problem I am unable to fix it.
The example data can be as trivial as possible. See the following link for advise on how to create it: https://stackoverflow.com/q/5963269/289572
 This reply was modified 1 year, 2 months ago by henrik.

December 8, 2017 at 21:40 UTC #157
Whoops, sorry, whilst creating the example data I identified the cause of the ‘type builtin not subsettable’ was related to my rookie mistake naming my dataframe ‘exp’ (doh!).
However things still don’t seem right: the mixed function now runs, but is taking an extremely long time to run compared with glmer, and hasn’t completed calculating Pvalues after several hours on my full dataset. It also produces a lot of nonconvergence warnings.
Here’s an example data subset and code. The mixed function does eventually complete calculations on this data, but still comes up with the nonconvergence warnings. It would be great if you could let me know if it’s working as expected or not. I hope I haven’t done anything else silly (I have checked everything I can think of). Thanks very much!!
#R version 3.4.3 (20171130)  "KiteEating Tree" #packages library(lme4) library(afex) #example data zz<"factor1 factor2 subject abundance max A a 1 659 658 A a 2 1408 1407 A b 1 826 789 A b 2 1348 1271 B a 3 242 242 B a 4 487 480 B b 3 605 585 B b 4 618 599 C a 5 382 346 C a 6 613 598 C b 5 370 336 C b 6 1002 963" df < read.table(text=zz, header = TRUE) #factors df$factor1<as.factor(df$factor1) df$factor2<as.factor(df$factor2) df$subject<as.factor(df$subject) #create index variable df$index<df$max/df$abundance #lme4::glmer mod1<glmer((index*abundance)~factor1*factor2+(1subject), data=df, family=poisson(link="log"), offset=(log(df$abundance))) #afex::mixed mod2<mixed((index*abundance)~factor1*factor2+(1subject), data=df, family=poisson(link="log"), offset=(log(df$abundance)), method="PB")
 This reply was modified 1 year, 2 months ago by henrik. Reason: nicer formatting of code

December 10, 2017 at 17:58 UTC #160
As the help page says, ‘
method = "PB"
calculates pvalues using parametric bootstrap’. This means, it calculates a reference distribution of chi^2square values assuming the null model is true, by generating data from the full model and then fitting both models (i.e., full and restricted model) to it. For each of those fits the chisquare value is obtained. The pvalues is then the proportion of sampled chisquare values larger than the actual chisquare value based on the real data (it actually is the number of samples larger plus one divided by the number of samples). Because of these repeated fits you also get the repeated nonconvergence warnings.Per default it uses 1000 samples for this per test. This means, it has to be considerably slower than just fitting in
glmer
. In fact, roughly 1000 times the number of effects in your model slower. You can change the number of parametric bootstrap samples via theargs_test
andnsim
, e.g.,args_test = list(nsim = 500)
. However, values lower than 500 seem suspicious. Note that you can also distribute the estimation using multiple cores (see examples ofmixed
help).If you do not have the time to wait, you can also use
method = "LRT"
. Note however that for this you should have enough levels for each of your randomeffects grouping factors (more than 30, better more than 50). Otherwise the results might be anticonservative. 
December 10, 2017 at 20:00 UTC #162
Thanks Henrik. I will just be more patient then 🙂
I was just concerned because I am running some lmers on the same dataset (different response variables) with mixed using method=”PB” and they only take a few minutes to run, compared with the hours in which the glmer had not completed its calculations. The dataset is not that big, only 144 samples.Thank you very much for your help and explanations!

AuthorPosts
You must be logged in to reply to this topic.