library(brms)
library(ggplot2)
library(knitr)
library(lme4)
library(magrittr)
library(mice)
library(tidyverse)
Planned analyses
In what follows, we outline how we plan to analyze the data. Depending on whether assumptions are violated or models fail to converge, we will likely need to adjust the model further, which is why we cannot preregister the exact model we will ultimately run. The general approach, however, will be as follows.
Load data
d <- read.csv("data/data_simulated.csv")
Data wrangling
d <- d %>%
  mutate(persistence = persistence - .5,
         anonymity = anonymity - .5,
         topic = factor(topic, labels = c("corona", "gender", "other")),
         repetition = as.factor(repetition),
         group = as.factor(group),
         # make expressions as positive integers
         expressions = abs(min(expressions)) + expressions,
         expressions = as.integer(expressions)
  )
# make expressions zero-inflated
d[sample(c(1:nrow(d)), 100), ]$expressions <- 0
# introduce NAs
d[sample(c(1:nrow(d)), 30), ]$expressions <- NA
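To make the wrangling steps above easier to follow, here is a minimal base R sketch on toy values (not the study data). The vector `x`, the data frame `toy`, the seed, and the sample sizes are assumptions chosen only for illustration.

```r
set.seed(1)  # assumed seed, used only to make this sketch reproducible

# Toy values standing in for the simulated outcome (not from the study data)
x <- c(-3.2, -1.0, 0.4, 2.7, 5.1)

# Shift by the absolute minimum, then truncate to integers.
# Note: the minimum becomes 0 only if min(x) is negative, and
# as.integer() truncates toward zero rather than rounding.
shifted <- as.integer(abs(min(x)) + x)
print(shifted)

# Toy count data to illustrate the zero-inflation and missingness steps
toy <- data.frame(expressions = rpois(20, lambda = 3))

# Force 5 randomly chosen rows to zero
zero_rows <- sample(seq_len(nrow(toy)), 5)
toy$expressions[zero_rows] <- 0L

# Set 3 randomly chosen rows to NA; this second draw can overlap the
# first, so fewer than 5 of the forced zeros may remain afterwards
na_rows <- sample(seq_len(nrow(toy)), 3)
toy$expressions[na_rows] <- NA

print(table(toy$expressions, useNA = "ifany"))
```

Because the two `sample()` calls are independent, the number of forced zeros that survive depends on the draw; setting a seed before these steps would make the simulated data reproducible.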
Data imputation
d <- mice(d, m = 100, print = FALSE)
Analyze data
Bayesian mixed effects modeling
Fixed effects
fit_fe <-
  brm_multiple(
    expressions ~ 1 + persistence * anonymity +
      (1 | topic) +
      (1 | group),
    data = d,
    silent = 2,
    refresh = 0,
    chains = 2,
    family = zero_inflated_poisson("log")
  )
summary(fit_fe)
Family: zero_inflated_poisson
Links: mu = log; zi = identity
Formula: expressions ~ 1 + persistence * anonymity + (1 | topic) + (1 | group)
Data: d (Number of observations: 960)
Draws: 200 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 200000
Multilevel Hyperparameters:
~group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.03 0.02 0.00 0.09 1.01 13744 40109
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.09 0.14 0.00 0.44 1.05 2245 813
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept 0.81 0.07 0.67 0.94 1.06 2092
persistence -0.10 0.05 -0.19 -0.01 1.01 9585
anonymity 0.12 0.05 0.03 0.21 1.01 10175
persistence:anonymity 0.12 0.09 -0.06 0.31 1.01 8376
Tail_ESS
Intercept 1220
persistence 38273
anonymity 63450
persistence:anonymity 25836
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi 0.03 0.01 0.01 0.06 1.02 4681 5829
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
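Because the model uses a log link, the coefficients above are on the log scale. The following sketch shows one way to translate the posterior means into counts and rate ratios. It assumes that `persistence` and `anonymity` were coded 0/1 before centering (the source only shows the subtraction of .5), and it plugs in posterior means, which only approximates the posterior of the transformed quantities.

```r
# Posterior means copied from the summary(fit_fe) output above
b_intercept   <- 0.81
b_persistence <- -0.10
b_anonymity   <- 0.12
zi            <- 0.03

# With predictors centred at -0.5 / +0.5 (assumed 0/1 coding before
# centring), the intercept is the log count of the Poisson component
# averaged over conditions, for a typical group and topic.
mu_poisson <- exp(b_intercept)

# Mean of a zero-inflated Poisson: (1 - zi) * mu
mu_overall <- (1 - zi) * mu_poisson

# A one-unit change compares the two conditions of a predictor,
# averaged over the levels of the other predictor.
rate_ratio_persistence <- exp(b_persistence)
rate_ratio_anonymity   <- exp(b_anonymity)

print(round(c(mu_poisson = mu_poisson,
              mu_overall = mu_overall,
              rr_persistence = rate_ratio_persistence,
              rr_anonymity = rate_ratio_anonymity), 2))
```

For the final analysis, these transformations would preferably be applied to the posterior draws rather than to the summary values, so that the credible intervals carry over correctly.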
Random effects
fit_re <-
  brm_multiple(
    expressions ~ 1 + persistence * anonymity +
      (1 + persistence * anonymity | topic) +
      (1 + persistence * anonymity | group),
    data = d,
    silent = 2,
    refresh = 0,
    chains = 2,
    family = zero_inflated_poisson("log")
  )
summary(fit_re)
Family: zero_inflated_poisson
Links: mu = log; zi = identity
Formula: expressions ~ 1 + persistence * anonymity + (1 + persistence * anonymity | topic) + (1 + persistence * anonymity | group)
Data: d (Number of observations: 960)
Draws: 200 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 200000
Multilevel Hyperparameters:
~group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI
sd(Intercept) 0.03 0.02 0.00 0.09
sd(persistence) 0.06 0.05 0.00 0.18
sd(anonymity) 0.06 0.05 0.00 0.18
sd(persistence:anonymity) 0.12 0.10 0.00 0.35
cor(Intercept,persistence) -0.03 0.45 -0.83 0.80
cor(Intercept,anonymity) -0.01 0.45 -0.82 0.81
cor(persistence,anonymity) 0.03 0.45 -0.81 0.82
cor(Intercept,persistence:anonymity) 0.03 0.45 -0.80 0.83
cor(persistence,persistence:anonymity) -0.02 0.45 -0.82 0.80
cor(anonymity,persistence:anonymity) -0.02 0.45 -0.82 0.80
Rhat Bulk_ESS Tail_ESS
sd(Intercept) 1.01 17536 44380
sd(persistence) 1.01 19179 49567
sd(anonymity) 1.01 27423 25849
sd(persistence:anonymity) 1.01 21086 51097
cor(Intercept,persistence) 1.01 17439 43783
cor(Intercept,anonymity) 1.01 14408 22168
cor(persistence,anonymity) 1.01 9238 3278
cor(Intercept,persistence:anonymity) 1.01 14660 16370
cor(persistence,persistence:anonymity) 1.01 29892 95587
cor(anonymity,persistence:anonymity) 1.01 11672 35895
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI
sd(Intercept) 0.12 0.17 0.00 0.62
sd(persistence) 0.28 0.35 0.01 1.27
sd(anonymity) 0.24 0.31 0.01 1.17
sd(persistence:anonymity) 0.42 0.59 0.01 2.08
cor(Intercept,persistence) -0.03 0.46 -0.84 0.82
cor(Intercept,anonymity) 0.02 0.46 -0.82 0.84
cor(persistence,anonymity) -0.05 0.47 -0.86 0.81
cor(Intercept,persistence:anonymity) 0.02 0.46 -0.82 0.84
cor(persistence,persistence:anonymity) -0.04 0.46 -0.85 0.81
cor(anonymity,persistence:anonymity) 0.02 0.46 -0.82 0.85
Rhat Bulk_ESS Tail_ESS
sd(Intercept) 1.02 6204 2206
sd(persistence) 1.01 10929 7453
sd(anonymity) 1.01 10454 6697
sd(persistence:anonymity) 1.04 2876 1164
cor(Intercept,persistence) 1.01 21790 18450
cor(Intercept,anonymity) 1.01 21695 57354
cor(persistence,anonymity) 1.02 7616 4088
cor(Intercept,persistence:anonymity) 1.01 21972 35474
cor(persistence,persistence:anonymity) 1.01 25380 54882
cor(anonymity,persistence:anonymity) 1.01 16162 32143
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept 0.82 0.10 0.62 1.04 1.04 3243
persistence -0.10 0.21 -0.56 0.36 1.01 8330
anonymity 0.12 0.18 -0.29 0.52 1.01 10602
persistence:anonymity 0.15 0.44 -0.54 1.00 1.04 2758
Tail_ESS
Intercept 1639
persistence 7268
anonymity 6515
persistence:anonymity 1165
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi 0.03 0.01 0.01 0.06 1.03 4211 2437
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
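Since non-convergence is one reason we may have to adjust the model, a simple screening step could look like the sketch below. It uses the Rhat values printed in the two summaries above. The helper `flag_rhat()` is hypothetical, and the cutoff of 1.05 is an assumption; stricter cutoffs are also in use.

```r
# Rhat values copied from the "Regression Coefficients" and zi rows of the
# two summaries above (pooled over all imputed data sets)
rhat_fe <- c(Intercept = 1.06, persistence = 1.01, anonymity = 1.01,
             "persistence:anonymity" = 1.01, zi = 1.02)
rhat_re <- c(Intercept = 1.04, persistence = 1.01, anonymity = 1.01,
             "persistence:anonymity" = 1.04, zi = 1.03)

# Hypothetical helper: names of the parameters whose Rhat exceeds a cutoff
flag_rhat <- function(rhat, cutoff = 1.05) {
  names(rhat)[rhat > cutoff]
}

flagged_fe <- flag_rhat(rhat_fe)
flagged_re <- flag_rhat(rhat_re)

print(flagged_fe)
print(flagged_re)

# Caveat: with brm_multiple(), the pooled Rhat mixes chains fitted to
# different imputed data sets, so it can look worse than the convergence of
# each individual fit. As we understand the brms documentation, the
# per-imputation values are stored in the fitted object (e.g. fit_fe$rhats);
# this is not run here.
```

A flagged parameter in the pooled summary is therefore a prompt to inspect the individual imputed-data fits, not necessarily evidence that sampling failed.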