Mixed-Effects Models 详解:以反应时数据分析为例
1.概述
本文将以反应时数据分析为例详细介绍如何使用R语言进行混合效应模型(Mixed-Effects Models)分析(Freeman, Heathcote, Chalmers & Hockley, 2010)。这是对曾发布于“荷兰心理统计联盟”的使用R语言进行混合线性模型(mixed linear model) 分析代码及详解和“表哥有话讲”的基于R的混合线性模型的实现两篇文章的进一步扩充。
混合效应模型(Mixed-Effects Models),是指同时包含固定效应(fixed effects)及随机效应(random effects)的模型。其在不同领域有一些近义词,如多层线性模型(Hierarchical Linear Model)、线性混合模型(Linear Mixed Model)等。
2.实验和数据简介
本次分析中所用到的反应时原始数据来自Freeman等人的研究(Freeman et al., 2010),Freeman 等人采用2(task:lexical decision task vs. naming task)×2(neighborhood density:low vs. high)×2(stimulus frequency)×2(stimulus:word vs. nonword)的四因素混合实验设计(如图1所示),其中task为被试间变量,其余均为被试内变量。研究共招募了45名被试(词汇判断任务25名,命名任务20人)对300个单词(words)和300个非词(nonwords)的词汇判断反应时(lexical decision latencies)以及命名反应时(word naming latencies)。
图1 混合实验设计示意图
3
预处理
1
首先导入原始数据和所有分析所需的统计包,见下图。注意我们剔除掉了原始数据当中所有错误反应的试次。library("afex") # needed for mixed() and attaches lme4 automatically.
library("emmeans") # emmeans is needed for follow-up tests (and not anymore loaded automatically).
library("multcomp") # for advanced control for multiple testing/Type 1 errors.
library("dplyr") # for working with data frames
library("tidyr") # for transforming data frames from wide to long and the other way round.
library("lattice") # for plots
library("latticeExtra") # for combining lattice plots, etc.
lattice.options(default.theme = standard.theme(color = FALSE)) # black and white
lattice.options(default.args = list(as.table = TRUE)) # better ordering
data("fhch2010") # load
fhch <- droplevels(fhch2010[ fhch2010$correct,]) # remove errors
str(fhch2010) # structure of the data
## 'data.frame': 13222 obs. of 10 variables:
## $ id : Factor w/ 45 levels "N1","N12","N13",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ task : Factor w/ 2 levels "naming","lexdec": 1 1 1 1 1 1 1 1 1 1 ...
## $ stimulus : Factor w/ 2 levels "word","nonword": 1 1 1 2 2 1 2 2 1 2 ...
## $ density : Factor w/ 2 levels "low","high": 2 1 1 2 1 2 1 1 1 1 ...
## $ frequency: Factor w/ 2 levels "low","high": 1 2 2 2 2 2 1 2 1 2 ...
## $ length : Factor w/ 3 levels "4","5","6": 3 3 2 2 1 1 3 2 1 3 ...
## $ item : Factor w/ 600 levels "abide","acts",..: 363 121 202 525 580 135 42 368 227 141 ...
## $ rt : num 1.091 0.876 0.71 1.21 0.843 ...
## $ log_rt : num 0.0871 -0.1324 -0.3425 0.1906 -0.1708 ...
## $ correct : logi TRUE TRUE TRUE TRUE TRUE TRUE .
2
然后,检查一下每种条件下的被试数量和每位被试所接受的实验刺激数量。这个需要借助‘dplyr’包里面的功能实现。## are all participants in only one task?
fhch2010 %>% group_by(id) %>%
summarise(task = n_distinct(task)) %>%
as.data.frame() %>%
{.$task == 1} %>%
all()
## [1] TRUE
## participants per condition:
fhch2010 %>% group_by(id) %>%
summarise(task = first(task)) %>%
ungroup() %>%
group_by(task) %>%
summarise(n = n())
## # A tibble: 2 x 2
## task n
## <fct> <int>
## 1 naming 20
## 2 lexdec 25
## number of different items per participant:
fhch2010 %>% group_by(id, stimulus) %>%
summarise(items = n_distinct(item)) %>%
ungroup() %>%
group_by(stimulus) %>%
summarise(min = min(items),
max = max(items),
mean = mean(items))
## # A tibble: 2 x 4
## stimulus min max mean
## <fct> <int> <int> <dbl>
## 1 word 139 150 147.
## 2 nonword 134 150 146.
3
本次数据分析的因变量是反应时,通常来说,反应时都是呈正偏态分布的,因此,我们对其进行了log转换来使其更“正态”一些。下面的代码分别绘制了log_rt和rt的分布,从图上看,采用log转换确实让分布看起来更“正态”了一些。
如果原始数据呈正态分布,那直接使用原始数据进行分析。当原始数据非正态分布时,或者用density来看不是线性分布,需要进行log转换,以符合基本假设。
在很多时候,两种都可以分析一下,如果没有特别大差别,详细注明本次数据结果是原始数据,还是log后数据。因为不同分析方法,直接导致后续不同方面
译者注
译者注:但需要注意的是,跟传统统计分析(如ANOVA)数据分布需要符合正态分布的假设以外,mixed-effects models并没有数据分布一定要正态这一前提条件,相反,mixed-effects models的一个优点就在于它能handle不同分布的数据。因此,笔者这里内心存疑—我们真的需要对反应时数据进行log转换吗?
fhch_long <- fhch %>% gather("rt_type", "rt", rt, log_rt)
histogram(~rt|rt_type, fhch_long, breaks = "Scott", type = "density",
scale = list(x = list(relation = "free")))
4.描述统计
重新回顾一下我们的变量:被试间变量task (lexical decision task vs. naming task);被试内变量stimulus (word vs. nonword), density (low vs. high), frequency (low vs. high)。在进行正式分析之前,可以图示化我们的数据,看看数据结果跟我们内心的预期是否符合。由于所有的数据都嵌套(nested data structure)在被试水平(participant-level)和刺激水平(item-level)上,下面我们分别来查看两种水平上的数据结果趋势。
被试水平(participant-level)上的数据趋势,注意
agg_p <- fhch %>% group_by(id, task, stimulus, density, frequency) %>%
summarise(mean = mean(log_rt)) %>%
ungroup()
xyplot(mean ~ density:frequency|task+stimulus, agg_p, jitter.x = TRUE, pch = 20, alpha = 0.5,
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
tmp <- aggregate(y, by = list(x), mean)
panel.points(tmp$x, tmp$y, pch = 13, cex =1.5)
}) +
bwplot(mean ~ density:frequency|task+stimulus, agg_p, pch="|", do.out = FALSE)
刺激水平(item-level)上的数据趋势,注意
agg_i <- fhch %>% group_by(item, task, stimulus, density, frequency) %>%
summarise(mean = mean(log_rt)) %>%
ungroup()
xyplot(mean ~ density:frequency|task+stimulus, agg_i, jitter.x = TRUE, pch = 20, alpha = 0.2,
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
tmp <- aggregate(y, by = list(x), mean)
panel.points(tmp$x, tmp$y, pch = 13, cex =1.5)
}) +
bwplot(mean ~ density:frequency|task+stimulus, agg_i, pch="|", do.out = FALSE)
结果
结合上面的绘图我们初步可以得出以下结果:
①对于命名任务(naming task)来说,非词汇(nonword)反应时>词汇(word)反应时。
②就词汇(word)反应时而言,被试在词汇判断任务(lexical decision task)中的整体反应>在命名任务(naming task)中的整体反应。
③在命名任务(naming task)中,frequency对非词汇(nonword)反应时的主效应显著。
④就词汇(word)反应时而言,无论是命名任务(naming task)还是词汇判断任务(lexical decision task),似乎低频词汇所需要的反应时更长。
⑤从图上看,density在四种情况下的主效应均不显著。
5
模型建立
1
首先,从理念上来讲,mixed-effects models的建立分两种取向:
第一种是"maximal model approach"(Barr, Levy, Scheepers, & Tily, 2013),即一开始就放入所有能放入的random intercepts, random slopes和random correlations。如果模型有问题,再进行删减。
第二种则是一开始先跑一个比较简单一点的模型(如:只放random intercepts,不放random slopes and randomcorrelations),然后再次基础之上慢慢添加变量。本例将采取的是"maximal model approach"。
译者个人是倾向于‘maximal model approach’的。既然mixed-effects models最大的优势在于能同时考察fixed effects并控制random effects,那么理论上一开始我们就应该在模型中控制所有能控制的random effects才对。如果采用"simple model approach",可以想象一开始模型不会出现什么大的问题(如: convergence problem, singularityfit),然后再慢慢添加变量,直到模型出现问题就停止,这种做法本质上跟p-hacking没有什么区别,不是对待科学应有的态度。
1
第二步,我们来确定random intercepts和random slopes。由于所有的数据都嵌套(nested data structure)在被试水平(participant-level)和刺激水平(item-level)上,所以我们的random intercepts也就出来了,就是每位被试的id以及item。接下来是random slopes。在被试水平(participant-level)上共有三个变量可以纳入作为random slopes: stimulus,density, frequency; 在刺激水平(item-level)上则只有task这一个变量可以作为random slope。
1
第三步,需要注意有些时候跑模型会遇到convergence problem或者singularity fit。这个时候你的模型可能没有问题,但是R却无法确定此时此刻你的模型是否是最优解。
遇到这种情况,一种解决办法是稍微简化一下原有的"maximal model",通常的做法是去掉random correlations,只保留random intercepts andrandom slopes。照着这个思路,本例中我们一共会跑如下四个模型:
(1) 保留所有的random intercepts, randomslopes and random correlations—see model 1 in Table 1。
(2) 只去掉刺激水平(item-level)上的random correlation—see model 2 in Table 1。
(3) 只去掉被试水平(participant-level)上的random correlation—see model 3 in Table 1。
(4) 既去掉被试水平(participant-level)上又去掉刺激水平(item-level)上的random correlations—see model 4 in Table 1。
1
第四步,计算p-values。一般有三种方法:LRT(=LikelihoodRatio-Tests),
KR (= Kenward-Roger) F test,
the ‘Satterthwaite’ approximation。
前人研究表明 (Luke, 2017),这三种方法各有利弊,但相对来说目前效果最好的是KR F test,最不推荐LRT。在本例中我们将采用 ‘Scatterthwaite’ approximation和LRT。还有一种策略就是这三种方法都试一遍,如果能得到相同的分析结果,侧面也说明了结果的可靠性和稳健性。
1
6
数据结果
6.1
Satterthwaite Results
#----------------Satterthwaite Results------------
m1s <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency|id)+
(task|item), fhch, method = "S",
control = lmerControl(optCtrl = list(maxfun = 1e6)))
m2s <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency|id)+
(task||item), fhch, method = "S",
control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE)
m3s <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency||id)+
(task|item), fhch, method = "S",
control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE)
m4s <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency||id)+
(task||item), fhch, method = "S",
control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE)
load(system.file("extdata/", "freeman_models.rda", package = "afex"))
m1s
m2s
m3s
m4s
以下分别是四个模型的结果,注意如果我们遇到了"convergence problem",在模型结果部分会出现Warnings。
> Model 1
## Warning: lme4 reported (at least) the following warnings for 'full':
## * unable to evaluate scaled gradient
## * Model failed to converge: degenerate Hessian with 1 negative eigenvalues
## Mixed Model Anova Table (Type 3 tests, S-method)
##
## Model: log_rt ~ task * stimulus * density * frequency + (stimulus *
## Model: density * frequency | id) + (task | item)
## Data: fhch
## Effect df F p.value
## 1 task 1, NA 128.72 <NA>
## 2 stimulus 1, NA 117.02 <NA>
## 3 density 1, NA 1.20 <NA>
## 4 frequency 1, NA 1.30 <NA>
## 5 task:stimulus 1, NA 63.74 <NA>
## 6 task:density 1, NA 10.59 <NA>
## 7 stimulus:density 1, NA 0.39 <NA>
## 8 task:frequency 1, NA 55.81 <NA>
## 9 stimulus:frequency 1, NA 85.68 <NA>
## 10 density:frequency 1, NA 0.03 <NA>
## 11 task:stimulus:density 1, NA 12.87 <NA>
## 12 task:stimulus:frequency 1, NA 119.08 <NA>
## 13 task:density:frequency 1, NA 5.22 <NA>
## 14 stimulus:density:frequency 1, NA 2.45 <NA>
## 15 task:stimulus:density:frequency 1, NA 10.16 <NA>
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
> Model 2
## Warning: lme4 reported (at least) the following warnings for 'full':
## * unable to evaluate scaled gradient
## * Model failed to converge: degenerate Hessian with 1 negative eigenvalues
## Mixed Model Anova Table (Type 3 tests, S-method)
##
## Model: log_rt ~ task * stimulus * density * frequency + (stimulus *
## Model: density * frequency | id) + (task || item)
## Data: fhch
## Effect df F p.value
## 1 task 1, 43.54 13.69 *** .0006
## 2 stimulus 1, 51.06 150.61 *** <.0001
## 3 density 1, 192.25 0.31 .58
## 4 frequency 1, 72.78 0.52 .47
## 5 task:stimulus 1, 52.03 71.20 *** <.0001
## 6 task:density 1, 201.56 15.92 *** <.0001
## 7 stimulus:density 1, 287.88 1.06 .30
## 8 task:frequency 1, 76.76 80.05 *** <.0001
## 9 stimulus:frequency 1, 177.48 55.45 *** <.0001
## 10 density:frequency 1, 235.01 0.12 .73
## 11 task:stimulus:density 1, 300.16 14.21 *** .0002
## 12 task:stimulus:frequency 1, 190.61 109.33 *** <.0001
## 13 task:density:frequency 1, 248.09 5.46 * .02
## 14 stimulus:density:frequency 1, 104.15 3.72 + .06
## 15 task:stimulus:density:frequency 1, 111.32 10.07 ** .002
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
> Model 3
## Mixed Model Anova Table (Type 3 tests, S-method)
##
## Model: log_rt ~ task * stimulus * density * frequency + (stimulus *
## Model: density * frequency || id) + (task | item)
## Data: fhch
## Effect df F p.value
## 1 task 1, 43.52 13.68 *** .0006
## 2 stimulus 1, 50.57 151.33 *** <.0001
## 3 density 1, 584.49 0.36 .55
## 4 frequency 1, 70.26 0.56 .46
## 5 task:stimulus 1, 51.50 71.29 *** <.0001
## 6 task:density 1, 578.65 17.89 *** <.0001
## 7 stimulus:density 1, 584.50 1.19 .28
## 8 task:frequency 1, 74.11 82.66 *** <.0001
## 9 stimulus:frequency 1, 584.68 63.34 *** <.0001
## 10 density:frequency 1, 584.54 0.11 .74
## 11 task:stimulus:density 1, 578.66 14.86 *** .0001
## 12 task:stimulus:frequency 1, 578.82 124.10 *** <.0001
## 13 task:density:frequency 1, 578.69 5.92 * .02
## 14 stimulus:density:frequency 1, 88.40 4.16 * .04
## 15 task:stimulus:density:frequency 1, 94.42 10.60 ** .002
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
> Model 4
## Mixed Model Anova Table (Type 3 tests, S-method)
##
## Model: log_rt ~ task * stimulus * density * frequency + (stimulus *
## Model: density * frequency || id) + (task || item)
## Data: fhch
## Effect df F p.value
## 1 task 1, 43.54 13.67 *** .0006
## 2 stimulus 1, 51.05 150.79 *** <.0001
## 3 density 1, 587.35 0.35 .55
## 4 frequency 1, 71.90 0.53 .47
## 5 task:stimulus 1, 52.02 71.50 *** <.0001
## 6 task:density 1, 582.30 17.50 *** <.0001
## 7 stimulus:density 1, 587.35 1.13 .29
## 8 task:frequency 1, 75.90 81.49 *** <.0001
## 9 stimulus:frequency 1, 587.51 62.27 *** <.0001
## 10 density:frequency 1, 587.39 0.11 .74
## 11 task:stimulus:density 1, 582.31 14.61 *** .0001
## 12 task:stimulus:frequency 1, 582.45 121.11 *** <.0001
## 13 task:density:frequency 1, 582.34 5.84 * .02
## 14 stimulus:density:frequency 1, 90.80 3.90 + .05
## 15 task:stimulus:density:frequency 1, 97.08 10.52 ** .002
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
结果总结
总结一下以上的结果。
首先,Model 1和Model 2都出现了convergence problem,而Model 3和Model 4则没有。请注意convergence problem并不是说模型本身出现了问题,而是R并不确定当前的模型是否是拟合数据的最优解。
其次,除了Model 1无法提供p-values 以外,Model 2, Model 3和Model 4都得出了同样的分析结果—几乎完全相同的Effect值,df值,F值以及p-values。这说明我们之前遇到的convergence problem是一个false positive,我们可以不用再担心它了。
我们得到了2个显著的主效应: main effects for taskand stimulus。四个显著的二阶交互作用: ‘task: stimulus’, ‘task : density’, ‘task : frequency’, and ‘stimulus : frequency’。三个显著的三阶交互作用: ‘task: stimulus : density’, ‘task : stimulus : frequency’, and ‘task : density :frequency’。一个边缘显著的三阶交互作用: stimulus : density :frequency。一个显著的四阶交互作用: task : stimulus : density : frequency。
6.2
LRT Results1
使用LRT Methods跑模型得出了和上述一致的分析结果,这让我们对分析结果有了更充足的信心。具体见R代码。
#-------------------------------- LRT Results ------------------------------
m1lrt <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency|id)+
(task|item), fhch, method = "LRT",
control = lmerControl(optCtrl = list(maxfun = 1e6)))
m2lrt <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency|id)+
(task||item), fhch, method = "LRT",
control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE)
m3lrt <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency||id)+
(task|item), fhch, method = "LRT",
control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE)
m4lrt <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus*density*frequency||id)+
(task||item), fhch, method = "LRT",
control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE)
res_lrt <- cbind(nice_lrt[[1]], " " = " ",
nice_lrt[[4]][,-(1:2)])
colnames(res_lrt)[c(3,4,6,7)] <- paste0(
rep(c("m1_", "m4_"), each =2), colnames(res_lrt)[c(3,4)])
res_lrt
nice_lrt[[2]]
7
事后检验/多重比较
我们用R当中的‘emmeans’ package来进行事后比较。本例中将着重分析两个交互作用: ‘task : stimulus :frequency’ interaction and ‘task : stimulus :density : frequency’ interaction。
7.1
task : stimulus : frequency Interaction
emm_options(lmer.df = "asymptotic") # also possible: 'satterthwaite', 'kenward-roger'
emm_i1 <- emmeans(m2s, "frequency", by = c("stimulus", "task"))
## NOTE: Results may be misleading due to involvement in interactions
emm_i1
## stimulus = word, task = naming:
## frequency emmean SE df asymp.LCL asymp.UCL
## low -0.32326 0.0415 Inf -0.4047 -0.2418
## high -0.38193 0.0457 Inf -0.4715 -0.2923
##
## stimulus = nonword, task = naming:
## frequency emmean SE df asymp.LCL asymp.UCL
## low -0.14314 0.0458 Inf -0.2330 -0.0533
## high 0.06363 0.0497 Inf -0.0337 0.1610
##
## stimulus = word, task = lexdec:
## frequency emmean SE df asymp.LCL asymp.UCL
## low 0.02323 0.0373 Inf -0.0499 0.0964
## high -0.03996 0.0410 Inf -0.1203 0.0404
##
## stimulus = nonword, task = lexdec:
## frequency emmean SE df asymp.LCL asymp.UCL
## low 0.10406 0.0412 Inf 0.0234 0.1847
## high -0.00646 0.0445 Inf -0.0937 0.0808
##
## Results are averaged over the levels of: density
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
接下来我们再比较在所有的task:stimulus combination (word/naming vs.nonword/naming vs. word/lexdec vs. nonword/lexdec)上面frequency的两个水平(highvs. low)之间是否存在显著差异。代码及结果如下:
update(pairs(emm_i1), by = NULL, adjust = "holm")
## contrast stimulus task estimate SE df z.ratio p.value
## low - high word naming 0.0587 0.0162 Inf 3.620 0.0003
## low - high nonword naming -0.2068 0.0177 Inf -11.703 <.0001
## low - high word lexdec 0.0632 0.0152 Inf 4.165 0.0001
## low - high nonword lexdec 0.1105 0.0164 Inf 6.719 <.0001
##
## Results are averaged over the levels of: density
## P value adjustment: holm method for 4 tests
7.2
task : stimulus :density : frequency Interaction最后一个例子是四阶交互作用(task : stimulus : density : frequency),首先我们来看看density : frequency交互在task : stimulus交互上面的变化。代码及具体结果如下:
emm_i2 <- emmeans(m2s, c("density", "frequency"), by = c("stimulus", "task"))
con1 <- contrast(emm_i2, "trt.vs.ctrl1", by = c("frequency", "stimulus", "task")) # density
con2 <- contrast(con1, "trt.vs.ctrl1", by = c("contrast", "stimulus", "task"))
test(con2, joint = TRUE, by = c("stimulus", "task"))
## stimulus task df1 df2 F.ratio p.value
## word naming 1 Inf 0.105 0.7464
## nonword naming 1 Inf 2.537 0.1112
## word lexdec 1 Inf 1.790 0.1809
## nonword lexdec 1 Inf 16.198 0.0001
接下来我们进行两两比较,代码和结果如下:
emm_i2
## stimulus = word, task = naming:
## density frequency emmean SE df asymp.LCL asymp.UCL
## low low -0.31384 0.0448 Inf -0.4016 -0.22606
## high low -0.33268 0.0408 Inf -0.4127 -0.25269
## low high -0.37741 0.0466 Inf -0.4688 -0.28601
## high high -0.38645 0.0472 Inf -0.4790 -0.29390
##
## stimulus = nonword, task = naming:
## density frequency emmean SE df asymp.LCL asymp.UCL
## low low -0.10399 0.0500 Inf -0.2019 -0.00606
## high low -0.18230 0.0442 Inf -0.2688 -0.09575
## low high 0.07823 0.0520 Inf -0.0237 0.18019
## high high 0.04902 0.0495 Inf -0.0479 0.14599
##
## stimulus = word, task = lexdec:
## density frequency emmean SE df asymp.LCL asymp.UCL
## low low 0.03714 0.0404 Inf -0.0420 0.11624
## high low 0.00932 0.0368 Inf -0.0628 0.08144
## low high -0.04513 0.0419 Inf -0.1273 0.03705
## high high -0.03479 0.0424 Inf -0.1180 0.04837
##
## stimulus = nonword, task = lexdec:
## density frequency emmean SE df asymp.LCL asymp.UCL
## low low 0.04480 0.0449 Inf -0.0432 0.13282
## high low 0.16331 0.0399 Inf 0.0852 0.24144
## low high -0.00728 0.0467 Inf -0.0988 0.08425
## high high -0.00563 0.0445 Inf -0.0928 0.08151
##
## Degrees-of-freedom method: asymptotic
## Confidence level used: 0.95
# desired contrats:
des_c <- list(
ll_hl = c(1, -1, 0, 0),
lh_hh = c(0, 0, 1, -1)
)
update(contrast(emm_i2, des_c), by = NULL, adjust = "holm")
## contrast stimulus task estimate SE df z.ratio p.value
## ll_hl word naming 0.01883 0.0210 Inf 0.898 1.0000
## lh_hh word naming 0.00904 0.0212 Inf 0.427 1.0000
## ll_hl nonword naming 0.07831 0.0220 Inf 3.554 0.0027
## lh_hh nonword naming 0.02921 0.0211 Inf 1.385 0.9763
## ll_hl word lexdec 0.02782 0.0199 Inf 1.396 0.9763
## lh_hh word lexdec -0.01034 0.0199 Inf -0.520 1.0000
## ll_hl nonword lexdec -0.11852 0.0209 Inf -5.670 <.0001
## lh_hh nonword lexdec -0.00164 0.0198 Inf -0.083 1.0000
##
## P value adjustment: holm method for 8 tests
8 参考文献
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278. https://doi.org/10.1016/j.jml.2012.11.001
Bretz, F., Hothorn, T., & Westfall, P. H. (2011). Multiple comparisons using R. Boca Raton, FL: CRC Press. https://CRAN.R-project.org/package=multcomp
Freeman, E., Heathcote, A., Chalmers, K., & Hockley, W. (2010). Item effects in recognition memory for words. Journal of Memory and Language, 62(1), 1-18. https://doi.org/10.1016/j.jml.2009.09.004
Lenth, R. (2017). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 0.9.1. https://CRAN.R-project.org/package=emmeans
Maxwell, S. E., & Delaney, H. D. (2004). Designing experiments and analyzing data: a model-comparisons perspective. Mahwah, N.J.: Lawrence Erlbaum Associates.
作 者:Freeman, Heathcote, Chalmers & Hockley
译 者:Ted
校 对:赵加伟、胡传鹏、方圆
排 版:念靖晴
Open Science Club
Make Psychological Science Open & Trustworthy!