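Mantel tests with the R package vegan
A Mantel test takes two distance (dissimilarity) matrices computed over the same set of sites and asks whether they are correlated, or "covary": for example, whether differences in community composition between sites are related to differences in temperature between sites, or to the physical distance between them. Such tests help address whether the environment is "selecting" for the microbial community, or whether there is a strong distance-decay pattern that suggests dispersal limitation. These are often important questions in biogeographic studies.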
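To make the idea concrete, here is a minimal base-R sketch of what a Mantel test does: correlate the pairwise distances of two matrices, then shuffle the site labels of one matrix many times to build a null distribution. The helper name `mantel_sketch` and the toy data are assumptions made for illustration; this is not vegan's implementation, and `vegan::mantel` should be used for real analyses.

```r
# Hand-rolled Mantel test, for illustration only.
# mantel_sketch is a hypothetical helper, not part of any package.
mantel_sketch <- function(d1, d2, permutations = 999, method = "pearson") {
  m1 <- as.matrix(d1)
  m2 <- as.matrix(d2)
  lower <- lower.tri(m1)                 # count each pair of sites once
  r_obs <- cor(m1[lower], m2[lower], method = method)
  n <- nrow(m1)
  r_perm <- replicate(permutations, {
    idx <- sample(n)                     # shuffle site labels of one matrix
    cor(m1[idx, idx][lower], m2[lower], method = method)
  })
  # One-sided p value; the observed statistic counts as one permutation
  p <- (sum(r_perm >= r_obs) + 1) / (permutations + 1)
  list(statistic = r_obs, signif = p)
}

set.seed(1)
# Simulated toy data: 12 sites along an invented gradient
gradient <- runif(12, 0, 10)
community <- cbind(sp1 = gradient + rnorm(12, sd = 0.5),
                   sp2 = 10 - gradient + rnorm(12, sd = 0.5))
res <- mantel_sketch(dist(community), dist(gradient))
res
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling site labels keeps the structure of each matrix intact while breaking the link between the two.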
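Mantel tests in the vegan package
This post again uses community analysis as the example and gives a brief introduction to Mantel tests with the R package vegan.
Suppose we have a dataset with the following layout. Column 1 holds the site name, columns 2-5 hold the environmental parameters measured at each site (salinity, temperature, and so on), columns 6-7 hold each site's latitude and longitude, and columns 8 onward hold the species and their abundances at each site. With Mantel tests we want to see whether, for this dataset, the main factor behind changes in species composition is "selection" by the environment or dispersal limitation due to geography.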
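For readers who want to try the code without the original file, the sketch below builds a small simulated data frame with the same column layout. Every value is randomly generated, the ranges and the `OTU_1` to `OTU_3` names are invented for illustration, and `df_toy` is a hypothetical name; results from it say nothing about the real dataset.

```r
set.seed(42)
n <- 10
# Columns 1-7: site name, 4 environmental variables, latitude, longitude
df_toy <- data.frame(
  Station     = paste0("S", seq_len(n)),
  Salinity    = runif(n, 30, 36),      # arbitrary ranges, for layout only
  Temperature = runif(n, 0, 25),
  Oxygen      = runif(n, 200, 350),
  Nitrate     = runif(n, 0, 20),
  Latitude    = runif(n, -60, 60),
  Longitude   = runif(n, -180, 180)
)
# Columns 8 onward: three hypothetical taxa as relative abundances (%)
otu <- matrix(runif(n * 3), nrow = n,
              dimnames = list(NULL, paste0("OTU_", 1:3)))
otu <- 100 * otu / rowSums(otu)
df_toy <- cbind(df_toy, otu)
str(df_toy)
```

With this layout, `df_toy[, 2:5]` selects the environmental variables and `df_toy[, 8:ncol(df_toy)]` selects the abundance columns.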
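Load the R package. As described above, first compute the two sets of between-site distances, then run the Mantel tests.
library(vegan)
# Read the dataset described above
df <- read.csv('Your_OTU_table.csv', header = TRUE)
## Compute distances
# From the species abundance data, compute Bray-Curtis dissimilarities between sites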
abund <- df[ ,8:ncol(df)]
dist.abund <- vegdist(abund, method = 'bray')
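# From the environmental measurements, compute Euclidean distances between sites
# Only temperature is used here, to focus on how species composition relates to temperature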
temp <- df$Temperature
dist.temp <- dist(temp, method = 'euclidean')
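# To look at the combined effect of several environmental variables, select a subset of them and compute Euclidean distances between sites
# Here all 4 environmental variables are used; they must be standardized first to remove differences in units and scale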
env <- df[ ,2:5]
scale.env <- scale(env, center = TRUE, scale = TRUE)
dist.env <- dist(scale.env, method = 'euclidean')
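# From longitude and latitude, compute the actual geographic (great-circle) distances between sites, in meters
geo <- data.frame(df$Longitude, df$Latitude)
d.geo <- geosphere::distm(geo, fun = geosphere::distHaversine) # requires the geosphere package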
dist.geo <- as.dist(d.geo)
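## Run the Mantel tests (see ?mantel for details); 3 examples follow
# Species abundance vs. temperature, using Spearman's rank correlation, with significance assessed by 9999 permutations (Mantel tests obtain p values by random permutation)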
abund_temp <- mantel(dist.abund, dist.temp, method = 'spearman', permutations = 9999, na.rm = TRUE)
abund_temp
# Species abundance vs. geographic distance, using Spearman's rank correlation and 9999 permutations
abund_geo <- mantel(dist.abund, dist.geo, method = 'spearman', permutations = 9999, na.rm = TRUE)
abund_geo
# Species abundance vs. the combination of 4 environmental variables, using Spearman's rank correlation and 9999 permutations
abund_env <- mantel(dist.abund, dist.env, method = 'spearman', permutations = 9999, na.rm = TRUE)
abund_env
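The standardization step applied to the 4 environmental variables above matters because Euclidean distance is dominated by whichever variable has the largest numeric range. The sketch below illustrates this with three invented measurements (`env_toy` is hypothetical and not taken from the dataset).

```r
# Hypothetical measurements for three sites (values invented for illustration)
env_toy <- data.frame(Temperature = c(2, 12, 22),      # spans 20 units
                      Oxygen      = c(300, 210, 305))  # spans 95 units

d_raw    <- as.matrix(dist(env_toy))         # raw values
d_scaled <- as.matrix(dist(scale(env_toy)))  # centred, unit variance

# Raw: sites 1 and 3 look far more alike than sites 1 and 2,
# because the Oxygen column swamps the 20-unit temperature difference
round(d_raw, 1)
# Scaled: the three pairwise distances are of similar size,
# so temperature and oxygen both contribute
round(d_scaled, 2)
```

Whether to standardize is an analysis decision; for a single variable such as temperature, scaling changes the distances only by a constant factor and does not affect a Mantel correlation.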
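The distance matrix based on species abundance is strongly correlated with the distance matrix based on temperature (Mantel statistic r: 0.677, p value = 1e-04). In other words, as sites become more different in temperature, they also become more different in species composition.
# Species abundance vs. temperature
> abund_temp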
Mantel statistic based on Spearman's rank correlation rho
Call:
mantel(xdis = dist.abund, ydis = dist.temp, method = "spearman", permutations = 9999, na.rm = TRUE)
Mantel statistic r: 0.677
Significance: 1e-04
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.148 0.198 0.246 0.290
Permutation: free
Number of permutations: 9999
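The significance value of 1e-04 reported above is the smallest p value that 9999 permutations can produce. As a short worked sketch, assuming the usual permutation p value in which the observed statistic is counted as one of the permutations:

```r
permutations <- 9999

# Smallest attainable p value: no permuted statistic reaches the observed one
p_min <- (0 + 1) / (permutations + 1)
p_min

# Conversely, a reported p value of 0.0525 would correspond to this many
# permuted statistics being at least as large as the observed one
n_ge <- round(0.0525 * (permutations + 1)) - 1
n_ge
```

A significance of 1e-04 therefore means "none of the 9999 permuted statistics was as large as the observed one", not that the p value has been estimated to four decimal places; more permutations would be needed to resolve anything smaller.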
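The distance matrix based on species abundance shows no significant relationship with the geographic distance between sites at the 0.05 level (Mantel statistic r: 0.138, p value = 0.0525), although the p value is borderline. For this test dataset there is therefore no clear evidence of distance decay in species composition.
# Species abundance vs. geographic distance
> abund_geo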
Mantel statistic based on Spearman's rank correlation rho
Call:
mantel(xdis = dist.abund, ydis = dist.geo, method = "spearman", permutations = 9999, na.rm = TRUE)
Mantel statistic r: 0.1379
Significance: 0.0525
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.107 0.140 0.170 0.204
Permutation: free
Number of permutations: 9999
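For the combination of the 4 environmental variables, the cumulative environmental factors are likewise highly correlated with community composition (Mantel statistic r: 0.686, p value = 1e-04).
# Species abundance vs. the combination of 4 environmental variables
> abund_env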
Call:
mantel(xdis = dist.abund, ydis = dist.env, method = "spearman", permutations = 9999, na.rm = TRUE)
Mantel statistic r: 0.6858
Significance: 1e-04
Upper quantiles of permutations (null model):
90% 95% 97.5% 99%
0.151 0.201 0.244 0.292
Permutation: free
Number of permutations: 9999
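In summary, for this dataset community composition is more strongly correlated with the environmental parameters than with geographic distance. In this system, then, environmental "selection" on the community appears to dominate, while dispersal limitation due to geography is comparatively weak.
Plotting examples for inspecting the correlations
Finally, it is worth plotting the relationships between variables to get a better feel for these correlations.
library(ggplot2)
# One taxon vs. temperature: temperature on the x axis, taxon abundance on the y axis, colour showing site latitude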
xx = ggplot(df, aes(x = Temperature, y = Pelagibacteraceae.OTU_307744)) +
geom_smooth(method = 'lm', alpha = 0.2, colour = 'black') +
geom_point(aes(colour = Latitude), size = 4) +
labs(y = 'Pelagibacteraceae (OTU 307744) (%)', x = 'Temperature (C)') +
theme( axis.text.x = element_text(face = 'bold',colour = 'black', size = 12),
axis.text.y = element_text(face = 'bold', size = 11, colour = 'black'),
axis.title= element_text(face = 'bold', size = 14, colour = 'black'),
panel.background = element_blank(),
panel.border = element_rect(fill = NA, colour = 'black'),
legend.title = element_text(size =12, face = 'bold', colour = 'black'),
legend.text = element_text(size = 10, face = 'bold', colour = 'black')) +
scale_colour_continuous(high = 'navy', low = 'salmon')
xx
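# Linear model for the same pair of variables; note that temperature is the response here (the reverse of the plot axes), which leaves R-squared unchanged
fit <- lm(df$Temperature~df$Pelagibacteraceae.OTU_307744)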
summary(fit)
Call:
lm(formula = df$Temperature ~ df$Pelagibacteraceae.OTU_307744)
Residuals:
Min 1Q Median 3Q Max
-2.2053 -0.9336 -0.5215 0.5028 3.8232
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4082 0.4476 0.912 0.372
df$Pelagibacteraceae.OTU_307744 1.3008 0.1280 10.165 1.45e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.634 on 21 degrees of freedom
Multiple R-squared: 0.8311, Adjusted R-squared: 0.823
F-statistic: 103.3 on 1 and 21 DF, p-value: 1.454e-09
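# Faceted plot of several taxa vs. temperature: temperature on the x axis, abundance of each taxon on the y axis, colour showing site latitude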
library(reshape2)
otus <- df[ ,1:11]
otus_melt <- melt(otus, id = c('Station', 'Salinity', 'Temperature', 'Oxygen', 'Nitrate', 'Latitude', 'Longitude'))
xx <- ggplot(otus_melt, aes(x = Temperature, y = value)) +
facet_wrap(.~variable, scales = 'free_y') +
geom_smooth(method = 'lm', alpha = 0.2, colour = 'black') +
geom_point(aes(colour = Latitude), size = 4) +
labs(y = 'Relative Abundance (%)', x = 'Temperature (C)') +
theme( axis.text.x = element_text(face = 'bold',colour = 'black', size = 12),
axis.text.y = element_text(face = 'bold', size = 10, colour = 'black'),
axis.title= element_text(face = 'bold', size = 14, colour = 'black'),
panel.background = element_blank(),
panel.border = element_rect(fill = NA, colour = 'black'),
legend.title = element_text(size =12, face = 'bold', colour = 'black'),
legend.text = element_text(size = 10, face = 'bold', colour = 'black'),
legend.position = 'top', strip.background = element_rect(fill = 'grey90', colour = 'black'),
strip.text = element_text(size = 9, face = 'bold')) +
scale_colour_continuous(high = 'navy', low = 'salmon')
xx
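The plots above are scatter plots of the correlations between the raw variables.
Next come the correlations between the distance measures.
# Convert the distance objects computed earlier into vectors and combine them in a data frame, so that each row is one pair of sites
aa <- as.vector(dist.abund)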
tt <- as.vector(dist.temp)
gg <- as.vector(dist.geo)
mat <- data.frame(aa, tt, gg)
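# Scatter plot of abundance-based dissimilarity against temperature distance, which the Mantel test above found to be significantly correlated; fill colour shows the geographic distance between sites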
mm <- ggplot(mat, aes(y = aa, x = tt)) +
geom_point(size = 4, alpha = 0.75, colour = "black",shape = 21, aes(fill = gg/1000)) +
geom_smooth(method = "lm", colour = "black", alpha = 0.2) +
labs(x = "Difference in Temperature (C)", y = "Bray-Curtis Dissimilarity", fill = "Physical Separation (km)") +
theme( axis.text.x = element_text(face = "bold",colour = "black", size = 12),
axis.text.y = element_text(face = "bold", size = 11, colour = "black"),
axis.title= element_text(face = "bold", size = 14, colour = "black"),
panel.background = element_blank(),
panel.border = element_rect(fill = NA, colour = "black"),
legend.position = "top",
legend.text = element_text(size = 10, face = "bold"),
legend.title = element_text(size = 11, face = "bold")) +
scale_fill_continuous(high = "navy", low = "skyblue")
mm
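# Scatter plot of abundance-based dissimilarity against geographic distance, which the Mantel test above found not to be significantly correlated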
mm <- ggplot(mat, aes(y = aa, x = gg/1000)) +
geom_point(size = 3, alpha = 0.5) +
labs(x = "Physical separation (km)", y = "Bray-Curtis Dissimilarity") +
theme( axis.text.x = element_text(face = "bold",colour = "black", size = 12),
axis.text.y = element_text(face = "bold", size = 11, colour = "black"),
axis.title= element_text(face = "bold", size = 14, colour = "black"),
panel.background = element_blank(),
panel.border = element_rect(fill = NA, colour = "black"))
mm
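The data frame `mat` built above relies on `as.vector()` returning the pairwise distances in the same order for every `dist` object, so that each row describes one pair of sites. The base-R sketch below checks that ordering on simulated data (the site names `S1` to `S5` and all values are invented) and shows that correlating the two vectors gives the quantity a Mantel test reports as its statistic.

```r
set.seed(7)
# Simulated toy data: 5 sites, 3 abundance-like columns, 1 environmental value
x <- matrix(rnorm(5 * 3), nrow = 5, dimnames = list(paste0("S", 1:5), NULL))
y <- rnorm(5)

d1 <- dist(x)
d2 <- dist(y)

# combn() lists site pairs in the same order in which dist stores them:
# (S1,S2), (S1,S3), ..., (S4,S5)
pairs <- t(combn(rownames(x), 2))
pair_table <- data.frame(site_a = pairs[, 1], site_b = pairs[, 2],
                         d1 = as.vector(d1), d2 = as.vector(d2))
pair_table

# Check the ordering against the full distance matrix
all.equal(pair_table$d1, as.matrix(d1)[pairs])

# Correlating the two distance vectors gives the Mantel statistic
# (Spearman version here); its significance still needs permutations
cor(pair_table$d1, pair_table$d2, method = "spearman")
```

Keeping the site labels next to each distance, as in `pair_table`, also makes it easy to identify which pairs of sites sit far from the trend line in the scatter plots.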
References