查看原文
其他

每天学习一点R:18.barplot条形图之相关性分析结果分布

红皇后学术 红皇后学术 2022-06-07

今天继续介绍基于barplot()函数的图像绘制,但今天的重点并不是如何绘制图像,而是如何通过前处理进行数据的相关性分析和频率分布统计

原图展示

还是上一篇推文中介绍的对称条形图的文献,其中包含一个对疾病和健康marker微生物的丰度进行相关性分析,进而使用条形图展示其相关系数分布的图形。

今天就来仿照绘制一下这幅图,我绘制出来的最终图像是下面这个样子。

还是照例先给出完成的绘图代码:

disease <- read.table("Disease.txt",header = TRUE,row.names = 1,sep = "\t")healthy <- read.table("Healthy.txt",header = TRUE,row.names = 1,sep = "\t")disease <- t(disease)healthy <- t(healthy)
cor.d <- cor(disease,method = "spearman")cor.h <- cor(healthy,method = "spearman")write.table(cor.d,"correlation_disease.xls",sep="\t",quote=F,col.names=NA)write.table(cor.h,"correlation_healthy.xls",sep="\t",quote=F,col.names=NA)
library(reshape2)cor.d <- melt(cor.d)cor.h <- melt(cor.h)
d.value <- cor.d$valued.value <- as.vector(d.value)d.value <- as.numeric(d.value)d.value <- d.value[d.value > 0]h.value <- cor.h$valueh.value <- as.vector(h.value)h.value <- as.numeric(h.value)h.value <- h.value[h.value > 0]
hist1 <- hist(d.value,breaks = c(0,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,1),plot = FALSE)hist2 <- hist(h.value,breaks = c(0,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,1),plot = FALSE)
tiff(filename = "correlation_density.tif",width = 5400,height = 7200,compression = "lzw",res = 600,type = "windows")par(mfrow = c(1,2))par(mar = c(3,8,6,2))par(xpd = TRUE)bar1 <- barplot(hist1$density,space = 0,horiz = TRUE,axes = FALSE,col = "grey80")axis(side = 1,at = c(0,0.5,1.0,1.5,2.0),labels = c("","","","",""),lwd = 3)text(x = c(0,1.0,2.0),y = -1.5,labels = c(0,1.0,2.0),cex = 2.5)axis(side = 2,line = 1,at = c(0,4,8,12,16,20),labels = c("","","","","",""),lwd = 3)text(x = -0.28,y = c(0,4,8,12,16,20),labels = c("0.0","0.2","0.4","0.6","0.8","1.0"),cex = 2.5,adj = c(1,0.5))mtext(side = 2,text = "Occurrence rate (n = 22)",line = 5.8,cex = 3)text(x = 0.8,y = 21,labels = "Disease-enriched\nmarkers",cex = 2.5)par(mar = c(3,0.5,6,9.5))bar2 <- barplot(hist2$density,space = 0,horiz = TRUE,axes = FALSE,col = "grey80")axis(side = 1,at = c(0,0.5,1.0,1.5,2.0),labels = c("","","","",""),lwd = 3)text(x = c(0,1.0,2.0),y = -1.5,labels = c(0,1.0,2.0),cex = 2.5)text(x = 1.2,y = 21,labels = "Healthy-enriched\nmarkers",cex = 2.5)text(x = 1.5,y = 0.5,labels = "Density",cex = 3)dev.off()

绘制图像所需的数据就是用于进行相关性的marker微生物在各样品中的丰度数据。


绘图过程详解

相关性分析

首先将绘图数据导入。

disease <- read.table("Disease.txt",header = TRUE,row.names = 1,sep = "\t")
disease <- t(disease)

相关性分析要求行为样本,进行相关性统计的变量为列,因而有些数据可能需要转置处理。

cor.d <- cor(disease,method = "spearman")
write.table(cor.d,"correlation_disease.xls",sep="\t",quote=F,col.names=NA)

使用cor()命令进行相关性分析,其可选方法包括spearman、pearson和kendall,输出结果为相关系数统计矩阵。

使用write.table()将相关性分析结果输出为xls表格,方便日后阅读和使用。

⚠️ cor()只能输出相关系数,而不同输出显著性p值,如果需要进行显著性检验,需要使用其它命令进行操作,相关内容会在之后的推文中介绍。

相关系数频率分布统计

首先要将“宽格式”的相关系数数据转化为“长格式”的数据。

library(reshape2)cor.d <- melt(cor.d)

之后将其中相关系数数值列提出来变为向量。

d.value <- cor.d$valued.value <- as.vector(d.value)d.value <- as.numeric(d.value)d.value <- d.value[d.value > 0]

⚠️ 因为本次绘图只是统计正相关的数据,因而使用[d.value > 0]值保留大于0的相关系数

之后统计相关系数在不同范围内的分布频数。

hist1 <- hist(d.value,breaks = c(0,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,1),plot = FALSE)

使用hist()进行频数的统计,应用breaks定义统计的范围,治理以0.05为间隔统计相关系数的分布频数。

图像绘制

最后对统计的相关系数分布频数进行图像绘制。

par(mfrow = c(1,2))par(mar = c(3,8,6,2))par(xpd = TRUE)bar1 <- barplot(hist1$density,space = 0,horiz = TRUE,axes = FALSE,col = "grey80")axis(side = 1,at = c(0,0.5,1.0,1.5,2.0),labels = c("","","","",""),lwd = 3)text(x = c(0,1.0,2.0),y = -1.5,labels = c(0,1.0,2.0),cex = 2.5)axis(side = 2,line = 1,at = c(0,4,8,12,16,20),labels = c("","","","","",""),lwd = 3)text(x = -0.28,y = c(0,4,8,12,16,20),labels = c("0.0","0.2","0.4","0.6","0.8","1.0"),cex = 2.5,adj = c(1,0.5))mtext(side = 2,text = "Occurrence rate (n = 22)",line = 5.8,cex = 3)text(x = 0.8,y = 21,labels = "Disease-enriched\nmarkers",cex = 2.5)par(mar = c(3,0.5,6,9.5))bar2 <- barplot(hist2$density,space = 0,horiz = TRUE,axes = FALSE,col = "grey80")axis(side = 1,at = c(0,0.5,1.0,1.5,2.0),labels = c("","","","",""),lwd = 3)text(x = c(0,1.0,2.0),y = -1.5,labels = c(0,1.0,2.0),cex = 2.5)text(x = 1.2,y = 21,labels = "Healthy-enriched\nmarkers",cex = 2.5)text(x = 1.5,y = 0.5,labels = "Density",cex = 3)

图像绘制的命令与正常的条形图绘制基本一致,这里就不再赘述了,详细的参数用法可以参见之前的几篇推文。

⚠️ 因为图像的条形是横向分布,因而在绘制时要使用horiz = TRUE参数。

扩展阅读



您可能也对以下帖子感兴趣

文章有问题?点此查看未经处理的缓存