GseaVis: elegant visualization of GSEA enrichment results
见怪不怪
1 Introduction
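Several R packages and functions can already visualize GSEA enrichment results. For example, the gseaplot and gseaplot2 functions in enrichplot by Guangchuang Yu (Y叔) draw attractive figures, and there is also gggsea, introduced in an earlier post. The desktop GSEA software produces comprehensive analysis results, but its figures are not easy to work with. Here I borrowed part of the code from the enrichplot package, combined it with my own, and wrote a small R package, GseaVis, which offers a more customizable way to draw GSEA enrichment plots.
First of all, thanks to Guangchuang Yu for the effort and contributions that went into developing the R packages clusterProfiler and enrichplot.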
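2 Overview
What GseaVis can do:
Draw the classic GSEA plot. Label the genes you are interested in. Offer a new plot style. Expose more adjustable parameters. Add NES and P value labels. Display pathway names in a more readable way.
If you like it, remember to give it a star on GitHub.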
3 Installation
GitHub repository:
https://github.com/junjunlab/GseaVis
# install.packages("devtools")
devtools::install_github("junjunlab/GseaVis")
Reference manual:
https://github.com/junjunlab/GseaVis/wiki
4 Input
The input is the enrichment result returned by the GSEA, gseGO, or gseKEGG function of clusterProfiler.
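For readers who do not yet have such a result object, the sketch below shows one way the ranked gene list that these functions take could be prepared. clusterProfiler expects a named numeric vector sorted in decreasing order. The data frame `deg`, its columns `gene` and `log2FC`, and the table `term2gene` are made up for illustration and are not part of GseaVis. The final call is left commented out because it needs clusterProfiler and a gene set table that this post does not supply.

```r
# Hypothetical differential-expression table (illustrative values only)
deg <- data.frame(
  gene   = c("Pfkp", "Tpi1", "Igf1", "Ddit4", "Ak9"),
  log2FC = c(1.8, -0.4, 2.5, -1.2, 0.3)
)

# Named numeric vector: values are the ranking statistic, names are gene symbols
geneList <- setNames(deg$log2FC, deg$gene)

# clusterProfiler requires the vector to be sorted in decreasing order
geneList <- sort(geneList, decreasing = TRUE)

print(geneList)

# With clusterProfiler installed and a two-column (term, gene) data frame
# `term2gene` (an assumption, not provided here), the enrichment could be run as:
# gseaRes <- clusterProfiler::GSEA(geneList, TERM2GENE = term2gene)
```

The resulting object is what gseaNb takes as its `object` argument in the sections that follow.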
5 Load test data
Load the built-in data:
library(GseaVis)
# load data
test_data <- system.file("extdata", "gseaRes.RDS", package = "GseaVis")
gseaRes <- readRDS(test_data)
gseaRes
# Gene Set Enrichment Analysis
#
#...@organism UNKNOWN
#...@setType UNKNOWN
#...@geneList Named num [1:27970] 6.02 5.96 5.84 5.8 5.72 ...
- attr(*, "names")= chr [1:27970] "Ecscr" "Gm32341" "B130034C11Rik" "Hkdc1" ...
#...nPerm
#...pvalues adjusted by 'BH' with cutoff <1
#...4917 enriched terms found
...
6 Classic plot
Plot a pathway by specifying its name:
# all panels
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS')
Keep only the curve:
# retain curve
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       subPlot = 1)
Keep the curve and the heatmap:
# retain curve and heatmap
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       subPlot = 2)
Wrap a pathway name that is too long:
# wrap the term title
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       subPlot = 2,
       termWidth = 30)
Label some genes in the pathway:
# add genes in a specific pathway
mygene <- c("Entpd8", "Htr2a", "Nt5e", "Actn3", "Entpd1",
            "Pfkp", "Tpi1", "Igf1", "Ddit4", "Ak9")
# plot
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       subPlot = 2,
       addGene = mygene)
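Rather than typing gene names by hand, the genes to label could be taken from the result table itself. clusterProfiler stores the core enrichment genes of each term in the `core_enrichment` column as a single string separated by `/`. The helper `topCoreGenes()` below is a hypothetical convenience function, not part of GseaVis, and the toy table only stands in for `gseaRes@result`.

```r
# Hypothetical helper: return the first n core-enrichment genes of one term.
# `res` is a data frame with the columns ID and core_enrichment,
# as found in the @result slot of a clusterProfiler gseaResult object.
topCoreGenes <- function(res, id, n = 10) {
  hit <- res[res$ID == id, , drop = FALSE]
  if (nrow(hit) != 1) stop("geneSetID not found (or not unique): ", id)
  # core_enrichment is one "/"-separated string per term
  genes <- strsplit(hit$core_enrichment, "/", fixed = TRUE)[[1]]
  head(genes, n)
}

# Toy table standing in for gseaRes@result (values are made up)
toy <- data.frame(
  ID = c("TERM_A", "TERM_B"),
  core_enrichment = c("Entpd8/Htr2a/Nt5e/Actn3", "Pfkp/Tpi1"),
  stringsAsFactors = FALSE
)

mygene2 <- topCoreGenes(toy, "TERM_A", n = 3)
print(mygene2)
```

With the real data, the same helper would presumably be called as `topCoreGenes(gseaRes@result, 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS')` and its result passed to `addGene`.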
Change the gene label color and the arrow type:
# change gene color and arrow type
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       subPlot = 2,
       addGene = mygene,
       arrowType = 'open',
       geneCol = 'black')
Keep all panels:
# all panels
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       subPlot = 3,
       addGene = mygene,
       rmSegment = TRUE)
7 New style GSEA
An alternative presentation that draws the pathway genes inside the curve:
# new style GSEA
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE)
Remove the point layer:
# new style GSEA without points
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       addPoint = FALSE)
Change the heatmap colors:
# change heatmap color
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       addPoint = FALSE,
       newHtCol = c("blue", "white", "red"))
Add gene names:
# new style GSEA with gene names
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       addGene = mygene)
Remove the red marker segment:
# remove red segment
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       rmSegment = TRUE,
       addGene = mygene)
Remove the heatmap:
# remove heatmap
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       rmSegment = TRUE,
       rmHt = TRUE,
       addGene = mygene)
8 Add NES and P value
Add the NES score and the P value:
# add P value and NES
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       addGene = mygene,
       addPval = TRUE)
Adjust the label position and color:
# control label adjustment
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       newGsea = TRUE,
       addGene = mygene,
       addPval = TRUE,
       pvalX = 0.75, pvalY = 0.8,
       pCol = 'black',
       pHjust = 0)
Add the P value label to the classic plot:
# classic plot with P value
gseaNb(object = gseaRes,
       geneSetID = 'GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
       addGene = mygene,
       addPval = TRUE,
       pvalX = 0.75, pvalY = 0.8,
       pCol = 'black',
       pHjust = 0)
9 Multiple GSEA plot
You can loop over several terms to plot them in batch:
# batch plot
terms <- c('GOBP_NUCLEOSIDE_DIPHOSPHATE_METABOLIC_PROCESS',
           'GOBP_REGULATION_OF_OSSIFICATION',
           'GOBP_TISSUE_MIGRATION',
           'GOBP_CELL_MATRIX_ADHESION')
# plot: one gseaNb figure per term, collected in a list
gseaList <- lapply(terms, function(x) {
  gseaNb(object = gseaRes,
         geneSetID = x,
         addPval = TRUE,
         pvalX = 0.75, pvalY = 0.75,
         pCol = 'black',
         pHjust = 0)
})
# combine
cowplot::plot_grid(plotlist = gseaList, ncol = 2, align = 'hv')
Keep only the curve and the heatmap:
# retain curve and heatmap
# plot
gseaList1 <- lapply(terms, function(x) {
  gseaNb(object = gseaRes,
         geneSetID = x,
         addPval = TRUE,
         pvalX = 0.75, pvalY = 0.75,
         pCol = 'black',
         pHjust = 0,
         subPlot = 2)
})
# combine
cowplot::plot_grid(plotlist = gseaList1, ncol = 2, align = 'hv')
New style plot:
# new style plot
# plot
gseaList2 <- lapply(terms, function(x) {
  gseaNb(object = gseaRes,
         geneSetID = x,
         newGsea = TRUE,
         addPval = TRUE,
         pvalX = 0.75, pvalY = 0.75,
         pCol = 'black',
         pHjust = 0,
         subPlot = 2)
})
# combine
cowplot::plot_grid(plotlist = gseaList2, ncol = 2, align = 'hv')
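The `terms` vector in this section is typed by hand. If the terms should instead be picked from the result table, one possible approach is to filter on the adjusted P value and sort by NES. The helper `pickTerms()` and its `padjCutoff` argument are hypothetical and not part of GseaVis. The sketch assumes the `ID`, `NES`, and `p.adjust` columns that clusterProfiler writes to the `@result` slot, and the toy table only stands in for `gseaRes@result`.

```r
# Hypothetical helper: IDs of the n significant terms with the highest NES.
pickTerms <- function(res, n = 4, padjCutoff = 0.05) {
  # keep terms below the adjusted P value cutoff
  sig <- res[res$p.adjust < padjCutoff, , drop = FALSE]
  # sort by NES, largest first
  sig <- sig[order(sig$NES, decreasing = TRUE), , drop = FALSE]
  head(sig$ID, n)
}

# Toy table standing in for gseaRes@result (values are made up)
toyRes <- data.frame(
  ID       = c("TERM_A", "TERM_B", "TERM_C", "TERM_D"),
  NES      = c(1.2, 2.4, -1.9, 1.8),
  p.adjust = c(0.20, 0.01, 0.03, 0.04),
  stringsAsFactors = FALSE
)

terms2 <- pickTerms(toyRes, n = 2)
print(terms2)
```

The returned character vector can be used in place of `terms` in the lapply calls of this section. Sorting on `abs(NES)` instead would also bring strongly negative terms to the top.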
10 More parameters
See the function documentation for more parameters:
?gseaNb
11 Closing
If you have any suggestions or ideas, please open an issue or leave a comment on GitHub.