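Quick Start! Combined Visualization of Heatmap + Expression Trend Line Plots + Pathway Enrichment Results (with Code) | Data Mining and Analysis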
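ClusterGVis is an R package for processing and visualizing single-cell data, and one of a series of visualization toolkits developed by Dr. Jun Zhang of China Pharmaceutical University. It draws a combined figure of clustering + per-group expression trend line plots + functional annotation. From a single heatmap you can see how many clusters the differentially expressed genes fall into, how the expression of each cluster changes over time, and what the genes in each cluster do according to GO or KEGG functional annotation.
The following is a brief summary of its usage, based on its GitHub page and WeChat tutorials.
1. Installing the R package (and its dependencies)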
# install.packages("devtools")
devtools::install_github("junjunlab/ClusterGVis")
# install.packages("BiocManager")
BiocManager::install("org.Hs.eg.db")
install.packages("ggplot2")
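Reinstalling packages that are already present is slow, so it can help to check what is missing first. The sketch below is not part of the original tutorial: `missing_pkgs` is a hypothetical helper name, and the package list is an assumption based on the libraries used in this walkthrough.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages used in this walkthrough (assumed list, adjust as needed)
needed <- c("ClusterGVis", "org.Hs.eg.db", "Seurat", "dplyr", "ggplot2")
to_install <- missing_pkgs(needed)
print(to_install)  # install only these, using the commands above
```

Note that `requireNamespace()` loads a package's namespace when it is found, which is harmless here because the same packages are attached in the next step anyway.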
2. Loading the required R packages
library(ClusterGVis)
library(org.Hs.eg.db)
library(Seurat)
library(dplyr)
3. Code walkthrough
# Load the single-cell data
pbmc <- readRDS("./pbmc3k_final.rds")
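# Annotate cell types: one label per cluster, in the order of levels(pbmc).
# Nine labels are needed because the object has nine clusters (see cluster.order = 1:9 below);
# the last four follow the standard Seurat PBMC 3k tutorial.
new.cluster.ids <- c("Naive CD4 T", "CD14+ Mono", "Memory CD4 T", "B", "CD8 T",
                     "FCGR3A+ Mono", "NK", "DC", "Platelet")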
names(new.cluster.ids) <- levels(pbmc)
pbmc <- RenameIdents(pbmc, new.cluster.ids)
# Find marker genes for every cluster
pbmc.markers.all <- Seurat::FindAllMarkers(pbmc,
                                           only.pos = TRUE,
                                           min.pct = 0.25,
                                           logfc.threshold = 0.25)
# Select the top 10 marker genes of each cluster by avg_log2FC
pbmc.markers <- pbmc.markers.all %>%
  dplyr::group_by(cluster) %>%
  dplyr::top_n(n = 10, wt = avg_log2FC)
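One detail worth knowing about this selection step: `top_n()` keeps ties, so a cluster can contribute more than n genes. The sketch below uses a made-up marker table (not the PBMC data) to compare it with `slice_max()`, which dplyr offers as the newer alternative and which can drop ties. The object names are illustrative only.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy marker table (made-up values) with a tie inside cluster "A"
toy.markers <- data.frame(
  cluster    = c("A", "A", "A", "A", "B", "B", "B"),
  gene       = c("g1", "g2", "g3", "g4", "g5", "g6", "g7"),
  avg_log2FC = c(2.0, 1.5, 1.5, 0.5, 3.0, 1.0, 0.2)
)

# top_n() keeps ties, so cluster "A" contributes three rows for n = 2
top.with.ties <- toy.markers %>%
  group_by(cluster) %>%
  top_n(n = 2, wt = avg_log2FC)

# slice_max() with with_ties = FALSE returns exactly n rows per cluster
top.exact <- toy.markers %>%
  group_by(cluster) %>%
  slice_max(order_by = avg_log2FC, n = 2, with_ties = FALSE) %>%
  ungroup()

print(nrow(top.with.ties))
print(nrow(top.exact))
```

With real marker tables exact ties in avg_log2FC are rare, so both approaches usually give the same genes; the difference matters mainly when a fixed number of genes per cluster is required.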
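# Prepare the data with prepareDataFromscRNA.
# showAverage = TRUE averages the cells that belong to the same cell subpopulation before plotting;
# otherwise every cell is plotted. By default the "data" slot of the Seurat object's RNA assay is used.
st.data <- prepareDataFromscRNA(object = pbmc,
                                diffData = pbmc.markers,
                                showAverage = TRUE)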
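To make the idea of averaging per subpopulation concrete, here is a minimal base R sketch on a toy matrix. It only illustrates the concept of collapsing cells into one column per cell type; it is not the ClusterGVis implementation, which may apply further steps such as scaling. All names and values are made up.

```r
# Toy expression matrix: 2 genes x 4 cells (made-up values)
expr <- matrix(c(1, 3, 5, 7,
                 2, 4, 6, 8),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4)))

# Cell type of each cell, in column order
ident <- factor(c("T", "T", "B", "B"))

# Average the columns that share a cell type: one column per cell type
avg.expr <- sapply(split(seq_along(ident), ident),
                   function(idx) rowMeans(expr[, idx, drop = FALSE]))

print(avg.expr)
```

The result has one row per gene and one column per cell type, which is why the averaged heatmap has only as many columns as there are subpopulations.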
# Run GO (or KEGG) enrichment analysis for each cluster
enrich <- enrichCluster(object = st.data,
                        OrgDb = org.Hs.eg.db,
                        type = "BP",
                        organism = "hsa",
                        pvalueCutoff = 0.5,
                        topn = 5,
                        seed = 5201314)
# Randomly pick 40 marker genes to label on the heatmap
marker.genes.unique <- unique(pbmc.markers$gene)
markGenes <- sample(marker.genes.unique, 40, replace = FALSE)
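Because the genes to label are drawn at random, each run labels a different set unless the random seed is fixed. A minimal sketch of a reproducible version follows; `pick_genes` and its default seed are hypothetical and not part of ClusterGVis, and the gene names are placeholders.

```r
# Hypothetical helper: reproducibly sample up to `n` unique genes
pick_genes <- function(genes, n = 40, seed = 123) {
  genes <- unique(genes)
  set.seed(seed)  # fix the random number generator state
  sample(genes, size = min(n, length(genes)), replace = FALSE)
}

# Placeholder gene names standing in for unique(pbmc.markers$gene)
toy.genes <- paste0("gene", 1:90)
first.pick  <- pick_genes(toy.genes)
second.pick <- pick_genes(toy.genes)
print(identical(first.pick, second.pick))
```

Calling `set.seed()` inside the helper resets the global random state, which is acceptable in a short script but worth keeping in mind in longer analyses.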
# Plot the expression trend line plots
visCluster(object = st.data, plot.type = "line")
# Plot the heatmap and save it as a PDF
pdf('sc1.pdf', height = 10, width = 6, onefile = FALSE)
visCluster(object = st.data,
           plot.type = "heatmap",
           column_names_rot = 45,
           markGenes = markGenes,
           cluster.order = c(1:9))
dev.off()
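The pdf() ... dev.off() pattern above is repeated for every figure, and if the plotting call fails the device stays open. One possible way to guard against that is a small wrapper; `save_pdf` is a hypothetical helper, and a base R plot stands in for the visCluster() call so the sketch runs without the single-cell data.

```r
# Hypothetical helper: open a PDF device, run the plotting code, always close it
save_pdf <- function(file, plot_fn, width = 6, height = 10) {
  pdf(file, width = width, height = height, onefile = FALSE)
  on.exit(dev.off(), add = TRUE)  # runs even if plot_fn() fails
  plot_fn()
  invisible(file)
}

# Stand-in base plot; a visCluster(...) call would go here in the walkthrough
out.file <- save_pdf(file.path(tempdir(), "demo.pdf"), function() plot(1:10))
print(file.exists(out.file))
```

If a plotting call returns an object instead of drawing it (ggplot2 objects behave this way inside functions), wrap the call in print() within `plot_fn`.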
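# Heatmap labeling only selected genes.
# Suffixed names such as "CD3D_1" only exist if marker genes duplicated across clusters
# were retained when st.data was prepared (the package tutorial appears to do this with
# keep.uniqGene = FALSE); otherwise only "CD3D" is matched.
pdf('sc3.pdf', height = 10, width = 6, onefile = FALSE)
visCluster(object = st.data,
           plot.type = "heatmap",
           column_names_rot = 45,
           markGenes = c("CD3D", "CD3D_1", "CD3D_2"),
           cluster.order = c(1:9))
dev.off()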
# Plot all cells (no averaging)
pbmc.markers1 <- pbmc.markers.all %>%
  dplyr::group_by(cluster) %>%
  dplyr::top_n(n = 6, wt = avg_log2FC)
# Prepare the data with every cell kept (showAverage = FALSE)
st.data <- prepareDataFromscRNA(object = pbmc,
                                diffData = pbmc.markers1,
                                showAverage = FALSE)
# Heatmap of all cells
pdf('sc4.pdf', height = 10, width = 8, onefile = FALSE)
visCluster(object = st.data,
           plot.type = "heatmap",
           markGenes = unique(pbmc.markers1$gene),
           column_title_rot = 45,
           cluster.order = 1:9,
           show_column_names = FALSE)
dev.off()
# Change the cell type order and the annotation colors
pdf('sc5.pdf', height = 10, width = 8, onefile = FALSE)
visCluster(object = st.data,
           plot.type = "heatmap",
           markGenes = unique(pbmc.markers1$gene),
           column_title_rot = 45,
           cluster.order = 1:9,
           show_column_names = FALSE,
           sample.cell.order = rev(new.cluster.ids),
           sample.col = jjAnno::useMyCol("paired", n = 9))
dev.off()
# Add the GO enrichment annotation
pdf('sc6.pdf', height = 12, width = 16, onefile = FALSE)
visCluster(object = st.data,
           plot.type = "both",
           column_title_rot = 45,
           markGenes = unique(pbmc.markers1$gene),
           markGenes.side = "left",
           annoTerm.data = enrich,  # enrichment results from enrichCluster above
           show_column_names = FALSE,
           line.side = "left",
           cluster.order = c(1:9),
           add.bar = TRUE)
dev.off()
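ClusterGVis visualizes transcriptome data very well: a single heatmap shows gene expression patterns, clustering, and GO functional annotation. If you are interested, see the GitHub repository (https://github.com/junjunlab/ClusterGVis) to learn more.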