circlize: a handy tool for "bending" plots into circles
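The circlize package was developed by Dr. Zuguang Gu and published in the journal Bioinformatics in 2014. It lays out data of different types and groups using three building blocks: tracks, sectors, and cells.
Circular plots have two main advantages:
First, they display data with many categories and groups elegantly;
Second, they show several tracks of data for the same object at a glance, which makes relationships between elements easier to see.
The circlize documentation is organized into three parts:
(I) a detailed overview of how the package works and its general circular plotting functions;
(II) functions designed specifically for visualizing genomic datasets;
(III) chord diagrams for visualizing relationships between objects.
This post mainly covers the first part. I am still working through the rest, and anyone interested is welcome to discuss and learn together.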
01
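Anatomy of a circular plot
A circular plot takes data from a rectangular (Cartesian) coordinate system, converts it to polar coordinates, and maps it onto a circle. Just as a Cartesian plot has x and y ranges, a circular plot needs its x and y ranges to be set.
Plotting workflow: initialize the layout with circos.initialize(), create tracks with circos.track(), add graphics with circos.points(), circos.lines(), and so on, then reset with circos.clear().
Tracks and sectors: in the plot drawn by the following code, the green ring is a track and the red wedge is a sector; the intersection of a sector and a track is a cell.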
library(circlize)
factors = letters[1:8]  ## define the sectors
circos.initialize(factors, xlim = c(0, 1))  ## initialize the layout: sets the sectors and their x range
for(i in 1:3) {circos.track(ylim = c(0, 1))}  ## create three empty tracks
circos.info(plot = TRUE)  ## label every cell with its sector and track index
## highlight sector a, from its start angle to its end angle
draw.sector(get.cell.meta.data("cell.start.degree", sector.index = "a"),
            get.cell.meta.data("cell.end.degree", sector.index = "a"),
            rou1 = get.cell.meta.data("cell.top.radius", track.index = 1),  ## outer radius
            col = "#FF000040")
## highlight the whole of track 1
draw.sector(0, 360,
            rou1 = get.cell.meta.data("cell.top.radius", track.index = 1),  ## outer radius of track 1
            rou2 = get.cell.meta.data("cell.bottom.radius", track.index = 1),  ## inner radius of track 1; together they bound track 1
            col = "#00FF0040")
circos.clear()  ## reset the layout
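The conversion from rectangular to polar coordinates described at the start of this section can be sketched in base R. This is only an illustration of the idea, not how circlize computes positions internally. The helper name to_polar is hypothetical, and the sketch assumes equal-width sectors, no gaps or cell padding, a clockwise layout starting at 0 degrees, and a single track with fixed inner and outer radii.

```r
# Hypothetical helper (not part of circlize): map a data point (x, y) that
# lives in sector k of n equal-width sectors to polar coordinates.
to_polar <- function(x, y, k, n, xlim = c(0, 1), ylim = c(0, 1),
                     r_inner = 0.8, r_outer = 1.0) {
  sector_width <- 360 / n                          # degrees covered by one sector
  fx <- (x - xlim[1]) / (xlim[2] - xlim[1])        # relative position along x
  fy <- (y - ylim[1]) / (ylim[2] - ylim[1])        # relative position along y
  theta <- -((k - 1) + fx) * sector_width          # clockwise, so the angle decreases
  rho <- r_inner + fy * (r_outer - r_inner)        # y becomes the radius inside the track
  rad <- theta * pi / 180
  data.frame(theta = theta, rho = rho,
             px = rho * cos(rad), py = rho * sin(rad))  # position on the canvas
}

# The centre of the first of eight sectors
p <- to_polar(x = 0.5, y = 0.5, k = 1, n = 8)
print(p)
```

The x value decides the angle inside the sector and the y value decides the radius inside the track, which is why both ranges have to be known before anything can be drawn.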
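Graphic parameters: the basic parameters of the circular layout are set with circos.par().
start.degree: the angle, in degrees, at which the first sector starts
gap.degree: the gap between two neighbouring sectors, in degrees
gap.after: same as gap.degree
track.margin: the blank area outside the plotting region of a track
cell.padding: the blank area around the plotting region inside a cell, given as four values (bottom, left, top, right); the first and third are percentages of the unit-circle radius, the second and fourth are degrees
unit.circle.segments: curves are approximated by straight line segments, and the minimum segment length is the circumference of the unit circle (2π) divided by unit.circle.segments; the shorter the segments, the closer the approximation to a true curve
track.height: the default height of a track
points.overflow.warning: set this to FALSE to turn off the warning about points drawn outside the plotting region
canvas.xlim: the x range of the canvas; setting both canvas.xlim and canvas.ylim to c(0, 1) draws only a quarter of the circle
canvas.ylim: the y range of the canvas
clock.wise: the direction in which sectors are laid out; clockwise by default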
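As a small sketch of how these parameters are typically used, the snippet below sets a few of them before initializing the layout, reads one back, and shows that circos.clear() restores the defaults. The specific values are arbitrary choices for illustration, and pdf(NULL) is only there so the sketch runs without opening a plot window.

```r
library(circlize)
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

# Layout parameters must be set before circos.initialize()
circos.par(start.degree = 90, gap.degree = 4, track.height = 0.15)
start_used <- circos.par("start.degree")  # read a parameter back

circos.initialize(letters[1:4], xlim = c(0, 1))
circos.track(ylim = c(0, 1))
circos.clear()  # resets the layout and the circos.par() settings

start_after_clear <- circos.par("start.degree")
invisible(dev.off())
print(c(used = start_used, after_clear = start_after_clear))
```

Because circos.clear() resets the parameters, each figure in a script can start from the defaults without leftover settings from the previous one.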
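The panel.fun argument: circos.track() calls panel.fun once for every cell. Based on the categories defined in factors, it automatically extracts the x and y values that belong to the current cell from the data passed to circos.track(). This argument does most of the work in the rest of this part. Inside panel.fun, get.cell.meta.data() returns information about the current cell, such as sector.index, sector.numeric.index, track.index, xlim, ylim, xcenter and ycenter.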
factors = letters[1:8]
circos.initialize(factors, xlim = c(0, 1))
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  sector.index = get.cell.meta.data("sector.index")  ## same as CELL_META$sector.index
  xcenter = get.cell.meta.data("xcenter")  ## same as CELL_META$xcenter
  ycenter = get.cell.meta.data("ycenter")  ## same as CELL_META$ycenter
  circos.text(xcenter, ycenter, sector.index)  ## write the sector index in each cell
})
circos.clear()
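The example above creates empty cells, so x and y are not used inside panel.fun. The following sketch passes data to circos.track() to show the per-cell extraction at work. The data are random and the counter n_in_cell is a hypothetical bookkeeping variable added only to make the behaviour visible.

```r
library(circlize)
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively
set.seed(1)

# Synthetic data: 200 points, each assigned to one of four sectors
sectors <- sample(letters[1:4], 200, replace = TRUE)
x <- runif(200)
y <- runif(200)

n_in_cell <- integer(0)  # hypothetical counter: how many points each cell receives

circos.initialize(sectors, x = x)  # x range of each sector is taken from its data
circos.track(sectors, x = x, y = y, panel.fun = function(x, y) {
  # Here x and y hold only the points that belong to the current cell
  n_in_cell[CELL_META$sector.index] <<- length(x)
  circos.points(x, y, pch = 16, cex = 0.4)
  circos.text(CELL_META$xcenter, CELL_META$ycenter, CELL_META$sector.index)
})
circos.clear()
invisible(dev.off())

print(n_in_cell)
```

The counts add up to the total number of points, which confirms that each call to panel.fun sees only its own cell's subset.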
02
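Basic plotting functions
circos.trackPoints() can be reproduced by calling circos.points() in a for loop, but calling circos.points() inside panel.fun is just as convenient. The same holds for the other low-level functions and their circos.track*() counterparts.
circos.lines(): add lines
circos.segments(): add line segments
circos.text(): add text; pay attention to the facing argument
circos.rect(): draw rectangles
circos.polygon(): draw polygons
circos.axis(): draw and configure axes
circos.arrow(): draw circular arrows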
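A minimal sketch of three of these functions used together inside panel.fun is shown here. The coordinates, colours and labels are arbitrary, and the visited vector is a hypothetical helper that only records which cells were drawn.

```r
library(circlize)
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

visited <- character(0)  # hypothetical log of the cells that were drawn

circos.initialize(letters[1:4], xlim = c(0, 10))
circos.track(ylim = c(0, 1), track.height = 0.3, panel.fun = function(x, y) {
  # Rectangle: xleft, ybottom, xright, ytop in the cell's data coordinates
  circos.rect(2, 0.2, 8, 0.8, col = "#0000FF20", border = "blue")
  # Segment: x0, y0, x1, y1
  circos.segments(0, 0.5, 10, 0.5, lty = 2)
  # Text that bends along the circle; niceFacing keeps it readable on the lower half
  circos.text(CELL_META$xcenter, CELL_META$ycenter,
              paste("sector", CELL_META$sector.index),
              facing = "bending.inside", niceFacing = TRUE, cex = 0.8)
  visited <<- c(visited, CELL_META$sector.index)
})
circos.clear()
invisible(dev.off())

print(visited)
```

All coordinates are given in the data coordinates of the current cell; circlize takes care of bending them onto the circle.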
Setting the x-axis
factors = letters[1:8]
circos.par(points.overflow.warning = FALSE)
circos.initialize(factors = factors, xlim = c(0, 10))
circos.track(ylim = c(0, 10), panel.fun = function(x, y) {
  circos.text(CELL_META$xcenter, CELL_META$ycenter*3, CELL_META$sector.index)
})
## each sector demonstrates a different axis setting
circos.axis(sector.index = "a")
circos.axis(sector.index = "b", direction = "inside", labels.facing = "outside")
circos.axis(sector.index = "c", h = "bottom")
circos.axis(sector.index = "d", h = "bottom", direction = "inside", labels.facing = "reverse.clockwise")
circos.axis(sector.index = "e", h = 5, major.at = c(1, 3, 5, 7, 9))
circos.axis(sector.index = "f", h = 5, major.at = c(1, 3, 5, 7, 9), labels = c("a", "c", "e", "g", "f"), minor.ticks = 0)
circos.axis(sector.index = "g", h = 5, major.at = c(1, 3, 5, 7, 9), labels = c("a1", "c1", "e1", "g1", "f1"), major.tick = FALSE, labels.facing = "reverse.clockwise")
circos.axis(sector.index = "h", h = 2, major.at = c(1, 3, 5, 7, 9), labels = c("a1", "c1", "e1", "g1", "f1"), labels.facing = "clockwise")
circos.clear()
Setting the y-axis
factors = letters[1:8]
circos.par(points.overflow.warning = FALSE)
circos.par(gap.degree = 8)  ## widen the gaps between sectors so the y-axes have enough room
circos.initialize(factors = factors, xlim = c(0, 10))
circos.trackPlotRegion(factors = factors, ylim = c(0, 10), track.height = 0.5)
par(cex = 0.8)
for(a in letters[2:4]) {circos.yaxis(side = "left", sector.index = a)}
for(a in letters[5:7]) {circos.yaxis(side = "right", sector.index = a)}
circos.clear()
Using the low-level plotting functions
set.seed(12345)
fa = letters[1:10]
circos.initialize(fa, xlim = c(0, 1))
## track 1: sector labels and two sets of random points
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  circos.text(0.5, 1.5, CELL_META$sector.index, track.index = 1, cex = 1.5)
  circos.points(runif(20), runif(20), cex = 0.5, pch = 16, col = "blue")
  circos.points(runif(20), runif(20), cex = 0.5, pch = 16, col = "green")  ## panel.fun runs once per cell, so every cell of track 1 gets both sets of points
})
## track 2: two random lines per cell
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  circos.lines(sort(runif(20)), runif(20), col = "red")
  circos.lines(sort(runif(20)), runif(20), col = "yellow")
})
## track 3: one arrow per cell, alternating the tail style
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  col = rand_color(10)
  tail = c("point", "normal", "point", "normal", "point", "normal", "point", "normal", "point", "normal")
  circos.arrow(x1 = 0, x2 = 1, y = 0.5, width = 0.4,
               arrow.head.width = 0.3, arrow.head.length = ux(0.5, "cm"),
               col = col[CELL_META$sector.numeric.index],
               tail = tail[CELL_META$sector.numeric.index])
})
## track 4: a random polygon per cell
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  a = runif(2)
  b = rev(a)
  circos.polygon(c(a, b), runif(4))
})
circos.clear()
03
Highlighting
Plots often need a particular region highlighted. The following two functions help with that:
draw.sector(start.degree, end.degree, rou1, rou2, center, col, border, lwd, lty)
highlight.sector(sector.index, track.index, padding, col, border,lwd, lty)
factors = letters[1:8]
circos.initialize(factors, xlim = c(0, 1))
for(i in 1:3) {circos.track(ylim = c(0, 1))}
circos.info(plot = TRUE)
## sector a, from the centre of the circle out to the top of track 1
draw.sector(get.cell.meta.data("cell.start.degree", sector.index = "a"),
            get.cell.meta.data("cell.end.degree", sector.index = "a"),
            rou1 = get.cell.meta.data("cell.top.radius", track.index = 1),
            col = "#FF00FF40")
## the whole of track 1
draw.sector(0, 360,
            rou1 = get.cell.meta.data("cell.top.radius", track.index = 1),
            rou2 = get.cell.meta.data("cell.bottom.radius", track.index = 1),
            col = "#00FFFF20")
## sectors e to f, spanning tracks 2 and 3
draw.sector(get.cell.meta.data("cell.start.degree", sector.index = "e"),
            get.cell.meta.data("cell.end.degree", sector.index = "f"),
            rou1 = get.cell.meta.data("cell.top.radius", track.index = 2),
            rou2 = get.cell.meta.data("cell.bottom.radius", track.index = 3), col = "#00CCFF40")
## convert two data points in cell (h, track 2) to polar coordinates and highlight the region between them
pos = circlize(c(0.2, 0.8), c(0.2, 0.8), sector.index = "h", track.index = 2)
draw.sector(pos[1, "theta"], pos[2, "theta"], pos[1, "rou"], pos[2, "rou"], clock.wise = TRUE, col = "#FFFF0040")
circos.clear()
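The last draw.sector() call above relies on circlize() to turn data coordinates into polar coordinates. As a hedged sketch of that conversion, the snippet below converts two points and then maps them back with reverse.circlize(). It assumes reverse.circlize() takes the angles and radii as its first two arguments and returns the x values in its first column and the y values in its second.

```r
library(circlize)
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

circos.initialize(letters[1:8], xlim = c(0, 1))
for (i in 1:2) circos.track(ylim = c(0, 1))

# Data coordinates in cell (sector h, track 2) -> polar coordinates
polar <- circlize(c(0.2, 0.8), c(0.2, 0.8), sector.index = "h", track.index = 2)

# Polar coordinates -> data coordinates in the same cell
back <- reverse.circlize(polar[, "theta"], polar[, "rou"],
                         sector.index = "h", track.index = 2)

circos.clear()
invisible(dev.off())

print(polar)  # angle in degrees ("theta") and radius ("rou")
print(back)   # should recover 0.2 and 0.8 in both columns
```

Both functions need an initialized layout, because the result depends on where the given cell sits on the circle.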
factors = letters[1:8]
circos.initialize(factors, xlim = c(0, 1))
for(i in 1:4) {circos.track(ylim = c(0, 1))}
circos.info(plot = TRUE)
## highlight sectors a and h together in track 1 and label the group
highlight.sector(c("a", "h"), track.index = 1, text = "a and h belong to a same group",
                 facing = "bending.inside", niceFacing = TRUE, text.vjust = "6mm", cex = 0.8)
highlight.sector("c", col = "#00FF0040")
highlight.sector("d", col = NA, border = "red", lwd = 2)
highlight.sector("e", col = "#0000FF40", track.index = c(2, 3))
highlight.sector(c("f", "g"), col = NA, border = "green",
                 lwd = 2, track.index = c(2, 3), padding = c(0.1, 0.1, 0.1, 0.1))  ## four padding values, as ratios of the highlighted region's height or width; all equal here
highlight.sector(factors, col = "#FFFF0040", track.index = 4)
circos.clear()
Worked examples
01
Combined plot
col_fun = colorRamp2(c(-1, 0, 1), c("grey", "yellow", "red"))  ## maps values to colours
fa = letters[1:10]
circos.initialize(fa, xlim = c(0, 1))
x = rnorm(10000)
factors = sample(letters[1:10], 10000, replace = TRUE)
## track 1: one histogram per sector
circos.trackHist(factors = factors, x = x, col = rand_color(10), border = NA, bin.size = 0.05)
## track 2: two random lines per cell
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  circos.lines(sort(runif(20)), runif(20), col = 4)
  circos.lines(sort(runif(20)), runif(20), col = 5)
})
## centre: ten links between randomly chosen sectors, coloured by a random value
for(i in 1:10) {
  circos.link(sample(fa, 1), sort(runif(10))[1:2],
              sample(fa, 1), sort(runif(10))[1:2],
              col = add_transparency(col_fun(rnorm(1))))
}
circos.clear()
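The colour handling in this example can be looked at on its own. The sketch below uses the same colorRamp2() breakpoints and a few arbitrary test values. It also passes an explicit transparency to add_transparency(), which is an assumption about what is usually wanted for overlapping links rather than something taken from the code above.

```r
library(circlize)

# Same mapping as in the combined plot: -1 -> grey, 0 -> yellow, 1 -> red
col_fun <- colorRamp2(c(-1, 0, 1), c("grey", "yellow", "red"))

values <- c(-2, -1, 0, 0.5, 1, 2)  # arbitrary test values
cols <- col_fun(values)            # values beyond the breakpoints get the end colours

# transparency = 0.5 makes the colours half transparent
cols_half <- add_transparency(cols, transparency = 0.5)

print(data.frame(value = values, color = cols, half_transparent = cols_half))
```

Values between breakpoints are interpolated, so 0.5 falls between yellow and red.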
02
Heatmap
mat = matrix(rnorm(100*10), nrow = 10, ncol = 100)
col_fun = colorRamp2(c(-1, -0.5, 0, 1, 2), c("grey60", "grey30", "white", "yellow", "red"))
factors = rep(letters[1:2], times = c(40, 60))  ## split the circle into two sectors, a and b
mat_list = list(a = mat[, factors == "a"], b = mat[, factors == "b"])  ## data for a and b
dend_list = list(a = as.dendrogram(hclust(dist(t(mat_list[["a"]])))),
                 b = as.dendrogram(hclust(dist(t(mat_list[["b"]])))))  ## dendrograms for a and b
circos.par(cell.padding = c(0, 0, 0, 0), gap.degree = 5, start.degree = 90)
circos.initialize(factors, xlim = cbind(c(0, 0), table(factors)))  ## xlim can be a two-column matrix with one row per sector, giving a and b their own x ranges
circos.track(ylim = c(0, 10), bg.border = NA, panel.fun = function(x, y) {
  sector.index = CELL_META$sector.index
  m = mat_list[[sector.index]]  ## data for this sector
  dend = dend_list[[sector.index]]  ## dendrogram for this sector
  m2 = m[, order.dendrogram(dend)]  ## reorder the columns to follow the dendrogram's leaf order
  col_mat = col_fun(m2)  ## colour each value by its size
  nr = nrow(m2); nc = ncol(m2)
  for(i in 1:nr) {
    circos.rect(1:nc - 1, rep(nr - i, nc), 1:nc, rep(nr - i + 1, nc),  ## arguments are xleft, ybottom, xright, ytop
                border = col_mat[i, ], col = col_mat[i, ])  ## fill and border colours of the rectangles
  }
})
## To keep both dendrograms on the same height scale, use attr() to get the larger of the two dendrogram heights and use it as the upper y limit of the second track.
max_height = max(sapply(dend_list, function(x) attr(x, "height")))
circos.track(ylim = c(0, max_height), bg.border = NA, track.height = 0.3,
             panel.fun = function(x, y) {
               sector.index = get.cell.meta.data("sector.index")
               dend = dend_list[[sector.index]]
               circos.dendrogram(dend, max_height = max_height)
             })
circos.clear()
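The column reordering step in the heatmap is the part that is easiest to get wrong, so here is a small base R sketch of it on its own. The matrix is random and the column names s1 to s6 are made up for the illustration.

```r
set.seed(42)

# Small synthetic matrix: 10 rows (features) x 6 columns (samples)
m <- matrix(rnorm(10 * 6), nrow = 10, ncol = 6,
            dimnames = list(NULL, paste0("s", 1:6)))

# Cluster the columns, exactly as in the heatmap code (note the transpose)
dend <- as.dendrogram(hclust(dist(t(m))))

# order.dendrogram() gives the original column indices in left-to-right leaf order
ord <- order.dendrogram(dend)
m2 <- m[, ord]

print(ord)
print(colnames(m2))
print(labels(dend))  # same order as colnames(m2)
```

After reordering, column i of the matrix sits under leaf i of the dendrogram, which is what lets the heatmap track and the dendrogram track line up.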
03
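Selection signal plot
This example uses a circular plot to show population genetic statistics commonly computed along a genome: Fst, π and Tajima's D. It uses the package's genomic functions, which later posts will cover in more depth.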
chr_info <- read.table("chr.txt", header = TRUE)
fst_data <- read.table("fst_result.txt", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE, sep = '\t')
pi_data <- read.table("pi_result.txt", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE, sep = '\t')
tajima_data <- read.table("tajima_result.txt", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE, sep = '\t')
head(fst_data)
head(pi_data)
head(chr_info)
circos.clear()
circos.par(start.degree = 90, gap.degree = 2)
circos.genomicInitialize(chr_info, plotType = NULL)  ## no default axis or label track

#### Track 1: chromosome IDs
circos.track(ylim = c(0, 1), panel.fun = function(x, y) {
  chr = CELL_META$sector.index
  xlim = CELL_META$xlim
  ylim = CELL_META$ylim
  circos.rect(xlim[1], 0, xlim[2], 1, col = rand_color(1))  ## one random colour per chromosome
  circos.text(mean(xlim), mean(ylim), chr, cex = 1, col = "white",
              facing = "inside", niceFacing = TRUE)
}, track.height = 0.1, bg.border = NA)

#### Track 2: Tajima's D
bed030 <- read.table("tajima_result.txt", header = TRUE, sep = '\t')
names(bed030) <- c("chr", "start", "end", "mid", "value", "value2")
bed30 <- bed030[, c("chr", "start", "end", "value")]
circos.genomicTrackPlotRegion(bed30, ylim = c(-1.5, 5), panel.fun = function(region, value, ...) {
  circos.genomicLines(region, value, type = "l", col = 'lightpink', ...)
})

bed31 <- bed030[, c("chr", "start", "end", "value2")]
names(bed31) <- c("chr", "start", "end", "value")
## setting track.index draws into an existing track, so several data series can share one track
circos.genomicTrackPlotRegion(bed31, track.index = 2, ylim = c(-1.5, 5), panel.fun = function(region, value, ...) {
  circos.genomicLines(region, value, type = "l", col = 'turquoise2', ...)
})

#### Track 3: Fst
bed010 <- read.table("fst_result.txt", header = TRUE, sep = '\t')
names(bed010) <- c("chr", "start", "end", "mid", "value")
bed10 <- bed010[, c("chr", "start", "end", "value")]
circos.genomicTrackPlotRegion(bed10, panel.fun = function(region, value, ...) {
  circos.genomicLines(region, value, type = "l", col = 'green', ...)
})

#### Track 4: π ratio
bed020 <- read.table("pi_result.txt", header = TRUE, sep = '\t')
bed020$pi_com <- bed020$Pi_tropical / bed020$Pi_temperate
names(bed020) <- c("chr", "start", "end", "mid", "value1", "value2", "value")
bed20 <- bed020[, c("chr", "start", "end", "value")]
circos.genomicTrackPlotRegion(bed20, ylim = c(0, 3), panel.fun = function(region, value, ...) {
  circos.genomicLines(region, value, type = "l", col = 'blue', ...)
})
circos.clear()
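The input files for this example are not included in the post, so the following sketch rebuilds the same structure from synthetic data. Everything about the data is an assumption made for illustration: three made-up chromosomes, 100 kb windows, and uniform random values standing in for a statistic such as Fst. Only the layout of the data frames (chromosome, start, end, then value columns) reflects what the genomic functions are given in the code above.

```r
library(circlize)
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively
set.seed(7)

# Synthetic chromosome table: name, start, end
chr_info <- data.frame(chr = paste0("chr", 1:3),
                       start = 0,
                       end = c(3e6, 2e6, 1e6))

# Synthetic windowed statistic in BED-like form: chr, start, end, value
win <- 1e5
bed <- do.call(rbind, lapply(seq_len(nrow(chr_info)), function(i) {
  starts <- seq(0, chr_info$end[i] - win, by = win)
  data.frame(chr = chr_info$chr[i], start = starts, end = starts + win,
             value = runif(length(starts)))
}))

circos.par(start.degree = 90, gap.degree = 2)
circos.genomicInitialize(chr_info, plotType = NULL)

# Track 1: chromosome names
circos.track(ylim = c(0, 1), track.height = 0.1, bg.border = NA,
             panel.fun = function(x, y) {
  circos.text(CELL_META$xcenter, CELL_META$ycenter, CELL_META$sector.index, cex = 0.8)
})

# Track 2: the statistic drawn as a line along each chromosome
circos.genomicTrackPlotRegion(bed, ylim = c(0, 1),
                              panel.fun = function(region, value, ...) {
  circos.genomicLines(region, value, type = "l", col = "green", ...)
})
circos.clear()
invisible(dev.off())

print(head(bed))
```

Replacing the synthetic data frames with tables read from real result files, in the same column order, should give the kind of figure described in this section.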
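After this short introduction, you should have a working picture of what this R package can do. If you are interested, give it a try yourself.
Author: Bio_gevin
Reviewer: 童蒙
Layout: amethyst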