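Genome Analysis: Circos Plotting Basics (Part 3)
Hello everyone, it's great to see you again. Last week we drew a genome circle plot that already looked decent, but it still fell short of our goal. Today I'll write up the final version of the plot for your reference.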
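Picking up from last week's figure, two pieces of information are still missing: a GC content track and a GC skew track. GC content is probably familiar to everyone, but what is GC skew? First, the formula: GC skew = (G - C) / (G + C). It measures the relative abundance of G versus C: if G > C the GC skew is positive, and if G < C it is negative. In most bacterial genomes, the leading strand and the lagging strand differ noticeably in base composition: the leading strand is enriched in G and T, while the lagging strand carries somewhat more A and C. These deviations from the expected base frequencies A = T and C = G are called AT skew and GC skew. Because GC skew is usually more pronounced than AT skew, by convention it is the one most often considered.
On each scaffold we use a 1000 bp window. For GC content, we subtract the genome-wide average (Average_GC) from each window's GC content: values >= 0 are written to positive_gc_count.txt, and values < 0 to negative_gc_count.txt. For GC skew, (G - C) / (G + C), values >= 0 are written to positive_gc_skew.txt, and values < 0 to negative_gc_skew.txt. Each line holds the scaffold name, window start, window end, and value, like this: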
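To make the formula concrete, here is a minimal Python sketch of GC skew for a single sequence. The function name `gc_skew` and the choice to return 0.0 for a sequence with no G or C are my own assumptions for illustration, not part of the original workflow.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one sequence."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        # Assumption: a sequence with no G or C is treated as neutral (0.0)
        # rather than raising a division-by-zero error.
        return 0.0
    return (g - c) / (g + c)


print(gc_skew("GGGCAT"))  # G=3, C=1, so (3-1)/(3+1) = 0.5
print(gc_skew("GCCCAT"))  # G=1, C=3, so (1-3)/(1+3) = -0.5
```

A positive result means more G than C in the sequence, and a negative result means the opposite.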
scaffold1 1 1000 0.0940111701086518
scaffold1 1001 2000 0.0980111701086517
scaffold1 2001 3000 0.0790111701086518
scaffold1 3001 4000 0.0450111701086518
scaffold1 4001 5000 0.102011170108652
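The windowing step can be sketched in Python roughly as follows. This is only a sketch under stated assumptions, not the script actually used for this post: the function names `window_tracks` and `write_tracks` are hypothetical, the sequences are assumed to be already loaded into a dict mapping scaffold name to sequence, and windows containing no G or C are given a skew of 0.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_skew(seq):
    """(G - C) / (G + C); 0.0 when the window has no G or C (assumption)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def window_tracks(scaffolds, window=1000):
    """Build the four track files as {filename: [lines]}.

    scaffolds: dict of scaffold name -> sequence string (assumed input shape).
    """
    # Genome-wide average GC, computed over all scaffolds together.
    total_len = sum(len(s) for s in scaffolds.values())
    total_gc = sum(gc_fraction(s) * len(s) for s in scaffolds.values())
    average_gc = total_gc / total_len

    tracks = {
        "positive_gc_count.txt": [],
        "negative_gc_count.txt": [],
        "positive_gc_skew.txt": [],
        "negative_gc_skew.txt": [],
    }
    for name, seq in scaffolds.items():
        for start in range(0, len(seq), window):
            chunk = seq[start:start + window]
            # 1-based inclusive coordinates, matching the sample lines.
            prefix = f"{name} {start + 1} {start + len(chunk)}"

            deviation = gc_fraction(chunk) - average_gc
            sign = "positive" if deviation >= 0 else "negative"
            tracks[f"{sign}_gc_count.txt"].append(f"{prefix} {deviation}")

            skew = gc_skew(chunk)
            sign = "positive" if skew >= 0 else "negative"
            tracks[f"{sign}_gc_skew.txt"].append(f"{prefix} {skew}")
    return tracks


def write_tracks(tracks):
    """Write each track to its file, one record per line."""
    for filename, lines in tracks.items():
        with open(filename, "w") as handle:
            handle.write("\n".join(lines) + "\n")


# Tiny demonstration with a 4 bp window so the numbers are easy to check.
demo = window_tracks({"scaffold1": "GGGGCCAT"}, window=4)
for filename, lines in demo.items():
    print(filename, lines)
```

For a real genome you would load the scaffolds from a FASTA file, call `window_tracks` with the default 1000 bp window, and pass the result to `write_tracks`.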
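Open the circos.conf configuration file once more and add the following lines after the existing content (in the full listing further down, they sit just before the housekeeping include):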
<plots>
type = line
thickness = 1p
<plot>
z = 2
max_gap = 0u
file = positive_gc_count.txt
color = red
fill_color = red
r0 = 0.48r
r1 = 0.57r
orientation = out
</plot>
<plot>
z = 2
max_gap = 0u
file = negative_gc_count.txt
color = blue
fill_color = blue
r0 = 0.39r
r1 = 0.48r
orientation = out
</plot>
<plot>
z = 2
max_gap = 0u
file = positive_gc_skew.txt
color = aquamarine
fill_color = aquamarine
r0 = 0.27r
r1 = 0.36r
orientation = out
</plot>
<plot>
z = 2
max_gap = 0u
file = negative_gc_skew.txt
color = orange
fill_color = orange
r0 = 0.18r
r1 = 0.27r
orientation = out
</plot>
</plots>
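And with that, our circle plot looks great! As shown in the figure:
Quite a sense of accomplishment, isn't it?
Some readers have asked how the legend and text labels were added. That requires some SVG editing skills, and if I get the chance I will explain it in detail in a future post. As usual, the complete code for this figure is listed below; if you need to adjust the figure, feel free to modify it: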
<<include etc/colors_fonts_patterns.conf>>
<image>
<<include etc/image.conf>>
</image>
karyotype = karyotype.txt
chromosomes_units = 100000
chromosomes_display_default = yes
<ideogram>
<spacing>
default = 0.005r
</spacing>
radius = 0.80r
thickness = 6p
fill = yes
fill_color = deepskyblue
stroke_color = black
stroke_thickness = 1p
show_label = yes
label_font = light
label_radius = 1r + 110p
label_size = 30
label_parallel = no
</ideogram>
show_ticks = yes
show_tick_labels = yes
<ticks>
skip_first_label = no
skip_last_label = no
radius = dims(ideogram,radius_outer)
color = black
thickness = 2p
size = 30p
multiplier = 1e-6
format = %.2f
<tick>
spacing = 20000b
size = 10p
show_label = no
thickness = 3p
</tick>
<tick>
spacing = 100000b
size = 20p
show_label = yes
label_size = 25p
label_offset = 10p
format = %.2f
</tick>
</ticks>
<highlights>
z = 0
<highlight>
file = sensegene.gff
r0 = 0.90r
r1 = 0.99r
fill_color = 169,169,169
</highlight>
<highlight>
file = temp.txt
r0 = 0.90r
r1 = 0.901r
fill_color = black
</highlight>
<highlight>
file = antigene.gff
r0 = 0.81r
r1 = 0.90r
fill_color = 169,169,169
</highlight>
<highlight>
file = sense_strand_ncRNA.txt
r0 = 0.69r
r1 = 0.78r
</highlight>
<highlight>
file = temp.txt
r0 = 0.69r
r1 = 0.691r
fill_color = black
</highlight>
<highlight>
file = antisense_strand_ncRNA.txt
r0 = 0.60r
r1 = 0.69r
</highlight>
</highlights>
<plots>
type = line
thickness = 1p
<plot>
z = 2
max_gap = 0u
file = positive_gc_count.txt
color = red
fill_color = red
r0 = 0.48r
r1 = 0.57r
orientation = out
</plot>
<plot>
z = 2
max_gap = 0u
file = negative_gc_count.txt
color = blue
fill_color = blue
r0 = 0.39r
r1 = 0.48r
orientation = out
</plot>
<plot>
z = 2
max_gap = 0u
file = positive_gc_skew.txt
color = aquamarine
fill_color = aquamarine
r0 = 0.27r
r1 = 0.36r
orientation = out
</plot>
<plot>
z = 2
max_gap = 0u
file = negative_gc_skew.txt
color = orange
fill_color = orange
r0 = 0.18r
r1 = 0.27r
orientation = out
</plot>
</plots>
<<include etc/housekeeping.conf>>
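The New Year is almost here, so I'd like to wish all of you a happy New Year and good progress in your studies. If there is anything you'd like to discuss with me, send your question through the account's message box and I'll do my best to explain.
I'm heading home next week, so this is my last update for the year. See you next year!
Feel free to follow and share~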