Showing correlations between variables with ggpairs
Let's get started with today's topic.
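ggplot2 is a plotting system for R based on the grammar of graphics. GGally extends ggplot2 by adding several functions that reduce the complexity of combining geoms with transformed data. These functions include a pairwise plot matrix, a scatterplot matrix, a parallel coordinates plot, a survival plot, and several functions for plotting networks.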
Pairwise comparison of multivariate data
ggpairs()
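ggpairs() is a special form of ggmatrix() that produces a pairwise comparison of multivariate data. By default, ggpairs() provides two different comparisons of each pair of columns and displays either the density or the count of each variable along the diagonal. With different argument settings, the diagonal can instead show the axis values and variable labels.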
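As a small illustration of replacing the diagonal, the sketch below uses the axisLabels argument of ggpairs(). It assumes GGally 2.x and that the reshape package (which provides tips) is installed; the object name pm_internal is made up for this example.

```r
library(GGally)
data(tips, package = "reshape")

# axisLabels = "internal" draws the variable names and axis values
# on the diagonal instead of the density / bar panels
pm_internal <- ggpairs(tips, columns = 1:3, axisLabels = "internal")
pm_internal
```

The other accepted values are "show" (the default) and "none".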
ggpairs() has many hidden features.
The tips dataset is used here to demonstrate the function.
library(GGally)
## Loading required package: ggplot2
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
data(tips, package = "reshape")
pm <- ggpairs(tips) # all columns are shown by default
pm
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Choose which columns to display and change the displayed column labels
pm <- ggpairs(tips, columns = c(1:4), columnLabels = c("Total Bill","Tip","Sex","Smoker"))
pm
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Change the color mapping
library(ggplot2)
pm <- ggpairs(tips, columns = 1:4, columnLabels = c("Total Bill","Tip","Sex","Smoker"), mapping = aes(color = smoker))
pm
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
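Change how the upper triangle, lower triangle, and diagonal are displayed
There are three main arguments: lower, upper, and diag. lower and upper each accept three types: continuous, combo, and discrete. diag accepts only two: continuous or discrete.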
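To make the three arguments concrete, here is a sketch that spells out every section and type explicitly. The plot names used are, to my understanding, the GGally 2.x defaults, so the result should look the same as a plain ggpairs() call; treat the exact names as an assumption and check ?ggpairs for your installed version. The object name pm_explicit is made up for this example.

```r
library(GGally)
data(tips, package = "reshape")

pm_explicit <- ggpairs(
  tips, columns = c("total_bill", "tip", "sex", "smoker"),
  # upper triangle: one entry per pair type
  upper = list(continuous = "cor", combo = "box_no_facet", discrete = "count"),
  # lower triangle
  lower = list(continuous = "points", combo = "facethist", discrete = "facetbar"),
  # diagonal: a single variable, so there is no "combo" type
  diag = list(continuous = "densityDiag", discrete = "barDiag")
)
pm_explicit
```

Here "continuous" applies to pairs of numeric columns, "discrete" to pairs of categorical columns, and "combo" to one of each.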
Each section can be adjusted by supplying a list:
pm <- ggpairs(tips,
  columns = c("total_bill", "time", "tip"),
  lower = list(
    continuous = "smooth", # add a fitted line
    combo = "facetdensity",
    mapping = aes(color = time)
  )
)
pm
pm <- ggpairs(
  tips, columns = c("total_bill", "time", "tip"),
  upper = "blank",
  diag = NULL
)
pm
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Change the font size
ggpairs(tips, columns = 1:4, mapping = aes(color = time, alpha = 0.5),
  upper = list(
    continuous = wrap("cor", size = 3) # change the font size of the correlation text
  )
)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
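The `stat_bin()` messages in the output come from the histogram panels that ggpairs() draws by default for continuous/discrete pairs in the lower triangle ("facethist"). The same wrap() mechanism can pass a bin setting to those panels, which should silence the message. A sketch, assuming GGally 2.x; the value bins = 20 and the object name pm_bins are arbitrary choices for illustration.

```r
library(GGally)
data(tips, package = "reshape")

pm_bins <- ggpairs(
  tips, columns = 1:4,
  lower = list(
    # forward bins = 20 to the histogram panels instead of the default 30
    combo = wrap("facethist", bins = 20)
  )
)
pm_bins
```

Passing binwidth instead of bins works the same way when a fixed bin width is more meaningful for the data.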
Change the plot type
ggpairs(tips, columns = 1:4, mapping = aes(color = sex, alpha = 0.6),
  upper = list(
    continuous = "points"
  )
)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Hide the diagonal
ggpairs(tips, columns = 1:3, mapping = aes(color = sex),
  diag = "blank"
)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Combine with ggplot2 themes
ggpairs(tips, columns = 1:3) +
  theme(legend.position = "none",
        panel.grid.major = element_blank(),
        axis.ticks = element_blank(),
        panel.border = element_rect(linetype = "dashed", colour = "black", fill = NA))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Extract a single subplot
pm[3,1]
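An extracted subplot is an ordinary ggplot object, so it can be modified and written back into the matrix. The sketch below rebuilds a three-column matrix like the one above, then adds a linear fit to one panel; the choice of geom_smooth() is only for illustration.

```r
library(GGally)
library(ggplot2)
data(tips, package = "reshape")

pm <- ggpairs(tips, columns = c("total_bill", "time", "tip"))

# row 3, column 1: tip (y) against total_bill (x)
p <- pm[3, 1]

# modify it like any other ggplot
p <- p + geom_smooth(method = "lm", se = FALSE)

# write the modified panel back into the same position
pm[3, 1] <- p
pm
```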
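Custom functions
A custom function must have the following form:
custom_function <- function(data, mapping, ...){
  # ggplot2 code that builds and returns a plot
}
For example, to change the colors of one kind of subplot: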
my_bin <- function(data, mapping, ..., low = "#132B43", high = "#56B1F7") {
  ggplot(data = data, mapping = mapping) +
    geom_bin2d(...) +
    scale_fill_gradient(low = low, high = high)
}
pm <- ggpairs(
  tips, columns = c("total_bill", "time", "tip"),
  lower = list(
    continuous = my_bin
  )
)
pm
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
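Because my_bin exposes low and high as arguments and forwards ... to geom_bin2d(), wrap() can set them without rewriting the function. A sketch along the lines of the GGally documentation; the binwidth and color values and the object name pm_wrapped are arbitrary.

```r
library(GGally)
library(ggplot2)
data(tips, package = "reshape")

my_bin <- function(data, mapping, ..., low = "#132B43", high = "#56B1F7") {
  ggplot(data = data, mapping = mapping) +
    geom_bin2d(...) +
    scale_fill_gradient(low = low, high = high)
}

pm_wrapped <- ggpairs(
  tips, columns = c("total_bill", "time", "tip"),
  lower = list(
    # binwidth goes through ... to geom_bin2d(); high overrides the default
    continuous = wrap(my_bin, binwidth = c(5, 0.5), high = "red")
  )
)
pm_wrapped
```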
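Change the default color palette
This looks like a simple requirement, but it needs custom functions and is somewhat involved, so read the documentation carefully.
First, work out which kinds of variables make up each panel in the upper triangle, lower triangle, and diagonal: two continuous variables, one continuous and one discrete, or two discrete. Then check the documentation to see which plot functions are available to customize.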
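One way to work out the panel types before writing any custom functions is to classify the columns by hand. The helper below is a hypothetical sketch, not part of GGally: it treats numeric columns as continuous and everything else as discrete, which is a simplification of what ggpairs() does internally.

```r
data(tips, package = "reshape")

# classify each column: numeric -> "continuous", otherwise "discrete"
col_type <- vapply(
  tips,
  function(x) if (is.numeric(x)) "continuous" else "discrete",
  character(1)
)
col_type

# hypothetical helper: panel type for a pair of column names
pair_type <- function(a, b) {
  types <- col_type[c(a, b)]
  if (all(types == "continuous")) {
    "continuous"
  } else if (all(types == "discrete")) {
    "discrete"
  } else {
    "combo"
  }
}

pair_type("total_bill", "tip")  # two numeric columns
pair_type("total_bill", "sex")  # one numeric, one factor
pair_type("sex", "smoker")      # two factors
```

Each type then needs its own entry in the upper, lower, and diag lists, which is why the code that follows defines one wrapper per plot function.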
# wrap the default plot functions and override their color scales
my_cor <- function(data, mapping, ...){
  ggally_cor(data = data, mapping = mapping, size = 2.5) +
    scale_color_manual(values = c("#E41A1C","#377EB8"))
}
my_smooth <- function(data, mapping, ...){
  ggally_smooth(data = data, mapping = mapping) +
    scale_color_manual(values = c("#E41A1C","#377EB8"))
}
my_bardiag <- function(data, mapping, ...){
  ggally_barDiag(data = data, mapping = mapping) +
    scale_fill_manual(values = c("#E41A1C","#377EB8"))
}
my_box <- function(data, mapping, ...){
  ggally_box(data = data, mapping = mapping) +
    scale_fill_manual(values = c("#E41A1C","#377EB8"))
}
my_count <- function(data, mapping, ...){
  ggally_count(data = data, mapping = mapping) +
    scale_fill_manual(values = c("#E41A1C","#377EB8"))
}
my_hist <- function(data, mapping, ...){
  ggally_facethist(data = data, mapping = mapping) +
    scale_fill_manual(values = c("#E41A1C","#377EB8"))
}
my_bar <- function(data, mapping, ...){
  ggally_facetbar(data = data, mapping = mapping) +
    scale_fill_manual(values = c("#E41A1C","#377EB8"))
}
# draw the plot matrix with the customized functions
ggpairs(tips,
  aes(color = sex),
  upper = list(continuous = wrap(my_cor),
               combo = wrap(my_box),
               discrete = wrap(my_count)),
  lower = list(continuous = wrap(my_smooth),
               combo = wrap(my_hist),
               discrete = wrap(my_bar)),
  diag = list(continuous = wrap(my_bardiag),
              discrete = wrap(my_bardiag))
) +
  theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
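The example above is quite involved. There is a more convenient way: add scales such as scale_fill_manual()/scale_color_manual() directly to the ggpairs object. The simple version, here using the brewer scales: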
ggpairs(tips, aes(color = sex), lower = list(continuous = "smooth")) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Set2") +
  theme_minimal()
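The same shortcut also works with the manual scales mentioned above. A sketch, assuming GGally 2.x, that reuses the two hex colors from the custom-function example; it limits the matrix to four columns to keep the plot small, and the object name pm_manual is made up for this example.

```r
library(GGally)
library(ggplot2)
data(tips, package = "reshape")

pm_manual <- ggpairs(
  tips, columns = 1:4,
  mapping = aes(color = sex),
  lower = list(continuous = "smooth")
) +
  # sex has two levels, so two colors are supplied for each scale
  scale_color_manual(values = c("#E41A1C", "#377EB8")) +
  scale_fill_manual(values = c("#E41A1C", "#377EB8")) +
  theme_minimal()
pm_manual
```

Adding a scale this way applies it to every panel at once, so it suits a single palette used throughout; per-panel differences still need the custom-function approach.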