网红玫瑰图:新冠肺炎全球疫情形势图仿制
最近一段时间大家每天都会看到这张图,非常好的展示了全球新冠肺炎的形式,昨天刚有了仿制的打算,图还没画出来今天就看到了Y叔公布的绘图方法,我也是命苦啊!
首先还是要感谢Y叔发布的“nCov2019”数据包,毕竟没数据还画什么!
另外今天看了Y叔发布的代码,也参考了一些对自己的代码进行了修改,再次感谢Y叔!
接下来进入正题,首先就是要下载“nCov2019”的数据包。
install.packages("remotes")
library(remotes)
remotes::install_github("GuangchuangYu/nCov2019")
在安装数据包的过程中提示需要升级一些依赖的包,我到最后有一个包没有升级成功,也不知道为啥,我就跳过了,并不影响图像的绘制。
之后载入绘图数据。
library(nCov2019)
data <- load_nCov2019(lang = "zh")
global <- data["global"]
数据是从2019年12月1日起每日全球范围内各个国家目前的确诊病人、治愈病人和死亡病人数目。
数据目前更新到2020年3与9号,因此,就用3月9号的数据进行绘图,先把3月9号的数据提出来并去掉中国的数据。
today <- subset(global, time == "2020-03-09")
today_g <- subset(today, country != "中国")
原图中有一个特点,就是会将确诊数目一致的国家放在一个单元进行显示,因而首先要对确诊数目相同的国家进行合并。
但是如果只是简单的合并会出现一个问题,就是死亡数目的展示会出现问题,因此首先形成新的一列,将国家和死亡数目合并,之后要将发病数目相同的国家进行合并,保留绘图所需要的数据列。
最后生成的数据是从小到达排列,要对其进行重新的排序。
today_g$country_dead <- c(ifelse(today_g$cum_dead == 0,print(today_g$country),paste(today_g$country,"(死亡",today_g$cum_dead,"例)",sep = "")))
library(dplyr)
today_g1 <- today_g %>% group_by(cum_confirm) %>% summarise(country = paste(country,collapse = ","))
today_g2 <- today_g %>% group_by(cum_confirm) %>% summarise(country_dead = paste(country_dead,collapse = "、"))
today_g3 <- today_g %>% group_by(cum_confirm) %>% summarise(cum_dead = paste(cum_dead,collapse = ","))
today_g4 <- merge(today_g1,today_g2)
today_g5 <- merge(today_g4,today_g3)
today_g5 <- as.data.frame(today_g5)
today_g5 <- today_g5[order(today_g5[,"cum_confirm"],decreasing = TRUE),]
图像使用了排名前40的国家和地区进行绘制,因此,首先选出排名前40的国家。
today_top <- today_g5[1:40,]
手动观察一下数据,发现排名第40的波兰和斯洛文尼亚为16例,因而手动添加一行低于16例的国家,并设置起数值为15.
a <- data.frame(cum_confirm = 15,country = "低于16例的国家",country_dead = "",cum_dead = "")
today_top <- rbind(today_top,a)
由于画图是默认是按字母顺序展示,因而要首先将国家列定位因子,同时建立一个角度列,用于文字的添加。
today_top$country <- factor(today_top$country,levels = today_top$country)
today_top$angle <- 1:41*360/41
下面是出图的代码。
library(ggplot2)
ggplot(today_top,aes(x = country, y = cum_confirm, fill = country))+
#添加正常的条形图
geom_bar(stat = "identity",position = "stack",width = 1)+
#再添加一个纯白的条形图以展示中心的白色区域
geom_bar(aes(y = I(2)),stat = "identity",width = 1,fill = "white")+
#在添加两个透明的白色条形图以表示中间部分的光环
geom_bar(aes(y = I(4)),stat="identity",width = 1,fill = "white",alpha = 0.2)+
geom_bar(aes(y = I(6)),stat="identity",width = 1,fill = "white",alpha = 0.1)+
#由于数据差异特别大,因此需要将数值进行log转换,不然后面的国家根本看不见
scale_y_log10()+
#将条形去转换极坐标,参数设置为direction = -1,使得图像逆时针显示
coord_polar(direction = -1)+
#去除背景、坐标轴、标签等各种无关信息
theme_void()+
#去掉图例
theme(legend.position = "none")+
geom_text(aes(label = paste(country,paste(cum_confirm,"例",sep = ""),
sep = "\n"),
y = cum_confirm*.8, angle = angle),
data = today_top[1:5,],color = "white",
fontface = "bold", vjust = 1, family = "STKaiti",size = 2.2)+
geom_text(aes(label = paste("死亡",cum_dead,"例",sep = ""),
y = cum_confirm*.4, angle = angle),
data = today_top[1:5,],color = "white",
fontface = "bold", vjust = 1, family = "STKaiti",size = 1.5)+
geom_text(aes(label = country, y = cum_confirm*.8),
data = today_top[6:20,],color = "white",
vjust = seq(0.4,0,length.out = 15),
hjust = seq(0.2,0.5,length.out = 15),
fontface = "bold", family = "STKaiti",
size = c(2.2,1,seq(2.2,1,length.out = 13)))+
geom_text(aes(label = paste(cum_confirm,"例",sep = ""),
y = cum_confirm*c(seq(.55,.45,length.out = 6),
seq(.47,.65,length.out = 9))),
data = today_top[6:20,],color = "white",
vjust = seq(0.5,0,length.out = 15),
hjust = seq(0,0.5,length.out = 15),
fontface = "bold", family = "STKaiti",
size = seq(1.9,1.5,length.out = 15))+
geom_text(aes(label = paste("死亡",cum_dead,"例",sep = ""),
y = cum_confirm*c(seq(.35,.21,length.out = 6),
seq(.22,.45,length.out = 9))),
data = today_top[6:20,],color = "white",
vjust = seq(0.4,0,length.out = 15),
hjust = seq(0.2,0.5,length.out = 15),
fontface = "bold", family = "STKaiti",
size = 1.2)+
geom_text(aes(label = paste(paste(cum_confirm,"例",sep = ""),
country_dead,sep = " "),
y = cum_confirm*1.1,angle = angle+90),
data = today_top[21:40,],color = "black",
vjust = 0, hjust = 0,
fontface = "bold", family = "STKaiti",size = 1.5)+
geom_text(aes(label = country, y = cum_confirm*1.1,angle = angle+90),
data = today_top[41,],color = "black",
vjust = 0, hjust = 0,
fontface = "bold", family = "STKaiti",size = 1.5)
目前出的图与原图相比,还是有几点没有解决的问题。
首先就是原图中的颜色并不是一个简单的颜色梯度,而是存在这不符合顺序的深浅变化。
第二是原图的每个扇形区域的大小并不是想等的,而是确认人数越多的国家扇形区域的角度越大。
第三个就是哪些排名在40之后的国家确诊数目的添加,当然,这一点可以用PS来解决,但这就没办法全自动出图了。