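Visualizing the COVID-19 Pandemic in R
nCov2019 is an R package developed by Y叔's team (YuLab-SMU). It retrieves COVID-19 data such as new cases, deaths, cumulative totals, and vaccine information.
Installation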
remotes::install_github("YuLab-SMU/nCov2019")
Getting the data
library(nCov2019)
res <- query()
## Querying the latest data...
## last update: 2022-02-24
## Querying the global data...
## Gloabl total 430662046 cases; and 5939601 deaths
## Gloabl total affect country or areas: 226
## Gloabl total recovered cases: 819913
## last update: 2022-02-24
## Querying the historical data...
## Querying the vaccine data...
## Total Candidates Programs : 51
## Querying the therapeutics data...
## Total Candidates Programs : 84
## Query finish, each time you can launch query() to reflash the data
A single line of code retrieves the latest data, global data, historical data, vaccine data, and more.
List the available data sets:
names(res)
## [1] "latest" "global" "historical" "vaccine" "therapeutics"
"therapeutics" holds data on the development of treatments.
Global data
x <- res$global
x$affectedCountries # number of affected countries or areas
## [1] 226
summary(x) # summary
## Gloabl total 430662046 cases; and 5939601 deaths
## Gloabl total affect country or areas: 226
## Gloabl total recovered cases: 819913
## last update: 2022-02-24
Latest data
x <- res$latest
print(x) # date of the last update
## last update: 2022-02-24
head(x$detail) # 23 columns in total
## updated country countryInfo._id countryInfo.iso2 countryInfo.iso3
## 1 2022-02-24 Germany 276 DE DEU
## 2 2022-02-24 S. Korea 410 KR KOR
## 3 2022-02-24 Russia 643 RU RUS
## 4 2022-02-24 Brazil 76 BR BRA
## 5 2022-02-24 Turkey 792 TR TUR
## 6 2022-02-24 USA 840 US USA
## countryInfo.lat countryInfo.long countryInfo.flag
## 1 51 9.0 https://disease.sh/assets/img/flags/de.png
## 2 37 127.5 https://disease.sh/assets/img/flags/kr.png
## 3 60 100.0 https://disease.sh/assets/img/flags/ru.png
## 4 -10 -55.0 https://disease.sh/assets/img/flags/br.png
## 5 39 35.0 https://disease.sh/assets/img/flags/tr.png
## 6 38 -97.0 https://disease.sh/assets/img/flags/us.png
## cases todayCases deaths todayDeaths recovered todayRecovered active
## 1 14092621 219859 122622 253 10234100 202900 3735899
## 2 2329182 171448 7607 99 969524 32633 1352051
## 3 15795570 137642 347816 785 12836228 179203 2611526
## 4 28485502 133626 646490 956 25772807 266823 2066205
## 5 13762181 86600 93258 268 12949425 95526 719498
## 6 80372404 75300 966530 2440 52453562 238409 26952312
## critical casesPerOneMillion deathsPerOneMillion tests testsPerOneMillion
## 1 2494 167322 1456 104701826 1243128
## 2 512 45366 148 15804065 307821
## 3 2300 108161 2382 273400000 1872124
## 4 8318 132464 3006 63776166 296573
## 5 1128 160340 1087 141675559 1650634
## 6 10011 240496 2892 944126758 2825085
## population continent oneCasePerPeople oneDeathPerPeople oneTestPerPeople
## 1 84224475 Europe 6 687 1
## 2 51341743 Asia 22 6749 3
## 3 146037365 Europe 9 420 1
## 4 215043743 South America 8 333 3
## 5 85830994 Asia 6 920 1
## 6 334194157 North America 4 346 0
## activePerOneMillion recoveredPerOneMillion criticalPerOneMillion
## 1 44356.45 121509.81 29.61
## 2 26334.34 18883.74 9.97
## 3 17882.59 87896.87 15.75
## 4 9608.30 119849.14 38.68
## 5 8382.73 150871.20 13.14
## 6 80648.66 156955.35 29.96
head(x["Global"],10) # global overview; rows are sorted by todayCases by default, and you can re-sort them yourself
## country cases deaths recovered active todayCases todayDeaths
## 1 Germany 14092621 122622 10234100 3735899 219859 253
## 2 S. Korea 2329182 7607 969524 1352051 171448 99
## 3 Russia 15795570 347816 12836228 2611526 137642 785
## 4 Brazil 28485502 646490 25772807 2066205 133626 956
## 5 Turkey 13762181 93258 12949425 719498 86600 268
## 6 USA 80372404 966530 52453562 26952312 75300 2440
## 7 France 22468239 137489 20010065 2320685 66833 213
## 8 Japan 4607029 22272 3783417 801340 66373 272
## 9 Indonesia 5350902 147025 4632355 571522 61488 227
## 10 Vietnam 2972378 39773 2320722 611883 60085 91
## todayRecovered population tests updated
## 1 202900 84224475 104701826 2022-02-24
## 2 32633 51341743 15804065 2022-02-24
## 3 179203 146037365 273400000 2022-02-24
## 4 266823 215043743 63776166 2022-02-24
## 5 95526 85830994 141675559 2022-02-24
## 6 238409 334194157 944126758 2022-02-24
## 7 269738 65511023 243529298 2022-02-24
## 8 84616 125839744 38016089 2022-02-24
## 9 39170 278278496 82520129 2022-02-24
## 10 15641 98778483 78664831 2022-02-24
x[c("USA","India")] # data for selected countries
## country cases deaths recovered active todayCases todayDeaths
## 6 USA 80372404 966530 52453562 26952312 75300 2440
## 30 India 42881179 512954 42219896 148329 14148 302
## todayRecovered population tests updated
## 6 238409 334194157 944126758 2022-02-24
## 30 30009 1402309809 762414018 2022-02-24
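The rows returned by x["Global"] are ordered by todayCases. Because the result prints like an ordinary data frame, re-sorting it can be sketched with base R order(). The snippet below does not call the package: it rebuilds three rows from the output above as a plain data.frame so that it runs offline, and the names latest and by_deaths are hypothetical.

```r
# Toy stand-in for x["Global"]: three rows copied from the output above.
# This is an assumption for illustration, not the package's own object.
latest <- data.frame(
  country    = c("Germany", "S. Korea", "Russia"),
  cases      = c(14092621, 2329182, 15795570),
  deaths     = c(122622, 7607, 347816),
  todayCases = c(219859, 171448, 137642),
  stringsAsFactors = FALSE
)

# Re-sort by cumulative deaths, largest first, instead of todayCases.
by_deaths <- latest[order(latest$deaths, decreasing = TRUE), ]
print(by_deaths$country)
```

The same pattern should work for any numeric column shown in the output, for example tests or active.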
Historical data
z <- res$historical
print(z) # date of the last update
## last update: 2022-02-23
It is used the same way as the latest data.
head(z["Global"])
## country date cases deaths recovered
## 1 Afghanistan 2020-01-22 0 0 0
## 199 Afghanistan 2020-01-23 0 0 0
## 397 Afghanistan 2020-01-24 0 0 0
## 595 Afghanistan 2020-01-25 0 0 0
## 793 Afghanistan 2020-01-26 0 0 0
## 991 Afghanistan 2020-01-27 0 0 0
head(z[c("China","UK","USA")])
## country date cases deaths recovered
## 39 China 2020-01-22 548 17 28
## 237 China 2020-01-23 643 18 30
## 435 China 2020-01-24 920 26 36
## 633 China 2020-01-25 1406 42 39
## 831 China 2020-01-26 2075 56 49
## 1029 China 2020-01-27 2877 82 58
For the following countries, province-level data are also available: Australia, Canada, China, Denmark, France, Netherlands.
head(z["China","hubei"])
## country province date cases deaths recovered
## 36 China hubei 2020-01-22 444 17 28
## 125 China hubei 2020-01-23 444 17 28
## 214 China hubei 2020-01-24 549 24 31
## 303 China hubei 2020-01-25 761 40 32
## 392 China hubei 2020-01-26 1058 52 42
## 481 China hubei 2020-01-27 1423 76 45
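The historical tables store cumulative counts, so daily new cases have to be derived by differencing. A minimal sketch in base R, using the six Hubei values printed above (the data frame hubei and the column new_cases are names introduced here for illustration):

```r
# Cumulative case counts for Hubei, copied from the output above.
hubei <- data.frame(
  date  = as.Date("2020-01-22") + 0:5,
  cases = c(444, 444, 549, 761, 1058, 1423)
)

# diff() returns the day-over-day change; the first day has no
# predecessor, so it is filled with NA.
hubei$new_cases <- c(NA, diff(hubei$cases))
print(hubei)
```

Using NA for the first day keeps the series honest about the missing predecessor; the dplyr code later in this post uses 0 instead, through lag(cases, default = first(cases)).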
The package also thoughtfully provides a way to convert your own data into the nCov2019 data format:
userowndata <- read.csv("path_to_user_data.csv")
Z <- convert(data = userowndata) # convert
head(Z["Global"])
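The post does not show what the user CSV should look like. The sketch below assumes it uses the same columns as the province-level output above (country, province, date, cases, deaths, recovered). That layout is an assumption, not something confirmed here, so check the package documentation before relying on it. The snippet only writes and re-reads a temporary CSV and does not call convert().

```r
# Hypothetical user data. The column names mirror the province-level
# output above and are an assumption about what convert() expects.
user_data <- data.frame(
  country   = "China",
  province  = "hubei",
  date      = as.Date("2020-01-22") + 0:2,
  cases     = c(444, 444, 549),
  deaths    = c(17, 17, 24),
  recovered = c(28, 28, 31)
)

# Write to a temporary file and read it back, as a user would.
csv_path <- tempfile(fileext = ".csv")
write.csv(user_data, csv_path, row.names = FALSE)

userowndata <- read.csv(csv_path)
userowndata$date <- as.Date(userowndata$date)  # read.csv does not return Date objects
str(userowndata)
```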
Vaccine and therapeutics data
x <- res$vaccine
summary(x)
## phase candidates
## 1 Phase 3 10
## 2 Phase 2/3 3
## 3 Phase 2 2
## 4 Phase 1/2 9
## 5 Phase 1 13
## 6 Pre-clinical 14
head(x["all"])
## id candidate
## 1 id1 BNT162
## 2 id2 mRNA-1273
## 3 id3 Ad5-nCoV
## 4 id4 AZD1222
## 5 id5 CoronaVac
## 6 id6 Covaxin
## mechanism
## 1 mRNA-based vaccine
## 2 mRNA-based vaccine
## 3 Recombinant vaccine (adenovirus type 5 vector)
## 4 Replication-deficient viral vector vaccine (adenovirus from chimpanzees)
## 5 Inactivated vaccine (formalin with alum adjuvant)
## 6 Inactivated vaccine
## sponsors trialPhase
## 1 Pfizer, BioNTech Phase 3
## 2 Moderna Phase 3
## 3 CanSino Biologics Phase 3
## 4 The University of Oxford Phase 3
## 5 Sinovac Phase 3
## 6 Bharat Biotech Phase 3
## institutions
## 1 Multiple study sites in Europe, North America and China
## 2 Kaiser Permanente Washington Health Research Institute
## 3 Tongji Hospital
## 4 The University of Oxford, 
## 5 Sinovac Research and Development Co., Ltd.
## 6
x <- res$therapeutics
summary(x)
## phase candidates
## 1 Phase 3 13
## 2 Phase 2/3/4 3
## 3 Phase 2/3 28
## 4 Phase 1/2 1
## 5 Phase 2 15
## 6 Phase 3/4 2
## 7 No longer being studied for COVID-19 4
## 8 Various 1
## 9 Phase 1 4
## 10 Phase 2b/3 2
## 11 No longer being developed for COVID-19 1
## 12 Phase 1/2/3 1
## 13 Phase 1/4 1
## 14 Phase 1b/2a 1
## 15 Phase 4 1
## 16 Phase 2/2 1
## 17 Phase 1b 4
## 18 Phase 2/4 1
head(x["all"])
## id medicationClass
## 1 id1 Antiviral
## 2 id2 Monoclonal antibody
## 3 id3 Monoclonal antibody
## 4 id4 IL-6 receptor agonist
## 5 id5 Monoclonal antibody
## 6 id6 Anticoagulant
## tradeName
## 1 Molnupiravir (Lagevrio, formerly known as MK-4482 and EIDD-2801)
## 2 Evusheld (tixagevimab and cilgavimab
## 3 Regkirona (regdanvimab, CT-P59)
## 4 Actemra/RoActemra (tocilizumab)
## 5 Amubarvimab and romlusevimab (formerly BRII-196 and BRII-198)
## 6 Heparin (UF and LMW)
## developerResearcher sponsors trialPhase lastUpdate
## 1 Ridgeback Biotherapeutics Ridgeback Biotherapeutics Phase 3 2020-12-10
## 2 AstraZeneca AstraZeneca Phase 3 2020-12-10
## 3 Celltrion Celltrion Phase 3 2020-12-10
## 4 Roche Various Phase 3 2020-12-10
## 5 Brii Biosciences Limited NIAID Phase 3 2020-12-10
## 6 NHLBI Operation Warp Speed Phase 2/3/4 2020-12-10
x[ID="id1"]
## [1] "Background: Molnupiravir (Lagevrio, formerly known as MK-4482 and EIDD-2801) is an oral broad-spectrum antiviral that has shown effectiveness against infections such as influenza, chikungunya, Ebola and equine encephalitis. It has a similar mechanism of action to remdesivir and prevents replication of the virus. In animal models, molnupiravir inhibited the replication of SARS-CoV-2 and MERS in mice, SARS-CoV-2 in Syrian hamsters, and blocks transmission of SARS-CoV-2 in ferrets, according to preclinical papers. Regulatory actions: Australia: Provisional determination status has been granted. Bangladesh: Bangladesh has authorized the use of molnupiravir and is in the process of authorizing generic manufacturers to supply the drug to its citizens.Canada: Merck Canada has initiated a rolling submission. EU: EMA started a rolling review of molnupiravir as of 25 October and noted that CHMP will provide EU-wide recommendations for early use of the treatment prior to authorization, given increasing case numbers in the region. On 23 November, EMA received an application for marketing authorization of molnupiravir under the name Lagrevrio. The European Commission has listed molnupiravir in its portfolio of ten most promising COVID-19 therapeutics. UK: On 4 November, MHRA approved molnupiravir for use in patients with mild or moderate COVID-19 at high risk of developing severe disease. US: Merck and Ridgeback Bio have applied for an EUA; an advisory committee recommended FDA authorize molnupiravir for emergency use in a 13-10 vote. Trials: A Phase 1 trial of 130 participants who received molnupiravir or placebo has been completed (NCT04392219) as has a Phase 2a safety trial (NCT04405570). A Phase 2/3 trial, the MOVe-IN trial of hospitalized adults (NCT04575584) was terminated, while its outpatient version, MOVe-OUT (NCT04575597) is active, but no longer recruiting. 
A trial to assess molnupiravir's ability to reduce viral shedding of SARS-CoV-2 (NCT04405739) continues. The Phase 3 MOVe-AHEAD trial, evaluating molnupiravir as post-exposure prophylaxis for individuals living with a person who has tested positive for COVID-19, is currently recruiting (NCT04939428). A generics manufacturer producing and supplying molnupiravir for India, Hetero, is conducting Phase 3 open-label studies on patients with mild and moderate COVID-19.Outcomes: Merck announced results from the Phase 3 MOVe-OUT trial in October 2021, which showed a 50% reduction in hospitalization for non-hospitalized patients with mild or moderate COVID-19 who received the drug. Further results from MOVe-OUT, announced on 26 November, lowered the drug's effectiveness at reducing the risk of hospitalization to 30%. Hetero, a manufacturer that has entered a non-exclusive licensing agreement with Merck regarding molnupiravir, announced results from their own Phase 3 open-label study that showed taking molnupiravir resulted in early clinical improvement and improved median time to clinical improvement compared with standard of care. Early phase results, including from the Phase 1 trial published in Antimicrobial Agents and Chemotherapy, showed molnupiravir was safe and well-tolerated in humans. Results from a Phase 2a trial of a secondary outcome in 202 participants, presented at the CROI meeting, showed a significant reduction in negative SARS-CoV-2 viral culture at 5 days compared with the placebo group. Status: Merck and Ridgeback Biotherapeutics announced on 15 April that they will proceed with the Phase 3 portion of the MOVe-OUT trial, but would not be continuing with the MOVe-IN trial after an analysis revealed molnupiravir was not effective for hospitalized adults with COVID-19. Pending an EUA, the US government has agreed to purchase 1.7 million 5-day treatment courses of molnupiravir for $1.2 billion. 
Other countries have also taken steps to procure molnupiravir following positive Phase 3 results, such as Australia, Malaysia, South Korea, and Thailand."
Visualization
With the data above you can already draw all kinds of plots yourself. For convenience, the package also provides a ready-made plot function:
plot(
x,
region = "Global",
continuous_scale = FALSE,
palette = "Reds",
date = NULL,
from = NULL,
to = NULL,
title = "COVID-19",
type = "cases",
...
)
type accepts the following values: "cases", "deaths", "recovered", "active", "todayCases", "todayDeaths", "todayRecovered", "population", "tests". The default palette is red ("Reds") and can be changed.
Latest data:
X <- res$latest
plot(X)
## Warning: Ignoring unknown aesthetics: x, y
plot(X, type="tests", palette="Greens") # the original call used "Green", which is not a valid palette name and triggered "Unknown palette Green"
## Warning: Ignoring unknown aesthetics: x, y
Compare daily new confirmed cases across countries:
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
X <- res$historical
tmp <- X["global"] %>%
group_by(country) %>%
arrange(country,date) %>%
mutate(diff = cases - lag(cases, default = first(cases))) %>%
filter(country %in% c("Australia", "Japan", "Italy", "Germany", "China"))
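ggplot(tmp, aes(date, log(diff + 1), color = country)) + geom_line() +
labs(y = "log(daily increase in cases + 1)") + # log() is the natural log, so the label no longer says Log2
theme_minimal() + # apply the complete theme first so it does not reset the axis-text settings below
theme(axis.text = element_text(angle = 15, hjust = 1)) +
scale_x_date(date_labels = "%Y-%m-%d")
## Warning in log(diff + 1): NaNs produced
## Warning in log(diff + 1): NaNs produced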
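The NaN warnings appear when diff + 1 is negative, which happens whenever a cumulative count is revised downward from one day to the next. One possible way to handle this, sketched below with hypothetical numbers, is to clip negative daily changes to zero before taking the logarithm. Whether clipping is appropriate depends on how you want to treat data corrections.

```r
# Hypothetical cumulative counts with one downward revision (120 -> 115).
cumulative <- c(100, 120, 115, 140)

# Daily change, with 0 for the first day, as in the dplyr code above.
daily <- c(0, diff(cumulative))

# log() of a negative number is NaN; suppressWarnings() only hides the
# warning here so the effect can be inspected.
raw_log <- suppressWarnings(log(daily + 1))
print(raw_log)

# Clip negative changes to zero, then transform.
clipped   <- pmax(daily, 0)
log_daily <- log(clipped + 1)
print(log_daily)
```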
You can also view the situation on a given day in the past:
Y <- res$historical
plot(Y, region="Global" ,date = "2020-05-01", type="cases")
## Warning: Ignoring unknown aesthetics: x, y
## Warning: Transformation introduced infinite values in discrete y-axis
Animation
from = "2020-03-01"
to = "2020-04-01"
y = res$historical
plot(y, from = from, to=to) # rendering is slow, so do not make the date range too long
The result is saved to the current directory automatically.
## A gif, nCov2019.gif, was generated in current directory
Other visualizations
library(ggplot2)
x <- res$historical
d <- x['USA' ] # can be replaced with another country
d <- d[order(d$cases), ]
ggplot(d,
aes(date, cases)) +
geom_col(fill = 'firebrick') +
theme_minimal(base_size = 14) +
xlab(NULL) + ylab(NULL) +
scale_x_date(date_labels = "%Y/%m/%d") +
labs(caption = paste("accessed date:", max(d$date)))
The top 10 countries by daily new cases yesterday:
library("dplyr")
library("ggrepel")
x <- res$latest
y <- res$historical
country_list = x["global"]$country[1:10]
y[country_list] %>%
subset( date > as.Date("2020-10-01") ) %>%
group_by(country) %>%
arrange(country,date) %>%
mutate(increase = cases - lag(cases, default = first(cases))) -> df
ggplot(df, aes(x=date, y=increase, color=country ))+
geom_smooth() +
geom_label_repel(aes(label = paste(country,increase)),
data = df[df$date == max(df$date), ], hjust = 1) +
labs(x=NULL,y=NULL)+
theme_bw() + theme(legend.position = 'none')
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: ggrepel: 1 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
Confirmed cases, deaths, and recoveries for a single country:
library('tidyr')
library('ggrepel')
library('ggplot2')
y <- res$historical
country = "India"
y[country] -> d
d <- gather(d, curve, count, -date, -country)
ggplot(d, aes(date, count, color = curve)) + geom_point() + geom_line() +
labs(x=NULL,y=NULL,title=paste("Trend of cases, recovered and deaths in", country)) +
scale_color_manual(values=c("#f39c12", "#dd4b39", "#00a65a")) +
theme_bw() +
geom_label_repel(aes(label = paste(curve,count)),
data = d[d$date == max(d$date), ], hjust = 1) +
theme(legend.position = "none",
axis.text = element_text(angle = 15, hjust = 1)) +
scale_x_date(date_labels = "%Y-%m-%d")
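The gather() call above reshapes the table from wide format (one column each for cases, deaths, and recovered) to long format (one row per date and curve), which is what lets ggplot2 map the color aesthetic to curve. The tidyr documentation marks gather() as superseded by pivot_longer(). To show what the reshape does without any dependency, here is a base R sketch using three Hubei rows from the historical output earlier (the names wide, long, and curves are introduced here for illustration):

```r
# Wide format: one column per curve. Values copied from the Hubei output above.
wide <- data.frame(
  date      = as.Date("2020-01-22") + 0:2,
  cases     = c(444, 444, 549),
  deaths    = c(17, 17, 24),
  recovered = c(28, 28, 31)
)

# Long format: stack the three value columns and record which curve
# each value came from.
curves <- c("cases", "deaths", "recovered")
long <- data.frame(
  date  = rep(wide$date, times = length(curves)),
  curve = rep(curves, each = nrow(wide)),
  count = unlist(wide[curves], use.names = FALSE)
)
print(long)
```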
Heatmap of cases by country
library('tidyr')
library('ggrepel')
library('ggplot2')
y <- res$historical
d <- y["global"]
d <- d[d$cases > 0,]
length(unique(d$country)) # number of countries with more than 0 cases
## [1] 198
d <- subset(d,date <= as.Date("2020-3-19"))
max_time <- max(d$date)
min_time <- max_time - 7
d <- d[d$date >= min_time,]
dd <- d[d$date == max(d$date,na.rm = TRUE),]
d$country <- factor(d$country,
levels=unique(dd$country[order(dd$cases)]))
breaks = c(0,1000, 10000, 100000, 10000000)
ggplot(d, aes(country, date)) +
geom_tile(aes(fill = cases), color = 'black') +
scale_fill_viridis_c(trans = 'log', breaks = breaks,
labels = breaks) +
xlab(NULL) + ylab(NULL) +
scale_y_date(date_labels = "%Y-%m-%d") + theme_minimal() +
theme(axis.text.x = element_text(angle = 90,hjust = 1))
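The heatmap orders countries along the axis by their case count on the last day of the window. It does so by turning country into a factor whose levels follow that ordering. A minimal sketch of the trick with hypothetical data for three placeholder countries:

```r
# Hypothetical data: three placeholder countries over two days.
d <- data.frame(
  country = rep(c("A", "B", "C"), each = 2),
  date    = rep(as.Date("2020-03-18") + 0:1, times = 3),
  cases   = c(50, 80, 900, 1500, 10, 20),
  stringsAsFactors = FALSE
)

# Rows for the most recent day only.
last_day <- d[d$date == max(d$date), ]

# Factor levels sorted by last-day cases, smallest first; ggplot2 draws
# discrete axes in level order.
d$country <- factor(d$country,
                    levels = last_day$country[order(last_day$cases)])
print(levels(d$country))
```

Without the factor conversion, ggplot2 would sort the countries alphabetically.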
The well-known wind rose chart:
require(dplyr)
y <- res$historical
d <- y["global"]
time = as.Date("2020-03-19")
dd <- filter(d, date == time) %>%
arrange(desc(cases))
dd = dd[1:40, ]
dd$country = factor(dd$country, levels=dd$country)
dd$angle = 1:40 * 360/40
require(ggplot2)
p <- ggplot(dd, aes(country, cases, fill=cases)) +
geom_col(width=1, color='grey90') +
geom_col(aes(y=I(5)), width=1, fill='grey90', alpha = .2) +
geom_col(aes(y=I(3)), width=1, fill='grey90', alpha = .2) +
geom_col(aes(y=I(2)), width=1, fill = "white") +
scale_y_log10() +
scale_fill_gradientn(colors=c("darkgreen", "green", "orange", "firebrick","red"), trans="log") +
geom_text(aes(label=paste(country, cases, sep="\n"),
y = cases *.8, angle=angle),
data=function(d) d[d$cases > 700,],
size=3, color = "white", fontface="bold", vjust=1) +
geom_text(aes(label=paste0(cases, " cases ", country),
y = max(cases) * 2, angle=angle+90),
data=function(d) d[d$cases < 700,],
size=3, vjust=0) +
coord_polar(direction=-1) +
theme_void() +
theme(legend.position="none") +
ggtitle("COVID19 global trend", time)
p
Launch the dashboard:
dashboard()
Package repository on GitHub: https://github.com/YuLab-SMU/nCov2019