Copy and paste without losing structure! Copying formatted data from web pages into RStudio
Today's package, datapasta, is a handy one: copy data from a web page or a CSV file, paste it into RStudio, and it arrives as a tibble definition!
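You need to have RStudio installed before using this package!
Installation
Installation is slightly different from usual: do not use the CRAN version, because it is no longer maintained and has many bugs. Install from the author's r-universe repository instead.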
install.packages("datapasta", repos = c(mm = "https://milesmcbain.r-universe.dev", getOption("repos")))
install.packages(c("readr", "clipr", "tibble")) # install the dependencies
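Usage
Once installed, datapasta's paste commands show up under the Addins menu in RStudio.
A small example shows the basic usage: this page [1] contains a table of weather forecast data.
Select the table and copy it, go back to RStudio, and choose Paste as tribble from Addins. The data is inserted in the following form: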
tibble::tribble(
~X, ~Location, ~Min, ~Max,
"Partly cloudy.", "Brisbane", 19L, 29L,
"Partly cloudy.", "Brisbane Airport", 18L, 27L,
"Possible shower.", "Beaudesert", 15L, 30L,
"Partly cloudy.", "Chermside", 17L, 29L,
"Shower or two. Possible storm.", "Gatton", 15L, 32L,
"Possible shower.", "Ipswich", 15L, 30L,
"Partly cloudy.", "Logan Central", 18L, 29L,
"Mostly sunny.", "Manly", 20L, 26L,
"Partly cloudy.", "Mount Gravatt", 17L, 28L,
"Possible shower.", "Oxley", 17L, 30L,
"Partly cloudy.", "Redcliffe", 19L, 27L
)
## # A tibble: 11 x 4
## X Location Min Max
## <chr> <chr> <int> <int>
## 1 Partly cloudy. Brisbane 19 29
## 2 Partly cloudy. Brisbane Airport 18 27
## 3 Possible shower. Beaudesert 15 30
## 4 Partly cloudy. Chermside 17 29
## 5 Shower or two. Possible storm. Gatton 15 32
## 6 Possible shower. Ipswich 15 30
## 7 Partly cloudy. Logan Central 18 29
## 8 Mostly sunny. Manly 20 26
## 9 Partly cloudy. Mount Gravatt 17 28
## 10 Possible shower. Oxley 17 30
## 11 Partly cloudy. Redcliffe 19 27
Pretty magical! It almost feels like a tiny web scraper!
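To take some of the mystery out of it: a table copied from a browser usually lands on the clipboard as tab-separated text, and the paste step comes down to parsing that text and guessing the column types. The sketch below imitates this with base R only. It is not datapasta's actual implementation; `clip_text` and `weather` are made-up names, and the three rows are taken from the example above.

```r
# Tab-separated text, as a browser would typically place a copied table on
# the clipboard (header plus the first three rows of the example table).
clip_text <- paste(
  "X\tLocation\tMin\tMax",
  "Partly cloudy.\tBrisbane\t19\t29",
  "Partly cloudy.\tBrisbane Airport\t18\t27",
  "Possible shower.\tBeaudesert\t15\t30",
  sep = "\n"
)

# Parse the text and let R guess the column types
# (character for the text columns, integer for Min and Max).
weather <- read.delim(text = clip_text, stringsAsFactors = FALSE)
str(weather)

# Base R's own way of turning an object back into code that recreates it.
dput(weather)
```

The `dput()` output is valid R but much harder to read than a tribble, which is a good part of the appeal of the addin.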
Of course, you can also paste it as a data.frame (Paste as data.frame):
data.frame(
stringsAsFactors = FALSE,
X = c("Partly cloudy.",
"Partly cloudy.","Possible shower.","Partly cloudy.",
"Shower or two. Possible storm.","Possible shower.","Partly cloudy.",
"Mostly sunny.","Partly cloudy.","Possible shower.",
"Partly cloudy."),
Location = c("Brisbane","Brisbane Airport",
"Beaudesert","Chermside","Gatton","Ipswich",
"Logan Central","Manly","Mount Gravatt","Oxley","Redcliffe"),
Min = c(19L, 18L, 15L, 17L, 15L, 15L, 18L, 20L, 17L, 17L, 19L),
Max = c(29L, 27L, 30L, 29L, 32L, 30L, 29L, 26L, 28L, 30L, 27L)
)
## X Location Min Max
## 1 Partly cloudy. Brisbane 19 29
## 2 Partly cloudy. Brisbane Airport 18 27
## 3 Possible shower. Beaudesert 15 30
## 4 Partly cloudy. Chermside 17 29
## 5 Shower or two. Possible storm. Gatton 15 32
## 6 Possible shower. Ipswich 15 30
## 7 Partly cloudy. Logan Central 18 29
## 8 Mostly sunny. Manly 20 26
## 9 Partly cloudy. Mount Gravatt 17 28
## 10 Possible shower. Oxley 17 30
## 11 Partly cloudy. Redcliffe 19 27
data.table is supported too: just choose Paste as data.table!
Beyond that, the copied data can also be pasted directly as a vector. Give it a try yourself. Magical!
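If you prefer code to menu clicks, datapasta also has functions that build the same kind of code from an object you already have in your session. A minimal sketch, assuming the installed version exports `tribble_construct()`, `df_construct()` and `vector_construct()` and that each accepts a data frame (or a vector) as its first argument; check `help(package = "datapasta")` before relying on it. `weather` and `generated` are illustrative names, not part of the package.

```r
# A small data frame standing in for data you already have in the session.
weather <- data.frame(
  stringsAsFactors = FALSE,
  Location = c("Brisbane", "Gatton", "Manly"),
  Min = c(19L, 15L, 20L),
  Max = c(29L, 32L, 26L)
)

generated <- list()

# Only call datapasta if it is installed; the function names and argument
# order below are assumptions about the installed version.
if (requireNamespace("datapasta", quietly = TRUE)) {
  generated$tribble <- datapasta::tribble_construct(weather)
  generated$df      <- datapasta::df_construct(weather)
  generated$vector  <- datapasta::vector_construct(weather$Location)

  # Each element should be a string of R code; print them to inspect.
  for (code in generated) cat(code, "\n")
}
```

Because these functions return the code as a string rather than inserting it at the cursor, they should also work outside RStudio, for example in a script that generates test fixtures.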
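Caveats
A few small problems are still unresolved:
CSV, Excel, and HTML sources are currently supported, but irregularly structured data is handled poorly, for example tables with merged cells or with column names spread over several rows. Quoted CSV with commas inside the quotes cannot be parsed. List columns do not work either.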
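For the quoted-CSV case, one possible fallback is to skip the clipboard and read the text with base R, which does respect commas inside double quotes. This is only a sketch with made-up data (`csv_text` and `quoted` are illustrative names), not something from the datapasta documentation.

```r
# CSV text where the first field contains a comma inside double quotes.
csv_text <- 'Forecast,Location,Min,Max
"Shower or two, possible storm.",Gatton,15,32
"Partly cloudy.",Brisbane,19,29'

# read.csv treats double quotes as quoting characters by default, so the
# embedded comma stays inside the first field instead of splitting it.
quoted <- read.csv(text = csv_text, stringsAsFactors = FALSE)

quoted$Forecast[1]
str(quoted)
```

Once the data is in a data frame, it can be turned into code with `dput()` or written back out in whatever form you need.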
References
[1] Example data page: https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html
Follow the WeChat public account: 医学和生信笔记