R 语言:如何根据经纬度判断所处的南北方、东西部
为了让大家更好的理解本文的内容,欢迎各位培训班会员参加明晚 8 点的直播课 「R 语言:如何根据经纬度判断所处的南北方、东西部」
之前给大家分享过两个课程:
如何判断工企业在南方还是北方 & 计算每个工企业距离秦岭淮河线的距离?:https://rstata.duanshu.com/#/brief/course/8a4e1e28c1e54c30a7d6996823648242 已知某个地点的经纬度,如何判断其是在北方还是南方?:https://rstata.duanshu.com/#/brief/course/cf9511522ec244a09340dec8e89012b3
不过课程里面的南北方区域是手动绘制的,不是很方便,今天再给大家介绍一种新的方案。
首先加载所需的 R 包:
library(tidyverse)
library(sf)
library(lwgeom)
读取 2014 年工企地理位置数据:
haven::read_dta("2014年工企数据库地理位置数据.dta") %>%
mutate(id = row_number()) %>%
select(id, ends_with("度")) -> df
df
#> # A tibble: 305,159 × 3
#> id 经度 纬度
#> <int> <dbl> <dbl>
#> 1 1 121. 31.3
#> 2 2 114. 22.6
#> 3 3 121. 37.7
#> 4 4 111. 30.7
#> 5 5 119. 25.4
#> 6 6 117. 39.1
#> 7 7 113. 28.2
#> 8 8 121. 29.0
#> 9 9 111. 34.7
#> 10 10 122. 31.2
#> # ℹ 305,149 more rows
读取省级行政区划:
read_sf("2021行政区划/省.shp") -> prov
# 合并成全国的
prov %>%
st_union() %>%
nngeo::st_remove_holes() -> cn
中国的经纬度范围:
st_bbox(cn) %>%
st_as_sfc() -> cnbbox
读取秦岭-淮河线与胡焕庸线
# 读取秦岭-淮河线
read_sf("秦岭淮河线/秦岭淮河线.shp") -> qh
# 读取胡焕庸线
read_sf("胡焕庸线/胡焕庸线.shp") -> hhy
我们再使用 leaflet 一并预览下上面的对象们:
# 预览
library(leaflet)
library(mapview)
leaflet() %>%
addTiles("http://map.geoq.cn/ArcGIS/rest/services/ChinaOnlineCommunity/MapServer/tile/{z}/{y}/{x}", attribution = "微信公众号 RStata") -> map
mapview(cnbbox, layer.name = "中国范围",
col.regions = "red", alpha.regions = 0.4) +
mapview(cn, layer.name = "中国") +
mapview(qh, color = "red", layer.name = "秦岭-淮河线") +
mapview(hhy, map = map, color = "blue", layer.name = "胡焕庸线")
如何使用胡焕庸线分东西
lwgeom::st_split() 函数可以线条把一个多边形分成两个区域,但是胡焕庸线的长度不够,所以我们得把它往两端延伸,这里我们可以选择 cnbbox 左下角和右上角的两个点。
# 胡焕庸线的两个端点
st_startpoint(hhy) -> hhytr
st_endpoint(hhy) -> hhybl
cnbbox 左下角和右上角的两个点:
cnbbox %>%
st_cast("POINT") %>%
.[1] -> blpoint
cnbbox %>%
st_cast("POINT") %>%
.[3] -> trpoint
把这四个点合成一条新的 linestring:
c(blpoint, hhybl, hhytr, trpoint) %>%
st_coordinates() %>%
st_linestring() %>%
st_sfc(crs = 4326) -> longhhyline
最终的效果:
longhhyline %>%
mapview() +
mapview(cnbbox, map = map)
这里还可以编写一个函数用于 point 合成 linestring:
st_point_to_linestring <- function(..., crs) {
c(...) %>%
st_coordinates() %>%
st_linestring() %>%
st_sfc(crs = crs)
}
st_point_to_linestring(blpoint, hhybl, hhytr, trpoint,
crs = 4326)
#> Geometry set for 1 feature
#> Geometry type: LINESTRING
#> Dimension: XY
#> Bounding box: xmin: 73.50235 ymin: 3.83703 xmax: 135.0957 ymax: 53.56362
#> Geodetic CRS: WGS 84
然后就可以进行分隔了:
cnbbox %>%
lwgeom::st_split(longhhyline) %>%
st_collection_extract("POLYGON") %>%
st_as_sf() %>%
mutate(xpos = st_coordinates(st_centroid(.))[,"X"]) %>%
mutate(position = ifelse(xpos == max(xpos), "east", "west")) -> west_eastdf
west_eastdf
#> Simple feature collection with 2 features and 2 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 73.50235 ymin: 3.83703 xmax: 135.0957 ymax: 53.56362
#> Geodetic CRS: WGS 84
#> x xpos position
#> 1 POLYGON ((135.0957 53.56362... 113.07748 east
#> 2 POLYGON ((73.50235 3.83703,... 91.03894 west
上面代码中的 xpos 用于判断两个部分的东西情况。
mapview(west_eastdf, map = map, zcol = "position")
然后就可以对工企进行东西分类了。
首先把工企地理位置数据转换成 sf 对象:
# 首先把工企地理位置数据转换成 sf 对象
df %>%
filter(!is.na(经度)) %>%
st_as_sf(coords = c("经度", "纬度"),
crs = 4326, remove = F) -> dfsf
dfsf %>%
st_intersection(west_eastdf) -> dfsf1
dfsf1
#> Simple feature collection with 303661 features and 5 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 74.8288 ymin: 18.24581 xmax: 134.3002 ymax: 52.97204
#> Geodetic CRS: WGS 84
#> # A tibble: 303,661 × 6
#> id 经度 纬度 xpos position geometry
#> * <int> <dbl> <dbl> <dbl> <chr> <POINT [°]>
#> 1 1 121. 31.3 113. east (120.7504 31.34605)
#> 2 2 114. 22.6 113. east (113.8791 22.5583)
#> 3 3 121. 37.7 113. east (121.0589 37.68217)
#> 4 4 111. 30.7 113. east (111.3246 30.70783)
#> 5 5 119. 25.4 113. east (119.0104 25.43473)
#> 6 6 117. 39.1 113. east (117.3079 39.08553)
#> 7 7 113. 28.2 113. east (112.8611 28.2179)
#> 8 8 121. 29.0 113. east (121.3542 29.04982)
#> 9 9 111. 34.7 113. east (111.2657 34.71126)
#> 10 10 122. 31.2 113. east (121.5283 31.22698)
#> # ℹ 303,651 more rows
再计算工企距离胡焕庸线的距离:
dfsf1 %>%
st_distance(hhy) -> distmat
dfsf1 %>%
st_drop_geometry() %>%
mutate(dist = distmat[,1],
dist = units::set_units(dist, "km"),
dist = as.numeric(dist)) %>%
select(-xpos) -> dfsf1
如何使用秦岭-淮河线分南北
可以在 cnbbox 的左右框线取中点对秦岭淮河线进行延伸:
cnbbox %>%
st_cast("POINT") %>%
.[c(4, 5)] %>%
st_point_to_linestring(crs = 4326) %>%
st_centroid() -> leftpoint
cnbbox %>%
st_cast("POINT") %>%
.[c(2, 3)] %>%
st_point_to_linestring(crs = 4326) %>%
st_centroid() -> rightpoint
c(rightpoint, st_cast(qh$geometry, "POINT"), leftpoint) %>%
st_point_to_linestring(crs = 4326) -> longqhline
然后就可以进行分隔了:
cnbbox %>%
lwgeom::st_split(longqhline) %>%
st_collection_extract("POLYGON") %>%
st_as_sf() %>%
mutate(ypos = st_coordinates(st_centroid(.))[,"Y"]) %>%
mutate(position = ifelse(ypos == max(ypos), "north", "south")) -> north_southdf
绘图展示下:
mapview(north_southdf, map = map, zcol = "position")
然后就可以对工企进行南北分类了:
dfsf %>%
st_intersection(north_southdf) -> dfsf2
dfsf2
#> Simple feature collection with 303661 features and 5 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 74.8288 ymin: 18.24581 xmax: 134.3002 ymax: 52.97204
#> Geodetic CRS: WGS 84
#> # A tibble: 303,661 × 6
#> id 经度 纬度 ypos position geometry
#> * <int> <dbl> <dbl> <dbl> <chr> <POINT [°]>
#> 1 1 121. 31.3 19.0 south (120.7504 31.34605)
#> 2 2 114. 22.6 19.0 south (113.8791 22.5583)
#> 3 4 111. 30.7 19.0 south (111.3246 30.70783)
#> 4 5 119. 25.4 19.0 south (119.0104 25.43473)
#> 5 7 113. 28.2 19.0 south (112.8611 28.2179)
#> 6 8 121. 29.0 19.0 south (121.3542 29.04982)
#> 7 10 122. 31.2 19.0 south (121.5283 31.22698)
#> 8 11 121. 30.9 19.0 south (121.0068 30.89801)
#> 9 12 121. 31.4 19.0 south (121.1934 31.42712)
#> 10 13 121. 31.2 19.0 south (121.1196 31.15257)
#> # ℹ 303,651 more rows
再计算工企距离秦岭-淮河线的距离:
dfsf %>%
st_distance(qh) -> distmat2
dfsf2 %>%
st_drop_geometry() %>%
mutate(dist = distmat2[,1],
dist = units::set_units(dist, "km"),
dist = as.numeric(dist)) %>%
select(-ypos) -> dfsf2
dfsf2
#> # A tibble: 303,661 × 5
#> id 经度 纬度 position dist
#> <int> <dbl> <dbl> <chr> <dbl>
#> 1 1 121. 31.3 south 170.
#> 2 2 114. 22.6 south 1148.
#> 3 4 111. 30.7 south 521.
#> 4 5 119. 25.4 south 221.
#> 5 7 113. 28.2 south 780.
#> 6 8 121. 29.0 south 663.
#> 7 10 122. 31.2 south 512.
#> 8 11 121. 30.9 south 430.
#> 9 12 121. 31.4 south 222.
#> 10 13 121. 31.2 south 207.
#> # ℹ 303,651 more rows
直播信息
为了让大家更好的理解上面的内容,欢迎各位培训班会员参加明晚 8 点的直播课 「R 语言:如何根据经纬度判断所处的南北方、东西部」
直播地址:腾讯会议(需要报名 RStata 培训班参加) 讲义材料:需要报名 RStata 培训班,详情可阅读:一起来学习 R 语言和 Stata 啦!学习过程中遇到的问题也可以随时提问!
更多关于 RStata 会员的更多信息可添加微信号 r_stata 咨询:
附件下载(点击文末的阅读原文即可跳转):
https://rstata.duanshu.com/#/brief/course/6a5bd22b9c5246eebb7498695d6c1f92