[Paper Reading] Google Earth Engine: Planetary-scale geospatial analysis for everyone
This post shares a classic paper published in Remote Sensing of Environment that has been cited more than 2,000 times.
Gorelick N, Hancher M, Dixon M, et al. Google Earth Engine: Planetary-scale geospatial analysis for everyone[J]. Remote Sensing of Environment, 2017, 202: 18–27.
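Abstract
Google Earth Engine is a cloud-based platform for planetary-scale geospatial analysis. It brings Google's massive computational capabilities to bear on a range of societal issues, including deforestation, drought, disaster, disease, food security, water management, climate monitoring and environmental protection. It is unique in the field as an integrated platform: it is designed to empower not only traditional remote sensing scientists but also a much wider audience of ordinary users who lack the technical capacity needed to use traditional supercomputers or large-scale commodity cloud computing resources.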
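Introduction
High-performance computing is increasingly abundant, and petabyte-scale archives of remote sensing imagery are freely available from agencies such as NASA, ESA and NOAA. Many tools, such as GeoSpark, GeoMesa and Hadoop, have also been developed for computing on massive geospatial data. Making full use of these resources, however, requires supporting IT infrastructure: CPUs, GPUs, networking and so on. Remote sensing experts and other users who need to analyze geospatial data cannot always get access to enough supercomputing resources. Google Earth Engine (GEE) is a cloud computing platform that, unlike a supercomputing center, is easy to use and does not require strong programming skills.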
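Platform overview
The GEE platform combines petabytes of data with powerful parallel computation, and provides a web API and an interactive development environment (IDE) that let users deploy algorithms and visualize results. The data catalog offers a large collection of public geospatial datasets: satellite and aerial imagery, both optical and non-optical, along with weather forecasts, land cover and many other datasets. Users can access public data and their own data through the GEE API, and processing is parallelized automatically. The API can be called from a local client or from the web-based IDE. Users can sign in at earthengine.google.com to find the user interface, tutorials, examples, function reference and more. A background in GIS, remote sensing and JavaScript makes GEE easier to learn. Personal data can be uploaded, and processing results can be downloaded to a local machine.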
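Data catalog
The GEE public data catalog holds petabytes of widely used geospatial data. It includes the entire Landsat archive and Sentinel-1/2, along with climate forecasts, environmental change, land use/cover change (LUCC), socioeconomic and other datasets. The catalog is updated at a rate of nearly 6000 scenes per day. Users can upload their own data through the REST interface. To improve processing efficiency, imagery in GEE is preprocessed: images keep their original projection and resolution, are cut into 256×256-pixel tiles, and are stored in an efficient database with replicated backups. For fast visualization, a pyramid of reduced-resolution tiles is also built for each image.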
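The tiling and pyramid scheme described above can be made concrete with a small calculation. The following is only a sketch, not GEE's actual ingestion code: it assumes that each pyramid level halves both image dimensions until the whole image fits in one tile, and the function name `tiles_per_level` is hypothetical.

```python
import math

TILE_SIZE = 256  # tile edge length in pixels, as stated in the text


def tiles_per_level(width, height, tile_size=TILE_SIZE):
    """Return a list of (level, width, height, tile_count) tuples.

    Level 0 is the native resolution. Each further level halves both
    dimensions (rounding up) until the image fits in a single tile.
    The halving rule is an assumption made for this sketch.
    """
    levels = []
    level = 0
    while True:
        cols = math.ceil(width / tile_size)
        rows = math.ceil(height / tile_size)
        levels.append((level, width, height, cols * rows))
        if cols == 1 and rows == 1:
            break
        width = max(1, math.ceil(width / 2))
        height = max(1, math.ceil(height / 2))
        level += 1
    return levels


# A 1024 x 512 image needs three levels under these assumptions.
for level, w, h, n_tiles in tiles_per_level(1024, 512):
    print(level, w, h, n_tiles)
```

Because each level holds about a quarter of the pixels of the level below it, the whole pyramid adds only about one third to the storage of the native-resolution tiles.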
Commonly used datasets in GEE
Dataset | Nominal resolution | Temporal granularity | Temporal coverage | Spatial coverage |
---|---|---|---|---|
Landsat | ||||
Landsat 8 OLI/TIRS | 30 m | 16 day | 2013–Now | Global |
Landsat 7 ETM+ | 30 m | 16 day | 2000–Now | Global |
Landsat 5 TM | 30 m | 16 day | 1984–2012 | Global |
Landsat 4–8 surface reflectance | 30 m | 16 day | 1984–Now | Global |
Sentinel | ||||
Sentinel 1 A/B ground range detected | 10 m | 6 day | 2014–Now | Global |
Sentinel 2A MSI | 10/20 m | 10 day | 2015–Now | Global |
MODIS | ||||
MOD08 atmosphere | 1° | Daily | 2000–Now | Global |
MOD09 surface reflectance | 500 m | 1 day/8 day | 2000–Now | Global |
MOD10 snow cover | 500 m | 1 day | 2000–Now | Global |
MOD11 temperature and emissivity | 1000 m | 1 day/8 day | 2000–Now | Global |
MCD12 Land cover | 500 m | Annual | 2000–Now | Global |
MOD13 Vegetation indices | 500/250 m | 16 day | 2000–Now | Global |
MOD14 Thermal anomalies & fire | 1000 m | 8 day | 2000–Now | Global |
MCD15 Leaf area index/FPAR | 500 m | 4 day | 2000–Now | Global |
MOD17 Gross primary productivity | 500 m | 8 day | 2000–Now | Global |
MCD43 BRDF-adjusted reflectance | 1000/500 m | 8 day/16 day | 2000–Now | Global |
MOD44 veg. cover conversion | 250 m | Annual | 2000–Now | Global |
MCD45 thermal anomalies and fire | 500 m | 30 day | 2000–Now | Global |
ASTER | ||||
L1 T radiance | 15/30/90 m | 1 day | 2000–Now | Global |
Global emissivity | 100 m | Once | 2000–2010 | Global |
Other imagery | ||||
PROBA-V top of canopy reflectance | 100/300 m | 2 day | 2013–Now | Global |
EO-1 hyperion hyperspectral radiance | 30 m | Targeted | 2001–Now | Global |
DMSP-OLS nighttime lights | 1 km | Annual | 1992–2013 | Global |
USDA NAIP aerial imagery | 1 m | Sub-annual | 2003–2015 | CONUS |
Topography | ||||
Shuttle Radar Topography Mission | 30 m | Single | 2000 | 60°N–54°S |
USGS National Elevation Dataset | 10 m | Single | Multiple | United States |
USGS GMTED2010 | 7.5″ | Single | Multiple | 83°N–57°S |
GTOPO30 | 30″ | Single | Multiple | Global |
ETOPO1 | 1′ | Single | Multiple | Global |
Landcover | ||||
GlobCover | 300 m | Non-periodic | 2009 | 90°N–65°S |
USGS National Landcover Database | 30 m | Non-periodic | 1992–2011 | CONUS |
UMD global forest change | 30 m | Annual | 2000–2014 | 80°N–57°S |
JRC global surface water | 30 m | Monthly | 1984–2015 | 78°N–60°S |
GLCF tree cover | 30 m | 5 year | 2000–2010 | Global |
USDA NASS cropland data layer | 30 m | Annual | 1997–2015 | CONUS |
Weather, precipitation & atmosphere | ||||
Global precipitation measurement | 6′ | 3 h | 2014–Now | Global |
TRMM 3B42 precipitation | 15′ | 3 h | 1998–2015 | 50°N–50°S |
CHIRPS precipitation | 3′ | 5 day | 1981–Now | 50°N–50°S |
NLDAS-2 | 7.5′ | 1 h | 1979–Now | North America |
GLDAS-2 | 15′ | 3 h | 1948–2010 | Global |
NCEP reanalysis | 2.5° | 6 h | 1948–Now | Global |
ORNL DAYMET weather | 1 km | Annual | 1980–Now | North America |
GRIDMET | 4 km | 1 day | 1979–Now | CONUS |
NCEP global forecast system | 15′ | 6 h | 2015–Now | Global |
NCEP climate forecast system | 12′ | 6 h | 1979–Now | Global |
WorldClim | 30″ | 12 images | 1960–1990 | Global |
NEX downscaled climate projections | 1 km | 1 day | 1950–2099 | North America |
Population | ||||
WorldPop | 100 m | 5 year | 2010–2015 | Multiple |
GPWv4 | 30″ | 5 year | 2000–2020 | 85°N–60°S |
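System architecture
GEE is built on a collection of Google technologies: the Borg cluster management system; the Bigtable and Spanner distributed databases; Colossus, the successor to the Google File System; the FlumeJava framework for parallel computation; and Google Fusion Tables, a web-based database that supports tables of spatial data (points, lines and polygons) with attributes.
An overview of the GEE architecture: the Earth Engine Code Editor and third-party web applications send interactive and batch queries to the system through a REST API. On-the-fly requests are handled by front-end servers, which forward complex sub-queries to Compute Masters that manage the distribution of work across compute servers. FlumeJava manages the distributed computation, and the Borg cluster management system keeps the load balanced across many users. The failure of any individual worker simply causes the query to be reissued. Queries in GEE are built from functions: the GEE library contains more than 800 of them, and they can be composed into a directed acyclic graph (DAG). GEE development currently supports Python and JavaScript.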
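To illustrate how composed functions form a directed acyclic graph, here is a minimal sketch of deferred evaluation. It is not the Earth Engine client library: the `Node` class, the `OPS` table and `evaluate` are hypothetical names invented for this example, and everything runs locally, whereas in GEE the graph is sent to the server for evaluation.

```python
class Node:
    """A deferred operation: a function name plus its arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def add(self, other):
        # Returns a new graph node; nothing is computed here.
        return Node("add", self, other)

    def multiply(self, other):
        return Node("multiply", self, other)


# Hypothetical function table standing in for the server-side library.
OPS = {
    "constant": lambda value: value,
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}


def evaluate(node):
    """Recursively evaluate a graph; plain values are returned unchanged."""
    if not isinstance(node, Node):
        return node
    args = [evaluate(arg) for arg in node.args]
    return OPS[node.func](*args)


x = Node("constant", 3)
expr = x.add(4).multiply(x)  # builds the graph for (3 + 4) * 3
result = evaluate(expr)      # the computation happens only here
print(result)
```

Note that `x` is referenced twice in `expr`, so the structure is a graph rather than a tree: a shared sub-expression is one node with two parents.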
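GEE results can be displayed on a dynamic map, with the projection and extent determined by the interactive map view. Users can develop an algorithm interactively, then submit it to GEE as a batch job to obtain the complete result at scale.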
Common GEE functions
Function category | Examples | Mode of operation |
---|---|---|
Numerical operations | ||
Primitive operations | add, subtract, multiply, divide, etc. | Per pixel/per feature |
Trigonometric operations | cos, sin, tan, acos, asin, atan, etc. | |
Standard functions | abs, pow, sqrt, exp, log, erf, etc. | |
Logical operations | eq, neq, gt, gte, lt, lte, and, or | |
Bit/bitwise operations | and, or, xor, not, bit shift, etc. | |
Numeric casting | int, float, double, uint8, etc. | |
Array/matrix operations | ||
Elementwise operations | (numeric operations as above) | Per pixel/per feature |
Array manipulation | Get, length, cat, slice, sort, etc. | |
Array construction | Identity, diagonal, etc. | |
Matrix operations | Product, determinant, transpose, inverse, pseudoinverse, decomposition, etc. | |
Reduce and accumulate | Reduce, accum | |
Machine learning | ||
Supervised classification and regression | Bayes, CART, Random Forest, SVM, Perceptron, Mahalanobis, etc. | Per pixel/per feature |
Unsupervised Classification | K-Means, LVQ, Cobweb, etc. | |
Other per-pixel image operations | ||
Spectral operations | Unmixing, HSV transform, etc. | Per pixel |
Data masking | Unmask, update mask, etc. | |
Visualization | Min/max, color palette, gamma, SLD, etc. | |
Location | Pixel area, pixel coordinates, etc. | |
Kernel operations | ||
Convolution | Convolve, blur, etc. | Per image tile |
Morphology | Min, max, mean, distance, etc. | |
Texture | Entropy, GLCM, etc. | |
Simple shape kernels | Circle, rectangle, diamond, cross, etc. | |
Standard kernels | Gaussian, Laplacian, Roberts, Sobel, etc. | |
Other kernels | Euclidean, Manhattan and Chebyshev distance, arbitrary kernels and combinations | |
Other Image Operations | ||
Band manipulation | Add, select, rename, etc. | Per image |
Metadata properties | Get, set, etc. | |
Derivative | Pixel-space derivative, spatial gradient | |
Edge detection | Canny, Hough transform | |
Terrain operations | Slope, aspect, hillshade, fill minima, etc. | |
Connected components | Components, component size | |
Image clipping | Clip | |
Resampling | Bilinear, bicubic, etc. | |
Warping | Translate, changeProj | |
Image registration | Register, displacement, displace | |
Other tile-based operations | Cumulative cost, medial axis, reduce resolution with arbitrary reducers, etc. | |
Image aggregations | Sample region(s), reduce region(s) with arbitrary reducers | |
Reducers | ||
Simple | Count, distinct, first, etc. | Context-dependent |
Mathematical | sum, product, min, max, etc. | |
Logical | Logical and/or, bitwise and/or | |
Statistical | Mean, median, mode, percentile, standard deviation, covariance, histogram, etc. | |
Correlation | Kendall, Spearman, Pearson, Sen's slope | |
Regression | Linear regression, robust linear regression | |
Geometry Operations | ||
Types | Point, LineString, Polygon, etc. | Per-feature |
Measurements | Length, area, perimeter, distance, etc. | |
Constructive operations | Intersection, union, difference, etc. | |
Predicates | Intersects, contains, withinDistance, etc. | |
Other operations | Buffer, centroid, transform, simplify, etc. | |
Table/collection operations | ||
Basic manipulation | Sort, merge, size, first, limit, distinct, flatten, remap, etc. | Streaming |
Property filtering | eq, neq, gt, lt, date range, and, or, not, etc. | |
Spatial filtering | Intersects, contains, withinDistance, etc. | |
Parallel processing | Map, reduce, iterate | |
Joins | Simple, inner, grouping, etc. | |
Vector/raster operations | ||
Rasterization | Paint/draw, distance | Per tile |
Spatial interpolation | Kriging, IDW interpolation | |
Vectorization | reduceToVectors | Scatter/gather |
Other data types | ||
Number, string, list, dictionary, date, daterange, projection, etc. | Context-dependent |
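Data distribution models
Functions in the Earth Engine library use several built-in parallelization and data distribution models to achieve high performance. Each model is optimized for a different data access pattern.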
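Image tiling
Computationally expensive operations are evaluated one image tile at a time, with tiles typically 256×256 pixels. Inexpensive operations are evaluated one pixel at a time, relying on just-in-time (JIT) compilation in the Java Virtual Machine (JVM).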
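The tile-at-a-time idea can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not GEE's implementation: the raster is a list of lists, the helper names `split_into_tiles` and `map_tiles` are hypothetical, and the tiles are processed sequentially, although each one is independent and could be handed to a separate worker.

```python
def split_into_tiles(image, tile_size):
    """Yield (row0, col0, tile) blocks covering a 2-D list-of-lists raster."""
    height, width = len(image), len(image[0])
    for row0 in range(0, height, tile_size):
        for col0 in range(0, width, tile_size):
            tile = [row[col0:col0 + tile_size]
                    for row in image[row0:row0 + tile_size]]
            yield row0, col0, tile


def map_tiles(image, pixel_fn, tile_size=256):
    """Apply pixel_fn to every pixel, one tile at a time, and reassemble."""
    height, width = len(image), len(image[0])
    out = [[None] * width for _ in range(height)]
    for row0, col0, tile in split_into_tiles(image, tile_size):
        for dr, tile_row in enumerate(tile):
            for dc, value in enumerate(tile_row):
                out[row0 + dr][col0 + dc] = pixel_fn(value)
    return out


# A tiny 3 x 5 raster with 2 x 2 tiles, so edge tiles are partial.
image = [[r * 5 + c for c in range(5)] for r in range(3)]
doubled = map_tiles(image, lambda v: v * 2, tile_size=2)
```

Edge tiles come out smaller than `tile_size` when the image dimensions are not a multiple of it, which the slicing handles without special cases.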
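Spatial aggregations
Use cases: spatial statistics, raster-to-vector conversion, image classification and similar tasks. These follow a scatter-gather model: the region is divided into subregions, which are computed in parallel.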
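A scatter-gather reduction can be sketched with the standard library alone. The example below computes a regional mean; it is a simplified illustration, with a thread pool standing in for GEE's distributed workers and a flat list of pixel values standing in for a region. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(values, n_parts):
    """Split a flat list of pixel values into up to n_parts contiguous chunks."""
    size = max(1, -(-len(values) // n_parts))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_stats(chunk):
    """Per-subregion work: return (sum, count) so partial results can be merged."""
    return sum(chunk), len(chunk)


def regional_mean(values, n_parts=4):
    """Scatter the region, reduce each part in parallel, then gather."""
    if not values:
        raise ValueError("region contains no pixels")
    chunks = scatter(values, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


mean_value = regional_mean(list(range(1, 11)), n_parts=3)
```

Each worker returns a (sum, count) pair rather than a local mean because means of unequal-sized subregions cannot be averaged directly, while sums and counts merge exactly.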
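Streaming collections
Use case: time series of images. This model combines image tiling with spatial aggregation. Streaming is very efficient when the tiles are much smaller than the full image.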
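One way to picture streaming over a time series is a per-pixel reducer that consumes one tile at a time and keeps only a running total. The sketch below is an assumption-laden illustration rather than GEE's implementation: `stream_mean` and `tile_series` are hypothetical names, and a generator stands in for tiles fetched from storage.

```python
def stream_mean(tile_stack):
    """Reduce an iterable of equally sized tiles to a per-pixel mean.

    Only the running total is held in memory, so the time series can be
    arbitrarily long as long as a single tile fits.
    """
    total = None
    count = 0
    for tile in tile_stack:
        if total is None:
            total = [list(row) for row in tile]  # copy the first tile
        else:
            for r, row in enumerate(tile):
                for c, value in enumerate(row):
                    total[r][c] += value
        count += 1
    if count == 0:
        raise ValueError("empty time series")
    return [[value / count for value in row] for row in total]


def tile_series():
    """Hypothetical generator yielding the same tile location at two dates."""
    yield [[1, 2], [3, 4]]
    yield [[3, 4], [5, 6]]


mean_tile = stream_mean(tile_series())
```

Memory use here depends on the tile size and not on the number of images, which is consistent with the remark above that small tiles make streaming efficient.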
Caching and common sub-expression elimination
Intermediate results are stored in a fast distributed cache, keyed by a hash value. Queries consult the cache to reduce redundant computation.
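The hash-keyed cache can be sketched as follows. This is a toy model under stated assumptions: expressions are nested tuples, an in-process dictionary stands in for the distributed cache, and the key is a SHA-256 digest of the JSON-serialized expression. The names `expr_key`, `CACHED_OPS` and `evaluate_cached` are hypothetical and do not come from GEE.

```python
import hashlib
import json

# Hypothetical function table for this sketch.
CACHED_OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def expr_key(expr):
    """Hash a nested (op, *args) tuple into a stable hex key."""
    return hashlib.sha256(json.dumps(expr).encode("utf-8")).hexdigest()


def evaluate_cached(expr, cache, stats):
    """Evaluate expr, reusing any sub-expression already in the cache."""
    if not isinstance(expr, tuple):
        return expr  # a literal value
    key = expr_key(expr)
    if key in cache:
        stats["hits"] += 1
        return cache[key]
    op, *args = expr
    value = CACHED_OPS[op](*(evaluate_cached(a, cache, stats) for a in args))
    cache[key] = value
    stats["computed"] += 1
    return value


shared = ("add", 2, 3)
expr = ("mul", shared, shared)  # the same sub-expression appears twice
cache, stats = {}, {"hits": 0, "computed": 0}
result = evaluate_cached(expr, cache, stats)
print(result, stats)
```

The shared sub-expression is computed once and served from the cache the second time. Because the key depends only on the expression's content, the same sub-expression submitted in a later query would also hit the cache.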