GWR+R+Stata+Matlab:空间计量之地理加权归回模型(GWR)操作应用
计量经济学服务中心专辑汇总!计量百科·资源·干货:
Stata |Python |Matlab |Eviews |R Geoda |ArcGis |GeodaSpace |SPSS 一文读懂 |数据资源 |回归方法 |网络爬虫 门限回归 |工具变量 |内生性 |空间计量 因果推断 |合成控制法 |倾向匹配得分 |断点回归 |双重差分 面板数据 | 动态面板数据
一、 地理加权模型简介
空间统计为自然科学和社会科学中广泛的学科提供了重要的分析技术,在这些学科中(通常是大型的)空间数据集经常被收集。在这里,我们提出的技术从一个特殊的分支的非平稳空间统计,称为地理加权(GW)模型。GW模型适用于一些通用或全局模型不能很好地描述空间数据的情况,但适用于一些空间区域,适当的局部模型校准可以提供更好的描述。
该方法使用移动窗口加权技术,在目标位置找到局部模型。在这里,对于某个目标位置的单个模型,我们根据某个距离衰减核函数对所有邻近观测值进行加权,然后将模型局部应用于该加权数据。这个局部模型可能应用的窗口大小是由带宽控制的。较小的带宽导致结果的空间变化更加迅速,而较大的带宽使结果越来越接近通用模型的解。当存在一些目标函数(例如,模型可以预测)时,可以使用交叉验证和相关方法找到最优带宽
地理加权模型(GW model)包括的功能有:地理加权汇总统计(GW summary statistics),地理加权主成分分析(GW principal comp- onents analysis,即GW PCA),地理加权回归(GW regression),地理加权判别分析(GW discriminant analysis),其中一些功能有基本和稳健形式之分。
运用GW model的一个重要元素就是空间加权函数,空间加权函数量化(或套)观察到的变量之间的空间关系或空间相关性。空间目标及其位置临近关系的确定。
六个核函数的介绍:
Global Model(均值核函数)、Gaussian(高斯核函数)、Exponential、Box-car(盒状核函数)、Bi-square(二次核函数)、Tri-cude(立方体和函数)
二、 GWR软件操作
使用GWR软件得到结果,其具体操作如下,
1、在session file选择设置文件名,然后选择data file的filepath文件路径,选择CSV格式的数据,分隔类型选择comma,然后单击open,
2,在这样的一个方程mode中选择被解释变量以及解释变量,经度和纬度
3、kernel默认即可
4、在common output里面的session control file,选择浏览按钮,
5、单击execute this session,获得运行的结果
结果为:
第二部分:Stata:地理加权回归模型
三、Stata基本操作
空间统计目前Stata进行地理加权回归主要有命令spregxt以及gwr、gwrgrid等,本文主要简介介绍该命令的基本应用,详细内容见下一篇。
gwr语法格式为:
Geographically weighted regression
----------------------------------
gwr depvar [varlist] [if exp] [in range] , east(varname)
north(varname) [options]
选项包括
saving(filename) dots reps(#) double eform family(familyname)
link(linkname) [ln]offset(varname) test replace noconstant
nolog scale(x2|dev|#) disp(#) iterate(#) init(varname)
outfile(filename) comma wide bandwidth(#) mcsave(filename)
sample(#)
其中
familyname选项包括 gaussian | igaussian | binomial [varname|#] | poisson | nbinomial [#] | gamma,即核函数类型
linkname 选项包括如下内容:identify | log | logit | probit | cloglog | opower # | power # | nbinomial
test:要求测试带宽的重要性。这测试了gwr模型对数据的描述是否明显优于 全局的回归模型。
sample(#)指定在带宽校准过程中使用的观测值百分比,默认为100%。这是特别对于大型数据集很有用,可以减少校准带宽所需的时间。如果指定了该选项,将随机抽取#%的观测数据并用于校准过程。
bandwidth(#)允许用户输入带宽值,并减少gwr运行所需的时间。
nolog抑制带宽优化迭代的显示。
iterate(#)指定在估计带宽时允许的最大迭代次数。默认值为50。
save (filename)创建一个Stata数据文件,其中包含从计算gwr的每个点估算的参数。
outfile(filename)创建文本文件filename。
replace表示save()和/或outfile()指定的文件可以 被覆盖。它也适用于mcsave()选项。
reps(#)指定要执行的蒙特卡罗模拟的数量。默认值为1000。
操作案例:
gwr cars class unemp, east(easting) north(northing) test
gwr flag class unemp, east(east) north(north) fam(binomial) link(logit)
gwrgrid y x1, east(east) north(north) fam(b) link(l) square(10) samp(25)
结果为:
. gwr cars class unemp, east(easting) north(northing) test
Global Model
Source | SS df MS Number of obs = 120
-------------+---------------------------------- F(2, 117) = 287.17
Model | 4.51965851 2 2.25982925 Prob > F = 0.0000
Residual | .920700696 117 .007869237 R-squared = 0.8308
-------------+---------------------------------- Adj R-squared = 0.8279
Total | 5.4403592 119 .045717304 Root MSE = .08871
------------------------------------------------------------------------------
cars | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
class | .0188073 .0033449 5.62 0.000 .0121829 .0254316
unemp | -.0182798 .0011238 -16.27 0.000 -.0205054 -.0160543
_cons | .8847704 .0288569 30.66 0.000 .8276208 .94192
------------------------------------------------------------------------------
四、Stata基本操作2
spregxt是一个完整的Stata模块工具包 ,可以估计不同类型的空间面板计量经济回归模型。
例如(SAR - SEM - SDM - SAC - SPGMM - GS2SLS - GS2SLSAR)回归。
spregxt可以估算以下模型: 1-自相关(序列相关)回归模型。2-干扰项异方差回归模型。3-扰动项下的非正态回归模型,以及自变量多重共线性回归模型。
事实上,spregxt估计了许多空间和非空间模型,甚至OLS回归:
spregxt varlist , nc(#) model(ols) run(ols)不选择:wmfile(),
spregxt的回归结果与OLS相同
运行spregxt varlist , nc(#) model(ols) run(xtbe, xtfe, xtre, xtmle, xtpa, xtgls, ...)不选择:wmfile(),spregxt的结果与xtreg和更多的XT命令的面板相同。
所有诊断测试对所有模型都有效。
spregxt估计连续和截断因变量模型tobit。
model(sar, sem, sdm, sac, mstar,mstard, spgmm)处理连续或截断相关的数据 变量。如果depvar有缺失值或下限,那么在这种情况下, model(sar, sem, sdm, sac, mstar,mstard, spgmm)将会通过xttobit模型拟合空间面板模型,因此spregxt可以解决 存在于多种数据中缺失值的问题。否则,在连续数据的情况下,将使用正态估计。
spregxt语法格式为:
* (17) Spatial Panel Geographically Weighted Regressions (GWR)
* ols spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(ols)
* xtbe spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtbe)
* xtbem spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtbem) ridge(grr1)
* xtfe spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfe)
* xtfem spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfem) ridge(grr1)
* xtfm spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfm)
* xtpa spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtpa)
* xtwem spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtwem)
* xtmle spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmle)
* xtam spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtam)
* xtbn spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtbn)
* xthh spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xthh)
* xtrc spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtrc)
* xtre spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtre) hausman
* xtrem spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtrem) hausman ridge(grr1)
* xtsa spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtsa)
* xtmlem spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmlem)
* xtwh spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtwh)
* xtgls spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtgls)
* xtkmhomo spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtkmhomo)
* xtkmhet1 spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtkmhet1)
* xtkmhet2 spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtkmhet2)
* xtparks spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtparks)
* xtmg spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmg)
* xtpcse spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtpcse)
* xtregar spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtregar)
* xtabond spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtabond) inst(x1 x2)
* xtdhp spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtdhp) re
* xtdpd spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtdpd) dgmmiv(x1 x2)
* xtdpdsys spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtdpdsys)
* xtmln spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmln)
* xtmlh spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmlh) mhet(x1 x2)
* xtfrontier spregxt y x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xtfrontier) ti
* xtfrontier spregxt y x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xtfrontier) tvd
* xtfrontier spregxt y x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xtfrontier) ti cost
* xttobit spregxt ys x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xttobit)
选项含义为:
model(gwr) Geographically Weighted Regressions (GWR)
run( ) is used with Spatial: model(ols, sarxt, sdmxt)
主要包括如下选项模型:
1- run(ols) [regress] Ordinary Least Squares (OLS)
2- run(xtbe) [xtreg , be] Between-Effects Panel Regression
3- run(xtbem) [NEW] Between-Effects Panel Regression
4- run(xtfe) [xtreg , fe] Fixed-Effects Panel Regression
5- run(xtfem) [NEW] Fixed-Effects Panel Regression
6- run(xtpa) [xtreg , pa] Population Averaged-Effects Panel Regression
7- run(xtmle) [xtreg , mle] MLE Random-Effects Panel Regression
8- run(xtam) [NEW] Amemiya Random-Effects Panel Regression
9- run(xtbn) [NEW] Balestra-Nerlove Random-Effects Panel Regression
10- run(xtfm) [NEW] Fama-MacBeth Panel Regression
11- run(xthh) [NEW] Hildreth-Houck Random Coefficients Panel Regression
12- run(xtrc) [xtrc] Swamy Random Coefficients Panel Regression
13- run(xtre) [xtreg , re] GLS Random-Effects Panel Regression
14- run(xtrem) [NEW] Fuller-Battese GLS Random-Effects Panel Regression
15- run(xtsa) [NEW] Swamy-Arora Random-Effects Panel Regression
16- run(xtmlem) [NEW] Trevor Breusch MLE Random-Effects Panel Regression
17- run(xtwem) [NEW] Within-Effects Panel Regression
18- run(xtwh) [NEW] Wallace-Hussain Random-Effects Panel Regression
19- run(xtgls) [xtgls] Autocorrelation & Heteroskedasticity GLS Panel Regression
20- run(xtkmhomo) [xtgls] Kmenta Homoscedastic GLS AR(1) Panel Regression
* with Options: panels(iid) corr(psar1)
21- run(xtkmhet1) [xtgls] Kmenta Heteroscedastic GLS - different AR(1) in each Panel
* with Options: panels(het) corr(psar1)
22- run(xtkmhet2) [xtgls] Kmenta Heteroscedastic GLS - SAME/Common AR(1) in all Panels
* with Options: panels(het) corr(ar1)
23- run(xtparks) [xtgls] Parks Full Heteroscedastic Cross-Section GLS AR(1) Panel Regression
* with Options: panels(corr) corr(psar1)
24- run(xtmg) [xtmg] Heterogeneous Slopes Time Series Panel Regression
Requires (xtmg) module ssc install xtmg
25- run(xtpcse) [xtpcse] Corrected Standard Error Panel Regression
26- run(xtregar) [xtregar] AR(1) Panel Regression
27- run(xtabond) [xtabond] Arellano-Bond Linear Dynamic Panel Regression
28- run(xtdhp) [NEW] Han-Philips (2010) Linear Dynamic Panel Regression
29- run(xtdpd) [xtdpd] Arellano-Bond (1991) Linear Dynamic Panel Regression
30- run(xtdpdsys) [xtdpdsys] Arellano-Bover/Blundell-Bond (1995, 1998)
System Linear Dynamic Panel Regression
31- run(xtfrontier)[xtfrontier] Stochastic Frontier Panel Regression
32- run(xttobit) [xttobit] Tobit Random-Effects Panel Regression
33- run(xtmln) [NEW] MLE Random-Effects Panel Regression
* Normal Model
34- run(xtmlh) [NEW] MLE Random-Effects Panel Regression
* Multiplicative Heteroscedasticity Normal Model
结果为:
1、普通最小二乘法GWR
spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(ols)
2、运行面板固定效应GWR
. spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfe)
==============================================================================
*** Binary (0/1) Weight Matrix: (49x49) : NC=7 NT=7 (Non Normalized)
------------------------------------------------------------------------------
==============================================================================
* Spatial Panel Geographically Weighted Regression (GWR): Model(gwr) - Run(xtfe)
==============================================================================
* Fixed-Effects Panel Data Regression
==============================================================================
w1y_y = w1x_x1 w1x_x2
------------------------------------------------------------------------------
Sample Size = 49 | Cross Sections Number = 7
Wald Test = 23.6942 | P-Value > Chi2(2) = 0.0000
F-Test = 11.8471 | P-Value > F(2 , 40) = 0.0001
R2 (R-Squared) = 0.3056 | Raw Moments R2 = 0.7326
R2a (Adjusted R2) = 0.1668 | Raw Moments R2 Adj = 0.6792
Root MSE (Sigma) = 61.4785 | Log Likelihood Function = -196.0189
------------------------------------------------------------------------------
- R2h= 0.3056 R2h Adj= 0.1668 F-Test = 10.12 P-Value > F(2 , 40) 0.0003
- R2r= 0.7326 R2r Adj= 0.6792 F-Test = 42.02 P-Value > F(3 , 40) 0.0000
------------------------------------------------------------------------------
w1y_y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
w1x_x1 | -.2283962 .0796563 -2.87 0.007 -.3893877 -.0674047
w1x_x2 | -.8133583 .2947702 -2.76 0.009 -1.409111 -.2176054
_cons | 159.5689 12.69732 12.57 0.000 133.9067 185.2311
------------------------------------------------------------------------------
*3、运行面板随机效应GWR
. spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmle)
==============================================================================
*** Binary (0/1) Weight Matrix: (49x49) : NC=7 NT=7 (Non Normalized)
------------------------------------------------------------------------------
==============================================================================
* Spatial Panel Geographically Weighted Regression (GWR): Model(gwr) - Run(xtmle)
==============================================================================
* MLE Random-Effects Panel Data Regression
==============================================================================
w1y_y = w1x_x1 w1x_x2
------------------------------------------------------------------------------
Sample Size = 49 | Cross Sections Number = 7
Wald Test = 18.9340 | P-Value > Chi2(2) = 0.0001
F-Test = 9.4670 | P-Value > F(2 , 40) = 0.0004
R2 (R-Squared) = 0.3055 | Raw Moments R2 = 0.7518
R2a (Adjusted R2) = 0.1666 | Raw Moments R2 Adj = 0.7022
Root MSE (Sigma) = 59.2296 | Log Likelihood Function = -215.7765
------------------------------------------------------------------------------
- R2h= 0.3055 R2h Adj= 0.1666 F-Test = 10.12 P-Value > F(2 , 40) 0.0003
- R2r= 0.7518 R2r Adj= 0.7022 F-Test = 46.45 P-Value > F(3 , 40) 0.0000
------------------------------------------------------------------------------
w1y_y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
w1y_y |
w1x_x1 | -.2111652 .0783551 -2.69 0.010 -.3695267 -.0528036
w1x_x2 | -.7026954 .2917514 -2.41 0.021 -1.292347 -.1130438
_cons | 153.0434 23.27066 6.58 0.000 106.0116 200.0751
------------------------------------------------------------------------------
.
第三部分:Matlab:地理加权回归模型
1
Matlab:地理加权回归模型命令简介
在Matlab软件中,可以调用gwr.m来实现地理加权回归模型的参数过程,下面介绍 GWR 在Matlab中的实现过程:
“gwr.m”函数命令的调用方式如下所示:
#%% 地理加权回归模型MATLAB程序代码如下
>> help gwr
PURPOSE: compute geographically weighted regression
----------------------------------------------------
USAGE: results = gwr(y,x,east,north,info)
where: y = dependent variable vector
x = explanatory variable matrix
east = x-coordinates in space
north = y-coordinates in space
info = a structure variable with fields:
info.bwidth = scalar bandwidth to use or zero
for cross-validation estimation (default)
info.bmin = minimum bandwidth to use in CV search
info.bmax = maximum bandwidth to use in CV search
defaults: bmin = 0.1, bmax = 20
info.dtype = 'gaussian' for Gaussian weighting (default)
= 'exponential' for exponential weighting
= 'tricube' for tri-cube weighting
info.q = q-nearest neighbors to use for tri-cube weights
(default: CV estimated)
info.qmin = minimum # of neighbors to use in CV search
info.qmax = maximum # of neighbors to use in CV search
defaults: qmin = nvar+2, qmax = 4*nvar
---------------------------------------------------
NOTE: res = gwr(y,x,east,north) does CV estimation of bandwidth
---------------------------------------------------
RETURNS: a results structure
results.meth = 'gwr'
results.beta = bhat matrix (nobs x nvar)
results.tstat = t-stats matrix (nobs x nvar)
results.yhat = yhat
results.resid = residuals
results.sige = e'e/(n-dof) (nobs x 1)
results.nobs = nobs
results.nvar = nvars
results.bwidth = bandwidth if gaussian or exponential
results.q = q nearest neighbors if tri-cube
results.dtype = input string for Gaussian, exponential weights
results.iter = # of simplex iterations for cv
results.north = north (y-coordinates)
results.east = east (x-coordinates)
results.y = y data vector
---------------------------------------------------
See also: prt,plt, prt_gwr, plt_gwr to print and plot results
---------------------------------------------------
References: Brunsdon, Fotheringham, Charlton (1996)
Geographical analysis, pp. 281-298
---------------------------------------------------
NOTES: uses auxiliary function scoref for cross-validation
---------------------------------------------------
选项含义为:
2
高斯距离权重函数地理加权回归模型
选择高斯距离权重函数进行地理加权回归模型代码如下:
%--------------------------------------------------------------------------
%计量经济学服务中心《空间计量经济学及Matlab应用》
%--------------------------------------------------------------------------
%高斯距离权重函数地理加权回归模型
% load the Anselin data set
% 导入数据
load anselin.dat;
y = anselin(:,1);
nobs = length(y);
x = [ones(nobs,1) anselin(:,2:3)];
[nobs nvar] = size(x);
north = anselin(:,4);
east = anselin(:,5);
vnames = strvcat('crime','constant','income','hvalue');
%--------------------------------------------------------------------------
%计量经济学服务中心《空间计量经济学及Matlab应用》
%--------------------------------------------------------------------------
% Gaussian distance weighting
%高斯距离权重函数地理加权回归模型
info.dtype = 'gaussian';
tic; result1 = gwr(y,x,east,north,info); toc;
prt(result1,vnames);
结果为:
14-49空间单元的参数估计结果不再展现,完整结果可以查看后面代码。
3
三种不同空间权重矩阵参数比较
命令为:
%--------------------------------------------------------------------------
%计量经济学服务中心《空间计量经济学及Matlab应用》
%--------------------------------------------------------------------------
% plot results for comparison (see also plt)
tt=1:nobs;
subplot(3,1,1),
plot(tt,result1.beta(:,1),tt,result2.beta(:,1),'--',tt,result3.beta(:,1),'-.');
legend('Gaussian','Exponential','tricube');
ylabel('Constant term');
subplot(3,1,2),
plot(tt,result1.beta(:,2),tt,result2.beta(:,2),'--',tt,result3.beta(:,2),'-.');
legend('Gaussian','Exponential','tricube');
ylabel('Household income');
subplot(3,1,3),
plot(tt,result1.beta(:,3),tt,result2.beta(:,3),'--',tt,result3.beta(:,3),'-.');
legend('Gaussian','Exponential','tricube');
ylabel('House value');
%--------------------------------------------------------------------------
%计量经济学服务中心《空间计量经济学及Matlab应用》
%--------------------------------------------------------------------------
结果为:
4
指数距离权重函数地理加权回归模型
2-49空间单元参数估计结果不再展现
5
立方距离权重函数地理加权回归模型
2-49空间单元参数估计结果不再展现
高斯距离权重函数地理加权回归模型1-49地理加权回归参数估计结果为:
%--------------------------------------------------------------------------
%计量经济学服务中心《空间计量经济学及Matlab应用》
%--------------------------------------------------------------------------
Vname =
Variable
Geometrically weighted regression estimates
Dependent Variable = crime
R-squared = 0.9418
Rbar-squared = 0.9393
Bandwidth = 0.6518
# iterations = 17
Decay type = gaussian
Nobs, Nvars = 49, 3
***************************************
Obs = 1, x-coordinate= 42.3800, y-coordinate= 35.6200, sige= 3.4125
Variable Coefficient t-statistic t-probability
constant 51.197363 9.212794 0.000000
income -0.461038 -1.678857 0.099547
hvalue -0.434237 -3.693955 0.000556
Obs = 2, x-coordinate= 40.5200, y-coordinate= 36.5000, sige= 6.7847
Variable Coefficient t-statistic t-probability
constant 63.564308 9.955778 0.000000
income -0.369902 -0.991321 0.326399
hvalue -0.683553 -4.656428 0.000025
Obs = 3, x-coordinate= 38.7100, y-coordinate= 36.7100, sige= 8.6457
Variable Coefficient t-statistic t-probability
constant 72.673672 9.395151 0.000000
income -0.161106 -0.269269 0.788853
hvalue -0.826921 -5.367996 0.000002
Obs = 4, x-coordinate= 38.4100, y-coordinate= 33.3600, sige= 5.2400
Variable Coefficient t-statistic t-probability
constant 81.381328 7.772343 0.000000
income 0.149437 0.194405 0.846662
hvalue -1.073198 -9.228621 0.000000
Obs = 5, x-coordinate= 44.0700, y-coordinate= 38.8000, sige= 0.6985
Variable Coefficient t-statistic t-probability
constant 46.737222 13.309854 0.000000
income -0.689933 -2.949392 0.004869
hvalue -0.223718 -4.843363 0.000013
Obs = 6, x-coordinate= 41.1800, y-coordinate= 39.8200, sige= 2.7853
Variable Coefficient t-statistic t-probability
constant 57.351504 10.979281 0.000000
income -0.971958 -2.506024 0.015580
hvalue -0.310679 -3.233765 0.002189
Obs = 7, x-coordinate= 38.0000, y-coordinate= 40.0100, sige= 2.2903
Variable Coefficient t-statistic t-probability
constant 79.683414 14.237667 0.000000
income -1.990153 -3.856516 0.000336
hvalue -0.402011 -2.423975 0.019088
Obs = 8, x-coordinate= 39.2800, y-coordinate= 43.7500, sige= 0.6613
Variable Coefficient t-statistic t-probability
constant 79.374676 10.227137 0.000000
income -3.294825 -6.536725 0.000000
hvalue 0.059876 0.936349 0.353686
Obs = 9, x-coordinate= 34.9100, y-coordinate= 39.6100, sige= 2.8503
Variable Coefficient t-statistic t-probability
constant 72.218154 10.454632 0.000000
income -1.988247 -2.094491 0.041410
hvalue -0.063618 -0.350051 0.727801
Obs = 10, x-coordinate= 36.4200, y-coordinate= 47.6100, sige= 0.3660
Variable Coefficient t-statistic t-probability
constant 54.058540 23.753628 0.000000
income -1.719995 -13.667469 0.000000
hvalue 0.033105 1.405730 0.166114
Obs = 11, x-coordinate= 34.4600, y-coordinate= 48.5800, sige= 0.6241
Variable Coefficient t-statistic t-probability
constant 55.363293 25.304369 0.000000
income -1.767205 -17.192082 0.000000
hvalue 0.019889 0.631315 0.530769
Obs = 12, x-coordinate= 32.6500, y-coordinate= 49.6100, sige= 1.1183
Variable Coefficient t-statistic t-probability
constant 54.800116 18.118969 0.000000
income -1.673002 -11.485387 0.000000
hvalue 0.000544 0.009703 0.992298
Obs = 13, x-coordinate= 29.9100, y-coordinate= 50.1100, sige= 1.8016
Variable Coefficient t-statistic t-probability
constant 49.090996 11.226397 0.000000
income -1.206984 -4.006017 0.000209
hvalue -0.061675 -0.484850 0.629943
Obs = 14, x-coordinate= 27.8000, y-coordinate= 51.2400, sige= 1.1740
Variable Coefficient t-statistic t-probability
constant 42.025898 8.693270 0.000000
income -1.049190 -2.353344 0.022662
hvalue 0.034076 0.168603 0.866803
Obs = 15, x-coordinate= 25.2400, y-coordinate= 50.8900, sige= 0.5074
Variable Coefficient t-statistic t-probability
constant 42.023487 9.931361 0.000000
income -2.035189 -4.285382 0.000085
hvalue 0.541132 2.264220 0.028021
Obs = 16, x-coordinate= 27.9300, y-coordinate= 48.4400, sige= 1.8858
Variable Coefficient t-statistic t-probability
constant 50.858338 11.840157 0.000000
income -0.970544 -2.036648 0.047108
hvalue -0.163386 -0.772214 0.443696
Obs = 17, x-coordinate= 31.9100, y-coordinate= 46.7300, sige= 1.9074
Variable Coefficient t-statistic t-probability
constant 63.935965 23.149085 0.000000
income -1.851883 -8.000266 0.000000
hvalue -0.065988 -0.739344 0.463225
Obs = 18, x-coordinate= 35.9200, y-coordinate= 43.4400, sige= 1.0826
Variable Coefficient t-statistic t-probability
constant 61.515865 14.934047 0.000000
income -1.892916 -5.019187 0.000007
hvalue 0.013062 0.157654 0.875377
Obs = 19, x-coordinate= 33.4600, y-coordinate= 43.3700, sige= 2.7225
Variable Coefficient t-statistic t-probability
constant 65.413374 17.271159 0.000000
income -2.860764 -4.125718 0.000143
hvalue 0.275876 1.222495 0.227368
Obs = 20, x-coordinate= 33.1400, y-coordinate= 41.1300, sige= 5.0673
Variable Coefficient t-statistic t-probability
constant 66.620907 8.186391 0.000000
income -1.619154 -1.246106 0.218651
hvalue -0.110761 -0.311593 0.756672
Obs = 21, x-coordinate= 31.6100, y-coordinate= 43.9500, sige= 2.6677
Variable Coefficient t-statistic t-probability
constant 68.176378 23.711500 0.000000
income -3.351877 -5.596394 0.000001
hvalue 0.449873 2.009607 0.049996
Obs = 22, x-coordinate= 30.4000, y-coordinate= 44.1000, sige= 2.6080
Variable Coefficient t-statistic t-probability
constant 68.744965 23.673393 0.000000
income -3.282837 -5.969185 0.000000
hvalue 0.438989 2.053138 0.045418
Obs = 23, x-coordinate= 29.1800, y-coordinate= 43.7000, sige= 2.8861
Variable Coefficient t-statistic t-probability
constant 69.068145 17.943117 0.000000
income -3.326136 -5.847206 0.000000
hvalue 0.468812 1.914810 0.061364
Obs = 24, x-coordinate= 28.7800, y-coordinate= 41.0400, sige= 8.1087
Variable Coefficient t-statistic t-probability
constant 77.271200 7.416531 0.000000
income -3.000189 -3.431018 0.001230
hvalue 0.167212 0.398976 0.691645
Obs = 25, x-coordinate= 27.3100, y-coordinate= 43.2300, sige= 4.0434
Variable Coefficient t-statistic t-probability
constant 67.368725 9.525528 0.000000
income -3.069044 -3.468780 0.001099
hvalue 0.363366 0.754567 0.454120
Obs = 26, x-coordinate= 24.9600, y-coordinate= 42.6700, sige= 2.5678
Variable Coefficient t-statistic t-probability
constant 61.306231 5.851086 0.000000
income 0.006368 0.004423 0.996489
hvalue -1.071954 -1.162870 0.250514
Obs = 27, x-coordinate= 25.9000, y-coordinate= 41.2100, sige= 6.2344
Variable Coefficient t-statistic t-probability
constant 59.819535 4.913992 0.000010
income -1.697764 -1.212751 0.231040
hvalue -0.138505 -0.170688 0.865172
Obs = 28, x-coordinate= 25.8500, y-coordinate= 39.3200, sige= 5.2496
Variable Coefficient t-statistic t-probability
constant 45.265068 2.954417 0.004803
income -2.135825 -1.284057 0.205161
hvalue 0.591982 0.883602 0.381226
Obs = 29, x-coordinate= 27.4900, y-coordinate= 41.0900, sige= 8.3927
Variable Coefficient t-statistic t-probability
constant 72.899979 6.290465 0.000000
income -3.258441 -2.970188 0.004599
hvalue 0.307426 0.555503 0.581078
Obs = 30, x-coordinate= 28.8200, y-coordinate= 38.3200, sige= 6.0199
Variable Coefficient t-statistic t-probability
constant 80.285094 7.449344 0.000000
income -0.676605 -0.717337 0.476572
hvalue -0.618717 -2.097025 0.041175
Obs = 31, x-coordinate= 30.9000, y-coordinate= 41.3100, sige= 5.9421
Variable Coefficient t-statistic t-probability
constant 68.118651 7.883805 0.000000
income -1.803631 -2.133632 0.037905
hvalue 0.059481 0.178766 0.858859
Obs = 32, x-coordinate= 32.8800, y-coordinate= 39.3600, sige= 4.5678
Variable Coefficient t-statistic t-probability
constant 58.637810 7.764366 0.000000
income 0.495270 0.487439 0.628121
hvalue -0.388646 -1.896549 0.063791
Obs = 33, x-coordinate= 30.6400, y-coordinate= 39.7200, sige= 5.1218
Variable Coefficient t-statistic t-probability
constant 70.568456 9.798923 0.000000
income -0.218856 -0.335471 0.738702
hvalue -0.448133 -1.933095 0.059014
Obs = 34, x-coordinate= 30.3500, y-coordinate= 38.2900, sige= 3.1096
Variable Coefficient t-statistic t-probability
constant 80.030552 12.784499 0.000000
income 0.036213 0.068159 0.945936
hvalue -0.786849 -5.179351 0.000004
Obs = 35, x-coordinate= 32.0900, y-coordinate= 36.6000, sige= 3.5543
Variable Coefficient t-statistic t-probability
constant 63.967857 8.009308 0.000000
income 0.337987 0.382854 0.703484
hvalue -0.492099 -3.556112 0.000846
Obs = 36, x-coordinate= 34.0800, y-coordinate= 37.6000, sige= 2.7764
Variable Coefficient t-statistic t-probability
constant 67.746908 11.590897 0.000000
income -0.755463 -0.934476 0.354641
hvalue -0.243619 -2.063643 0.044369
Obs = 37, x-coordinate= 36.1200, y-coordinate= 37.1300, sige= 5.2909
Variable Coefficient t-statistic t-probability
constant 65.979447 8.493093 0.000000
income -0.082415 -0.089905 0.928729
hvalue -0.420816 -3.386697 0.001402
Obs = 38, x-coordinate= 36.3000, y-coordinate= 37.8500, sige= 4.1933
Variable Coefficient t-statistic t-probability
constant 70.241135 9.853816 0.000000
income -0.851484 -1.007494 0.318647
hvalue -0.331039 -2.669386 0.010278
Obs = 39, x-coordinate= 36.4000, y-coordinate= 35.9500, sige= 7.5290
Variable Coefficient t-statistic t-probability
constant 60.058183 6.254403 0.000000
income 1.346573 1.335110 0.188010
hvalue -0.676333 -5.334379 0.000002
Obs = 40, x-coordinate= 35.6000, y-coordinate= 35.7200, sige= 6.1315
Variable Coefficient t-statistic t-probability
constant 59.441973 6.623725 0.000000
income 1.197840 1.197762 0.236772
hvalue -0.582346 -4.668150 0.000024
Obs = 41, x-coordinate= 34.6600, y-coordinate= 35.7600, sige= 4.4315
Variable Coefficient t-statistic t-probability
constant 64.924831 8.369141 0.000000
income 0.077997 0.082850 0.934308
hvalue -0.395195 -3.149189 0.002788
Obs = 42, x-coordinate= 33.9200, y-coordinate= 36.1500, sige= 2.7971
Variable Coefficient t-statistic t-probability
constant 68.995022 11.150905 0.000000
income -0.721014 -0.903935 0.370452
hvalue -0.276558 -2.460420 0.017450
Obs = 43, x-coordinate= 30.4200, y-coordinate= 34.0800, sige= 1.6449
Variable Coefficient t-statistic t-probability
constant 42.987174 3.069301 0.003493
income -0.130118 -0.085542 0.932179
hvalue -0.015665 -0.079318 0.937103
Obs = 44, x-coordinate= 28.2600, y-coordinate= 30.3200, sige= 1.5262
Variable Coefficient t-statistic t-probability
constant 38.427625 7.366893 0.000000
income -0.618892 -1.442734 0.155458
hvalue -0.192297 -0.847771 0.400688
Obs = 45, x-coordinate= 29.8500, y-coordinate= 27.9400, sige= 1.2787
Variable Coefficient t-statistic t-probability
constant 31.201319 5.045446 0.000007
income -0.603071 -1.601338 0.115730
hvalue -0.061387 -0.506869 0.614520
Obs = 46, x-coordinate= 28.2100, y-coordinate= 27.2700, sige= 1.5429
Variable Coefficient t-statistic t-probability
constant 27.113505 4.535402 0.000037
income -0.413367 -1.327768 0.190407
hvalue -0.043317 -0.586370 0.560318
Obs = 47, x-coordinate= 26.6900, y-coordinate= 24.2500, sige= 0.5555
Variable Coefficient t-statistic t-probability
constant 24.205091 3.701915 0.000542
income -0.278691 -0.853556 0.397505
hvalue -0.032273 -0.812649 0.420350
Obs = 48, x-coordinate= 25.7100, y-coordinate= 25.4700, sige= 0.6629
Variable Coefficient t-statistic t-probability
constant 24.211353 4.173486 0.000122
income -0.271872 -0.991422 0.326350
hvalue -0.034801 -0.854272 0.397112
Obs = 49, x-coordinate= 26.5800, y-coordinate= 29.0200, sige= 1.4185
Variable Coefficient t-statistic t-probability
constant 30.052990 5.675101 0.000001
income -0.431664 -1.644271 0.106522
hvalue -0.081504 -0.934733 0.354510
第四部分:R语言:地理加权回归模型
R语言、地理加权回归spgwr函数
spgwr包需要安装,代码为:
install.packages("spgwr")
library(spgwr)
spgwr进行地理加权回归主要有两个非常重要的步骤,分别是带宽选择和回归。
带宽选择使用的是gwr.sel这个函数,语法格式为:
gwr.sel(formula, data=list(), coords, adapt=FALSE, gweight=gwr.Gauss,
method = "cv", verbose = TRUE, longlat=NULL, RMSE=FALSE, weights,
tol=.Machine$double.eps^0.25, show.error.messages = FALSE)
选项含义为:
formula:regression model formula as in lm,表示R语言里面进行回归的方程式
data:model data frame as in lm, or may be a SpatialPointsDataFrame or SpatialPolygonsDataFrame object as defined in package sp 指定数据
coords: matrix of coordinates of points representing the spatial positions of the observations指定坐标(主要是经纬度,x对应经度,y对应纬度
gweight: 指定空间权重函数(代码中使用的gauss函数,以前默认的是gwr.bisquare()函数。
verbose:是否汇打印计算带宽的过程
method:指定带宽的计算方法
二、GW回归
GW 回归是探索因变量和自变量之间的空间变化关系,基本的GW回归是将通常的回归方法用于空间当中,最重要的是所有回归系数的估计都要加权,加权用到前面提到的核函数。
1、传统回归分析
------------------------------------------------------------------------------------
# 计量经济学服务中心
1、导入安装包
library(spgwr)
2、导入数据
data(columbus, package="spData")
3、查看数据
View(columbus)
进行回归,代码为:
4、进行传统OLS回归
col.lm <- lm(CRIME ~ INC + HOVAL, data=columbus)
summary(col.lm)
结果为:
2、带宽选择
------------------------------------------------------------------------------------
# 计量经济学服务中心
#5、带宽选择
col.bw <- gwr.sel(CRIME ~ INC + HOVAL, data=columbus,
coords=cbind(columbus$X, columbus$Y))
结果为:
> #5、带宽选择
> col.bw <- gwr.sel(CRIME ~ INC + HOVAL, data=columbus,
+ coords=cbind(columbus$X, columbus$Y))
Bandwidth: 12.65221 CV score: 7432.209
Bandwidth: 20.45127 CV score: 7462.704
Bandwidth: 7.83213 CV score: 7323.545
Bandwidth: 4.853154 CV score: 7307.57
Bandwidth: 5.125504 CV score: 7322.796
Bandwidth: 3.012046 CV score: 6461.764
Bandwidth: 1.874179 CV score: 6473.378
Bandwidth: 2.475485 CV score: 6109.995
Bandwidth: 2.447721 CV score: 6098.372
Bandwidth: 2.228647 CV score: 6064.1
Bandwidth: 2.264538 CV score: 6060.774
Bandwidth: 2.280666 CV score: 6060.649
Bandwidth: 2.274969 CV score: 6060.601
Bandwidth: 2.2751 CV score: 6060.601
Bandwidth: 2.27506 CV score: 6060.601
Bandwidth: 2.275019 CV score: 6060.601
Bandwidth: 2.27506 CV score: 6060.601
3、地理加权回归
使用gwr进行地理加权回归,代码为:
#6、地理加权回归操作(高斯函数)
col.gauss <- gwr(CRIME ~ INC + HOVAL, data=columbus,
coords=cbind(columbus$X, columbus$Y),
bandwidth=col.bw, hatmatrix=TRUE)
col.gauss
结果为:
> #6、地理加权回归操作(高斯函数)
> col.gauss <- gwr(CRIME ~ INC + HOVAL, data=columbus,
+ coords=cbind(columbus$X, columbus$Y),
+ bandwidth=col.bw, hatmatrix=TRUE)
> col.gauss
Call:
gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(columbus$X,
columbus$Y), bandwidth = col.bw, hatmatrix = TRUE)
Kernel function: gwr.Gauss
Fixed bandwidth: 2.27506
Summary of GWR coefficient estimates at data points:
Min. 1st Qu. Median 3rd Qu. Max. Global
X.Intercept. 23.233234 54.124872 63.902588 68.756460 80.900619 68.6190
INC -3.130714 -1.912908 -0.984380 -0.368564 1.291075 -1.5973
HOVAL -1.052811 -0.376735 -0.097394 0.030049 0.794577 -0.2739
Number of data points: 49
Effective number of parameters (residual: 2traceS - traceS'S): 29.6163
Effective degrees of freedom (residual: 2traceS - traceS'S): 19.3837
Sigma (residual: 2traceS - traceS'S): 8.027502
Effective number of parameters (model: traceS): 23.92796
Effective degrees of freedom (model: traceS): 25.07204
Sigma (model: traceS): 7.058361
Sigma (ML): 5.048946
AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 403.6187
AIC (GWR p. 96, eq. 4.22): 321.6635
Residual sum of squares: 1249.101
Quasi-global R2: 0.9070486
选择函数,代码和结果为:
> #7、gwr.bisquare进行地理加权回归操作
> col.d <- gwr.sel(CRIME ~ INC + HOVAL, data=columbus,
+ coords=cbind(columbus$X, columbus$Y),
+ gweight=gwr.bisquare)
Bandwidth: 12.65221 CV score: 8180.619
Bandwidth: 20.45127 CV score: 7552.85
Bandwidth: 25.27136 CV score: 7508.227
Bandwidth: 23.68132 CV score: 7519.864
Bandwidth: 28.25033 CV score: 7491.85
Bandwidth: 30.09144 CV score: 7486.673
Bandwidth: 31.69353 CV score: 7483.663
Bandwidth: 31.08159 CV score: 7484.706
Bandwidth: 32.21945 CV score: 7482.846
Bandwidth: 32.54449 CV score: 7482.371
Bandwidth: 32.74538 CV score: 7482.088
Bandwidth: 32.86953 CV score: 7481.916
Bandwidth: 32.94626 CV score: 7481.812
Bandwidth: 32.99368 CV score: 7481.748
Bandwidth: 33.02299 CV score: 7481.708
Bandwidth: 33.04111 CV score: 7481.684
Bandwidth: 33.0523 CV score: 7481.669
Bandwidth: 33.05922 CV score: 7481.659
Bandwidth: 33.0635 CV score: 7481.654
Bandwidth: 33.06614 CV score: 7481.65
Bandwidth: 33.06777 CV score: 7481.648
Bandwidth: 33.06878 CV score: 7481.647
Bandwidth: 33.06941 CV score: 7481.646
Bandwidth: 33.06979 CV score: 7481.645
Bandwidth: 33.07003 CV score: 7481.645
Bandwidth: 33.07018 CV score: 7481.645
Bandwidth: 33.07027 CV score: 7481.645
Bandwidth: 33.07032 CV score: 7481.645
Bandwidth: 33.07037 CV score: 7481.645
Bandwidth: 33.07037 CV score: 7481.645
Warning message:
In gwr.sel(CRIME ~ INC + HOVAL, data = columbus, coords = cbind(columbus$X, :
Bandwidth converged to upper bound:33.0704149683672
>
> col.bisq <- gwr(CRIME ~ INC + HOVAL, data=columbus,
+ coords=cbind(columbus$X, columbus$Y),
+ bandwidth=col.d,gweight=gwr.bisquare, hatmatrix=TRUE)
> col.bisq
Call:
gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(columbus$X,
columbus$Y), bandwidth = col.d, gweight = gwr.bisquare, hatmatrix = TRUE)
Kernel function: gwr.bisquare
Fixed bandwidth: 33.07037
Summary of GWR coefficient estimates at data points:
Min. 1st Qu. Median 3rd Qu. Max. Global
X.Intercept. 68.30731 68.96074 69.26155 69.57103 71.64487 68.6190
INC -1.72665 -1.64754 -1.61378 -1.56908 -1.46759 -1.5973
HOVAL -0.33314 -0.30522 -0.28063 -0.25355 -0.18834 -0.2739
Number of data points: 49
Effective number of parameters (residual: 2traceS - traceS'S): 4.604881
Effective degrees of freedom (residual: 2traceS - traceS'S): 44.39512
Sigma (residual: 2traceS - traceS'S): 11.3765
Effective number of parameters (model: traceS): 3.917653
Effective degrees of freedom (model: traceS): 45.08235
Sigma (model: traceS): 11.28946
Sigma (ML): 10.82875
AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 383.6983
AIC (GWR p. 96, eq. 4.22): 376.4297
Residual sum of squares: 5745.83
Quasi-global R2: 0.5724262
◆◆◆◆