查看原文
其他

GWR+R+Stata+Matlab:空间计量之地理加权归回模型(GWR)操作应用

数量经济学 数量经济学 2022-12-31

计量经济学服务中心专辑汇总计量百科·资源·干货:

Stata  |Python  |Matlab  |Eviews  |R 
Geoda  |ArcGis  |GeodaSpace  |SPSS 
一文读懂   |数据资源   |回归方法  |网络爬虫                                                               
限回归   |工具变量   |内生性   |空间计量 
果推断   |合成控制法   |倾向匹配得分   |断点回归   |双重差   
面板数据  | 动态面板数据

🌈2022年寒假--北京&远程寒假研讨班 | “高级计量经济学及Stata应用”研讨班重磅发布!

👉2021年11月空间计量研讨班:空间计量及Geoda、Stata、ArcGis、Matlab应用

一、 地理加权模型简介

空间统计为自然科学和社会科学中广泛的学科提供了重要的分析技术,在这些学科中(通常是大型的)空间数据集经常被收集。在这里,我们提出的技术从一个特殊的分支的非平稳空间统计,称为地理加权(GW)模型。GW模型适用于一些通用或全局模型不能很好地描述空间数据的情况,但适用于一些空间区域,适当的局部模型校准可以提供更好的描述。

该方法使用移动窗口加权技术,在目标位置找到局部模型。在这里,对于某个目标位置的单个模型,我们根据某个距离衰减核函数对所有邻近观测值进行加权,然后将模型局部应用于该加权数据。这个局部模型可能应用的窗口大小是由带宽控制的。较小的带宽导致结果的空间变化更加迅速,而较大的带宽使结果越来越接近通用模型的解。当存在一些目标函数(例如,模型可以预测)时,可以使用交叉验证和相关方法找到最优带宽

地理加权模型(GW model)包括的功能有:地理加权汇总统计(GW summary statistics),地理加权主成分分析(GW principal comp- onents analysis,即GW PCA),地理加权回归(GW regression),地理加权判别分析(GW discriminant analysis),其中一些功能有基本和稳健形式之分。

运用GW model的一个重要元素就是空间加权函数,空间加权函数量化(或套)观察到的变量之间的空间关系或空间相关性。空间目标及其位置临近关系的确定。

六个核函数的介绍:

Global Model(均值核函数)、Gaussian(高斯核函数)、Exponential、Box-car(盒状核函数)、Bi-square(二次核函数)、Tri-cude(立方体和函数)

二、 GWR软件操作

使用GWR软件得到结果,其具体操作如下,

1、在session file选择设置文件名,然后选择data file的filepath文件路径,选择CSV格式的数据,分隔类型选择comma,然后单击open,


2,在这样的一个方程mode中选择被解释变量以及解释变量,经度和纬度


3、kernel默认即可


4、在common output里面的session control file,选择浏览按钮,

5、单击execute this session,获得运行的结果

结果为:




第二部分:Stata:地理加权回归模型

三、Stata基本操作

空间统计目前Stata进行地理加权回归主要有命令spregxt以及gwr、gwrgrid等,本文主要简介介绍该命令的基本应用,详细内容见下一篇。

gwr语法格式为:



Geographically weighted regression
----------------------------------
   gwr depvar [varlist] [if exp] [in range] , east(varname) 
         north(varname)  [options]

选项包括


saving(filename) dots reps(#)  double eform family(familyname)
link(linkname) [ln]offset(varname) test replace noconstant
nolog scale(x2|dev|#) disp(#) iterate(#) init(varname) 
outfile(filename) comma wide bandwidth(#) mcsave(filename)
sample(#)

其中

familyname选项包括 gaussian |  igaussian     |  binomial [varname|#]  |  poisson  |  nbinomial [#] |  gamma,即核函数类型

linkname 选项包括如下内容:identify |  log     |  logit    |  probit  |  cloglog  | opower # |  power # |  nbinomial

test:要求测试带宽的重要性。这测试了gwr模型对数据的描述是否明显优于 全局的回归模型。

sample(#)指定在带宽校准过程中使用的观测值百分比,默认为100%。这是特别对于大型数据集很有用,可以减少校准带宽所需的时间。如果指定了该选项,将随机抽取#%的观测数据并用于校准过程。 

bandwidth(#)允许用户输入带宽值,并减少gwr运行所需的时间。

nolog抑制带宽优化迭代的显示。 

iterate(#)指定在估计带宽时允许的最大迭代次数。默认值为50。

save (filename)创建一个Stata数据文件,其中包含从计算gwr的每个点估算的参数。

outfile(filename)创建文本文件filename。

replace表示save()和/或outfile()指定的文件可以 被覆盖。它也适用于mcsave()选项。

reps(#)指定要执行的蒙特卡罗模拟的数量。默认值为1000。

操作案例:


gwr cars class unemp, east(easting) north(northing) test
gwr flag class unemp, east(east) north(north) fam(binomial) link(logit)
gwrgrid y x1, east(east) north(north) fam(b) link(l) square(10) samp(25)

结果为:


.  gwr cars class unemp, east(easting) north(northing) test
Global Model

      Source |       SS           df       MS      Number of obs   =       120
-------------+----------------------------------   F(2, 117)       =    287.17
       Model |  4.51965851         2  2.25982925   Prob > F        =    0.0000
    Residual |  .920700696       117  .007869237   R-squared       =    0.8308
-------------+----------------------------------   Adj R-squared   =    0.8279
       Total |   5.4403592       119  .045717304   Root MSE        =    .08871

------------------------------------------------------------------------------
        cars |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       class |   .0188073   .0033449     5.62   0.000     .0121829    .0254316
       unemp |  -.0182798   .0011238   -16.27   0.000    -.0205054   -.0160543
       _cons |   .8847704   .0288569    30.66   0.000     .8276208      .94192
------------------------------------------------------------------------------

四、Stata基本操作2

spregxt是一个完整的Stata模块工具包 ,可以估计不同类型的空间面板计量经济回归模型。

例如(SAR - SEM - SDM - SAC - SPGMM - GS2SLS - GS2SLSAR)回归。

spregxt可以估算以下模型: 1-自相关(序列相关)回归模型。2-干扰项异方差回归模型。3-扰动项下的非正态回归模型,以及自变量多重共线性回归模型。

事实上,spregxt估计了许多空间和非空间模型,甚至OLS回归:

spregxt varlist , nc(#) model(ols) run(ols)不选择:wmfile(), spregxt的回归结果与OLS相同

运行spregxt varlist , nc(#) model(ols) run(xtbe, xtfe, xtre, xtmle, xtpa, xtgls, ...)不选择:wmfile(),spregxt的结果与xtreg和更多的XT命令的面板相同。

所有诊断测试对所有模型都有效。

spregxt估计连续和截断因变量模型tobit。

model(sar, sem, sdm, sac, mstar,mstard, spgmm)处理连续或截断相关的数据 变量。如果depvar有缺失值或下限,那么在这种情况下,  model(sar, sem, sdm, sac, mstar,mstard, spgmm)将会通过xttobit模型拟合空间面板模型,因此spregxt可以解决 存在于多种数据中缺失值的问题。否则,在连续数据的情况下,将使用正态估计。

spregxt语法格式为:



* (17) Spatial Panel Geographically Weighted Regressions (GWR)
* ols      spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(ols)
* xtbe     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtbe)
* xtbem    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtbem) ridge(grr1)
* xtfe     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfe)
* xtfem    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfem) ridge(grr1)
* xtfm     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfm)
* xtpa     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtpa)
* xtwem    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtwem)
* xtmle    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmle)
* xtam     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtam)
* xtbn     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtbn)
* xthh     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xthh)
* xtrc     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtrc)
* xtre     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtre) hausman
* xtrem    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtrem) hausman ridge(grr1)
* xtsa     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtsa)
* xtmlem   spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmlem)
* xtwh     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtwh)
* xtgls    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtgls)
* xtkmhomo spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtkmhomo)
* xtkmhet1 spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtkmhet1)
* xtkmhet2 spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtkmhet2)
* xtparks  spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtparks)
* xtmg     spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmg)
* xtpcse   spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtpcse)
* xtregar  spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtregar)

* xtabond  spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtabond) inst(x1 x2)
* xtdhp    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtdhp) re
* xtdpd    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtdpd) dgmmiv(x1 x2)
* xtdpdsys spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtdpdsys)

* xtmln    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmln)
* xtmlh    spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmlh) mhet(x1 x2)
* xtfrontier spregxt y x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xtfrontier) ti
* xtfrontier spregxt y x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xtfrontier) tvd
* xtfrontier spregxt y x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xtfrontier) ti cost
* xttobit  spregxt ys x1 x2, nc(7) wmfile(SPWxt) model(gwr) run(xttobit)

选项含义为:

model(gwr)      Geographically Weighted Regressions (GWR)

run( ) is used with Spatial: model(ols, sarxt, sdmxt)

主要包括如下选项模型:


 1- run(ols)       [regress] Ordinary Least Squares (OLS)
 2- run(xtbe)      [xtreg , be] Between-Effects Panel Regression
 3- run(xtbem)     [NEW] Between-Effects Panel Regression
 4- run(xtfe)      [xtreg , fe] Fixed-Effects Panel Regression
 5- run(xtfem)     [NEW] Fixed-Effects Panel Regression
 6- run(xtpa)      [xtreg , pa] Population Averaged-Effects Panel Regression

 7- run(xtmle)     [xtreg , mle] MLE Random-Effects Panel Regression
 8- run(xtam)      [NEW] Amemiya Random-Effects Panel Regression
 9- run(xtbn)      [NEW] Balestra-Nerlove Random-Effects Panel Regression
10- run(xtfm)      [NEW] Fama-MacBeth Panel Regression
11- run(xthh)      [NEW] Hildreth-Houck Random Coefficients Panel Regression
12- run(xtrc)      [xtrc] Swamy Random Coefficients Panel Regression
13- run(xtre)      [xtreg , re] GLS Random-Effects Panel Regression
14- run(xtrem)     [NEW] Fuller-Battese GLS Random-Effects Panel Regression
15- run(xtsa)      [NEW] Swamy-Arora Random-Effects Panel Regression
16- run(xtmlem)    [NEW] Trevor Breusch MLE Random-Effects Panel Regression
17- run(xtwem)     [NEW] Within-Effects Panel Regression
18- run(xtwh)      [NEW] Wallace-Hussain Random-Effects Panel Regression

19- run(xtgls)     [xtgls] Autocorrelation & Heteroskedasticity GLS Panel Regression
20- run(xtkmhomo)  [xtgls] Kmenta Homoscedastic GLS AR(1) Panel Regression
                         * with Options: panels(iid) corr(psar1)
21- run(xtkmhet1)  [xtgls] Kmenta Heteroscedastic GLS - different AR(1) in each Panel
                         * with Options: panels(het) corr(psar1)
22- run(xtkmhet2)  [xtgls] Kmenta Heteroscedastic GLS - SAME/Common AR(1) in all Panels
                         * with Options: panels(het) corr(ar1)
23- run(xtparks)   [xtgls] Parks Full Heteroscedastic Cross-Section GLS AR(1) Panel Regression
                         * with Options: panels(corr) corr(psar1)

24- run(xtmg)      [xtmg] Heterogeneous Slopes Time Series Panel Regression 
                    Requires (xtmg) module ssc install xtmg

25- run(xtpcse)    [xtpcse] Corrected Standard Error Panel Regression
26- run(xtregar)   [xtregar] AR(1) Panel Regression

27- run(xtabond)   [xtabond] Arellano-Bond Linear Dynamic Panel Regression
28- run(xtdhp)     [NEW] Han-Philips (2010) Linear Dynamic Panel Regression
29- run(xtdpd)     [xtdpd] Arellano-Bond (1991) Linear Dynamic Panel Regression
30- run(xtdpdsys)  [xtdpdsys] Arellano-Bover/Blundell-Bond (1995, 1998)
                              System Linear Dynamic Panel Regression

31- run(xtfrontier)[xtfrontier] Stochastic Frontier Panel Regression

32- run(xttobit)   [xttobit] Tobit Random-Effects Panel Regression

33- run(xtmln)     [NEW] MLE Random-Effects Panel Regression
                         * Normal Model
34- run(xtmlh)     [NEW] MLE Random-Effects Panel Regression
                         * Multiplicative Heteroscedasticity Normal Model

结果为:

1、普通最小二乘法GWR

spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(ols)

2、运行面板固定效应GWR

. spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtfe)

==============================================================================
*** Binary (0/1) Weight Matrix: (49x49) : NC=7 NT=7 (Non Normalized)
------------------------------------------------------------------------------
==============================================================================
* Spatial Panel Geographically Weighted Regression (GWR): Model(gwr) - Run(xtfe)
==============================================================================
* Fixed-Effects Panel Data Regression
==============================================================================
 w1y_y = w1x_x1 w1x_x2
------------------------------------------------------------------------------
 Sample Size        =          49   |   Cross Sections Number   =           7
 Wald Test          =     23.6942   |   P-Value > Chi2(2)       =      0.0000
 F-Test             =     11.8471   |   P-Value > F(2 , 40)     =      0.0001
 R2  (R-Squared)    =      0.3056   |   Raw Moments R2          =      0.7326
 R2a (Adjusted R2)  =      0.1668   |   Raw Moments R2 Adj      =      0.6792
 Root MSE (Sigma)   =     61.4785   |   Log Likelihood Function =   -196.0189
------------------------------------------------------------------------------
- R2h= 0.3056   R2h Adj= 0.1668  F-Test =   10.12 P-Value > F(2 , 40)  0.0003
- R2r= 0.7326   R2r Adj= 0.6792  F-Test =   42.02 P-Value > F(3 , 40)  0.0000
------------------------------------------------------------------------------
       w1y_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      w1x_x1 |  -.2283962   .0796563    -2.87   0.007    -.3893877   -.0674047
      w1x_x2 |  -.8133583   .2947702    -2.76   0.009    -1.409111   -.2176054
       _cons |   159.5689   12.69732    12.57   0.000     133.9067    185.2311
------------------------------------------------------------------------------

*3、运行面板随机效应GWR

. spregxt y x1 x2 , nc(7) wmfile(SPWxt) model(gwr) run(xtmle)

==============================================================================
*** Binary (0/1) Weight Matrix: (49x49) : NC=7 NT=7 (Non Normalized)
------------------------------------------------------------------------------
==============================================================================
* Spatial Panel Geographically Weighted Regression (GWR): Model(gwr) - Run(xtmle)
==============================================================================
* MLE Random-Effects Panel Data Regression
==============================================================================
 w1y_y = w1x_x1 w1x_x2
------------------------------------------------------------------------------
 Sample Size        =          49   |   Cross Sections Number   =           7
 Wald Test          =     18.9340   |   P-Value > Chi2(2)       =      0.0001
 F-Test             =      9.4670   |   P-Value > F(2 , 40)     =      0.0004
 R2  (R-Squared)    =      0.3055   |   Raw Moments R2          =      0.7518
 R2a (Adjusted R2)  =      0.1666   |   Raw Moments R2 Adj      =      0.7022
 Root MSE (Sigma)   =     59.2296   |   Log Likelihood Function =   -215.7765
------------------------------------------------------------------------------
- R2h= 0.3055   R2h Adj= 0.1666  F-Test =   10.12 P-Value > F(2 , 40)  0.0003
- R2r= 0.7518   R2r Adj= 0.7022  F-Test =   46.45 P-Value > F(3 , 40)  0.0000
------------------------------------------------------------------------------
       w1y_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
w1y_y        |
      w1x_x1 |  -.2111652   .0783551    -2.69   0.010    -.3695267   -.0528036
      w1x_x2 |  -.7026954   .2917514    -2.41   0.021    -1.292347   -.1130438
       _cons |   153.0434   23.27066     6.58   0.000     106.0116    200.0751
------------------------------------------------------------------------------

.




第三部分:Matlab:地理加权回归模型


1

Matlab:地理加权回归模型命令简介


在Matlab软件中,可以调用gwr.m来实现地理加权回归模型的参数过程,下面介绍 GWR 在Matlab中的实现过程:

“gwr.m”函数命令的调用方式如下所示:

#%% 地理加权回归模型MATLAB程序代码如下>> help gwr PURPOSE: compute geographically weighted regression ---------------------------------------------------- USAGE: results = gwr(y,x,east,north,info) where: y = dependent variable vector x = explanatory variable matrix east = x-coordinates in space north = y-coordinates in space info = a structure variable with fields: info.bwidth = scalar bandwidth to use or zero for cross-validation estimation (default) info.bmin = minimum bandwidth to use in CV search info.bmax = maximum bandwidth to use in CV search defaults: bmin = 0.1, bmax = 20 info.dtype = 'gaussian' for Gaussian weighting (default) = 'exponential' for exponential weighting = 'tricube' for tri-cube weighting info.q = q-nearest neighbors to use for tri-cube weights (default: CV estimated) info.qmin = minimum # of neighbors to use in CV search info.qmax = maximum # of neighbors to use in CV search defaults: qmin = nvar+2, qmax = 4*nvar --------------------------------------------------- NOTE: res = gwr(y,x,east,north) does CV estimation of bandwidth --------------------------------------------------- RETURNS: a results structure results.meth = 'gwr' results.beta = bhat matrix (nobs x nvar) results.tstat = t-stats matrix (nobs x nvar) results.yhat = yhat results.resid = residuals results.sige = e'e/(n-dof) (nobs x 1) results.nobs = nobs results.nvar = nvars results.bwidth = bandwidth if gaussian or exponential results.q = q nearest neighbors if tri-cube results.dtype = input string for Gaussian, exponential weights results.iter = # of simplex iterations for cv results.north = north (y-coordinates) results.east = east (x-coordinates) results.y = y data vector --------------------------------------------------- See also: prt,plt, prt_gwr, plt_gwr to print and plot results --------------------------------------------------- References: Brunsdon, Fotheringham, Charlton (1996) Geographical analysis, pp. 281-298 --------------------------------------------------- NOTES: uses auxiliary function scoref for cross-validation ---------------------------------------------------


选项含义为:



2

高斯距离权重函数地理加权回归模型


选择高斯距离权重函数进行地理加权回归模型代码如下:

%--------------------------------------------------------------------------%计量经济学服务中心《空间计量经济学及Matlab应用》%--------------------------------------------------------------------------%高斯距离权重函数地理加权回归模型% load the Anselin data set% 导入数据load anselin.dat;y = anselin(:,1);
nobs = length(y);x = [ones(nobs,1) anselin(:,2:3)];
[nobs nvar] = size(x);
north = anselin(:,4);east = anselin(:,5);
vnames = strvcat('crime','constant','income','hvalue');



%--------------------------------------------------------------------------%计量经济学服务中心《空间计量经济学及Matlab应用》%--------------------------------------------------------------------------% Gaussian distance weighting%高斯距离权重函数地理加权回归模型info.dtype = 'gaussian'; tic; result1 = gwr(y,x,east,north,info); toc;prt(result1,vnames);


结果为:

14-49空间单元的参数估计结果不再展现,完整结果可以查看后面代码。


3

三种不同空间权重矩阵参数比较


命令为:

%--------------------------------------------------------------------------%计量经济学服务中心《空间计量经济学及Matlab应用》%--------------------------------------------------------------------------% plot results for comparison (see also plt)tt=1:nobs;subplot(3,1,1),plot(tt,result1.beta(:,1),tt,result2.beta(:,1),'--',tt,result3.beta(:,1),'-.');legend('Gaussian','Exponential','tricube');ylabel('Constant term');
subplot(3,1,2),plot(tt,result1.beta(:,2),tt,result2.beta(:,2),'--',tt,result3.beta(:,2),'-.');legend('Gaussian','Exponential','tricube');ylabel('Household income');
subplot(3,1,3),plot(tt,result1.beta(:,3),tt,result2.beta(:,3),'--',tt,result3.beta(:,3),'-.');legend('Gaussian','Exponential','tricube');ylabel('House value');%--------------------------------------------------------------------------%计量经济学服务中心《空间计量经济学及Matlab应用》%--------------------------------------------------------------------------



结果为:

4

指数距离权重函数地理加权回归模型


2-49空间单元参数估计结果不再展现

5

立方距离权重函数地理加权回归模型


2-49空间单元参数估计结果不再展现


高斯距离权重函数地理加权回归模型1-49地理加权回归参数估计结果为:

%--------------------------------------------------------------------------%计量经济学服务中心《空间计量经济学及Matlab应用》%--------------------------------------------------------------------------
Vname =
Variable

Geometrically weighted regression estimates Dependent Variable = crime R-squared = 0.9418 Rbar-squared = 0.9393 Bandwidth = 0.6518 # iterations = 17 Decay type = gaussian Nobs, Nvars = 49, 3 ***************************************Obs = 1, x-coordinate= 42.3800, y-coordinate= 35.6200, sige= 3.4125 Variable Coefficient t-statistic t-probability constant 51.197363 9.212794 0.000000 income -0.461038 -1.678857 0.099547 hvalue -0.434237 -3.693955 0.000556
Obs = 2, x-coordinate= 40.5200, y-coordinate= 36.5000, sige= 6.7847 Variable Coefficient t-statistic t-probability constant 63.564308 9.955778 0.000000 income -0.369902 -0.991321 0.326399 hvalue -0.683553 -4.656428 0.000025
Obs = 3, x-coordinate= 38.7100, y-coordinate= 36.7100, sige= 8.6457 Variable Coefficient t-statistic t-probability constant 72.673672 9.395151 0.000000 income -0.161106 -0.269269 0.788853 hvalue -0.826921 -5.367996 0.000002
Obs = 4, x-coordinate= 38.4100, y-coordinate= 33.3600, sige= 5.2400 Variable Coefficient t-statistic t-probability constant 81.381328 7.772343 0.000000 income 0.149437 0.194405 0.846662 hvalue -1.073198 -9.228621 0.000000
Obs = 5, x-coordinate= 44.0700, y-coordinate= 38.8000, sige= 0.6985 Variable Coefficient t-statistic t-probability constant 46.737222 13.309854 0.000000 income -0.689933 -2.949392 0.004869 hvalue -0.223718 -4.843363 0.000013
Obs = 6, x-coordinate= 41.1800, y-coordinate= 39.8200, sige= 2.7853 Variable Coefficient t-statistic t-probability constant 57.351504 10.979281 0.000000 income -0.971958 -2.506024 0.015580 hvalue -0.310679 -3.233765 0.002189
Obs = 7, x-coordinate= 38.0000, y-coordinate= 40.0100, sige= 2.2903 Variable Coefficient t-statistic t-probability constant 79.683414 14.237667 0.000000 income -1.990153 -3.856516 0.000336 hvalue -0.402011 -2.423975 0.019088
Obs = 8, x-coordinate= 39.2800, y-coordinate= 43.7500, sige= 0.6613 Variable Coefficient t-statistic t-probability constant 79.374676 10.227137 0.000000 income -3.294825 -6.536725 0.000000 hvalue 0.059876 0.936349 0.353686
Obs = 9, x-coordinate= 34.9100, y-coordinate= 39.6100, sige= 2.8503 Variable Coefficient t-statistic t-probability constant 72.218154 10.454632 0.000000 income -1.988247 -2.094491 0.041410 hvalue -0.063618 -0.350051 0.727801
Obs = 10, x-coordinate= 36.4200, y-coordinate= 47.6100, sige= 0.3660 Variable Coefficient t-statistic t-probability constant 54.058540 23.753628 0.000000 income -1.719995 -13.667469 0.000000 hvalue 0.033105 1.405730 0.166114
Obs = 11, x-coordinate= 34.4600, y-coordinate= 48.5800, sige= 0.6241 Variable Coefficient t-statistic t-probability constant 55.363293 25.304369 0.000000 income -1.767205 -17.192082 0.000000 hvalue 0.019889 0.631315 0.530769
Obs = 12, x-coordinate= 32.6500, y-coordinate= 49.6100, sige= 1.1183 Variable Coefficient t-statistic t-probability constant 54.800116 18.118969 0.000000 income -1.673002 -11.485387 0.000000 hvalue 0.000544 0.009703 0.992298
Obs = 13, x-coordinate= 29.9100, y-coordinate= 50.1100, sige= 1.8016 Variable Coefficient t-statistic t-probability constant 49.090996 11.226397 0.000000 income -1.206984 -4.006017 0.000209 hvalue -0.061675 -0.484850 0.629943
Obs = 14, x-coordinate= 27.8000, y-coordinate= 51.2400, sige= 1.1740 Variable Coefficient t-statistic t-probability constant 42.025898 8.693270 0.000000 income -1.049190 -2.353344 0.022662 hvalue 0.034076 0.168603 0.866803
Obs = 15, x-coordinate= 25.2400, y-coordinate= 50.8900, sige= 0.5074 Variable Coefficient t-statistic t-probability constant 42.023487 9.931361 0.000000 income -2.035189 -4.285382 0.000085 hvalue 0.541132 2.264220 0.028021
Obs = 16, x-coordinate= 27.9300, y-coordinate= 48.4400, sige= 1.8858 Variable Coefficient t-statistic t-probability constant 50.858338 11.840157 0.000000 income -0.970544 -2.036648 0.047108 hvalue -0.163386 -0.772214 0.443696
Obs = 17, x-coordinate= 31.9100, y-coordinate= 46.7300, sige= 1.9074 Variable Coefficient t-statistic t-probability constant 63.935965 23.149085 0.000000 income -1.851883 -8.000266 0.000000 hvalue -0.065988 -0.739344 0.463225
Obs = 18, x-coordinate= 35.9200, y-coordinate= 43.4400, sige= 1.0826 Variable Coefficient t-statistic t-probability constant 61.515865 14.934047 0.000000 income -1.892916 -5.019187 0.000007 hvalue 0.013062 0.157654 0.875377
Obs = 19, x-coordinate= 33.4600, y-coordinate= 43.3700, sige= 2.7225 Variable Coefficient t-statistic t-probability constant 65.413374 17.271159 0.000000 income -2.860764 -4.125718 0.000143 hvalue 0.275876 1.222495 0.227368
Obs = 20, x-coordinate= 33.1400, y-coordinate= 41.1300, sige= 5.0673 Variable Coefficient t-statistic t-probability constant 66.620907 8.186391 0.000000 income -1.619154 -1.246106 0.218651 hvalue -0.110761 -0.311593 0.756672
Obs = 21, x-coordinate= 31.6100, y-coordinate= 43.9500, sige= 2.6677 Variable Coefficient t-statistic t-probability constant 68.176378 23.711500 0.000000 income -3.351877 -5.596394 0.000001 hvalue 0.449873 2.009607 0.049996
Obs = 22, x-coordinate= 30.4000, y-coordinate= 44.1000, sige= 2.6080 Variable Coefficient t-statistic t-probability constant 68.744965 23.673393 0.000000 income -3.282837 -5.969185 0.000000 hvalue 0.438989 2.053138 0.045418
Obs = 23, x-coordinate= 29.1800, y-coordinate= 43.7000, sige= 2.8861 Variable Coefficient t-statistic t-probability constant 69.068145 17.943117 0.000000 income -3.326136 -5.847206 0.000000 hvalue 0.468812 1.914810 0.061364
Obs = 24, x-coordinate= 28.7800, y-coordinate= 41.0400, sige= 8.1087 Variable Coefficient t-statistic t-probability constant 77.271200 7.416531 0.000000 income -3.000189 -3.431018 0.001230 hvalue 0.167212 0.398976 0.691645
Obs = 25, x-coordinate= 27.3100, y-coordinate= 43.2300, sige= 4.0434 Variable Coefficient t-statistic t-probability constant 67.368725 9.525528 0.000000 income -3.069044 -3.468780 0.001099 hvalue 0.363366 0.754567 0.454120
Obs = 26, x-coordinate= 24.9600, y-coordinate= 42.6700, sige= 2.5678 Variable Coefficient t-statistic t-probability constant 61.306231 5.851086 0.000000 income 0.006368 0.004423 0.996489 hvalue -1.071954 -1.162870 0.250514
Obs = 27, x-coordinate= 25.9000, y-coordinate= 41.2100, sige= 6.2344 Variable Coefficient t-statistic t-probability constant 59.819535 4.913992 0.000010 income -1.697764 -1.212751 0.231040 hvalue -0.138505 -0.170688 0.865172
Obs = 28, x-coordinate= 25.8500, y-coordinate= 39.3200, sige= 5.2496 Variable Coefficient t-statistic t-probability constant 45.265068 2.954417 0.004803 income -2.135825 -1.284057 0.205161 hvalue 0.591982 0.883602 0.381226
Obs = 29, x-coordinate= 27.4900, y-coordinate= 41.0900, sige= 8.3927 Variable Coefficient t-statistic t-probability constant 72.899979 6.290465 0.000000 income -3.258441 -2.970188 0.004599 hvalue 0.307426 0.555503 0.581078
Obs = 30, x-coordinate= 28.8200, y-coordinate= 38.3200, sige= 6.0199 Variable Coefficient t-statistic t-probability constant 80.285094 7.449344 0.000000 income -0.676605 -0.717337 0.476572 hvalue -0.618717 -2.097025 0.041175
Obs = 31, x-coordinate= 30.9000, y-coordinate= 41.3100, sige= 5.9421 Variable Coefficient t-statistic t-probability constant 68.118651 7.883805 0.000000 income -1.803631 -2.133632 0.037905 hvalue 0.059481 0.178766 0.858859
Obs = 32, x-coordinate= 32.8800, y-coordinate= 39.3600, sige= 4.5678 Variable Coefficient t-statistic t-probability constant 58.637810 7.764366 0.000000 income 0.495270 0.487439 0.628121 hvalue -0.388646 -1.896549 0.063791
Obs = 33, x-coordinate= 30.6400, y-coordinate= 39.7200, sige= 5.1218 Variable Coefficient t-statistic t-probability constant 70.568456 9.798923 0.000000 income -0.218856 -0.335471 0.738702 hvalue -0.448133 -1.933095 0.059014
Obs = 34, x-coordinate= 30.3500, y-coordinate= 38.2900, sige= 3.1096 Variable Coefficient t-statistic t-probability constant 80.030552 12.784499 0.000000 income 0.036213 0.068159 0.945936 hvalue -0.786849 -5.179351 0.000004
Obs = 35, x-coordinate= 32.0900, y-coordinate= 36.6000, sige= 3.5543 Variable Coefficient t-statistic t-probability constant 63.967857 8.009308 0.000000 income 0.337987 0.382854 0.703484 hvalue -0.492099 -3.556112 0.000846
Obs = 36, x-coordinate= 34.0800, y-coordinate= 37.6000, sige= 2.7764 Variable Coefficient t-statistic t-probability constant 67.746908 11.590897 0.000000 income -0.755463 -0.934476 0.354641 hvalue -0.243619 -2.063643 0.044369
Obs = 37, x-coordinate= 36.1200, y-coordinate= 37.1300, sige= 5.2909 Variable Coefficient t-statistic t-probability constant 65.979447 8.493093 0.000000 income -0.082415 -0.089905 0.928729 hvalue -0.420816 -3.386697 0.001402
Obs = 38, x-coordinate= 36.3000, y-coordinate= 37.8500, sige= 4.1933 Variable Coefficient t-statistic t-probability constant 70.241135 9.853816 0.000000 income -0.851484 -1.007494 0.318647 hvalue -0.331039 -2.669386 0.010278
Obs = 39, x-coordinate= 36.4000, y-coordinate= 35.9500, sige= 7.5290 Variable Coefficient t-statistic t-probability constant 60.058183 6.254403 0.000000 income 1.346573 1.335110 0.188010 hvalue -0.676333 -5.334379 0.000002
Obs = 40, x-coordinate= 35.6000, y-coordinate= 35.7200, sige= 6.1315 Variable Coefficient t-statistic t-probability constant 59.441973 6.623725 0.000000 income 1.197840 1.197762 0.236772 hvalue -0.582346 -4.668150 0.000024
Obs = 41, x-coordinate= 34.6600, y-coordinate= 35.7600, sige= 4.4315 Variable Coefficient t-statistic t-probability constant 64.924831 8.369141 0.000000 income 0.077997 0.082850 0.934308 hvalue -0.395195 -3.149189 0.002788
Obs = 42, x-coordinate= 33.9200, y-coordinate= 36.1500, sige= 2.7971 Variable Coefficient t-statistic t-probability constant 68.995022 11.150905 0.000000 income -0.721014 -0.903935 0.370452 hvalue -0.276558 -2.460420 0.017450
Obs = 43, x-coordinate= 30.4200, y-coordinate= 34.0800, sige= 1.6449 Variable Coefficient t-statistic t-probability constant 42.987174 3.069301 0.003493 income -0.130118 -0.085542 0.932179 hvalue -0.015665 -0.079318 0.937103
Obs = 44, x-coordinate= 28.2600, y-coordinate= 30.3200, sige= 1.5262 Variable Coefficient t-statistic t-probability constant 38.427625 7.366893 0.000000 income -0.618892 -1.442734 0.155458 hvalue -0.192297 -0.847771 0.400688
Obs = 45, x-coordinate= 29.8500, y-coordinate= 27.9400, sige= 1.2787 Variable Coefficient t-statistic t-probability constant 31.201319 5.045446 0.000007 income -0.603071 -1.601338 0.115730 hvalue -0.061387 -0.506869 0.614520
Obs = 46, x-coordinate= 28.2100, y-coordinate= 27.2700, sige= 1.5429 Variable Coefficient t-statistic t-probability constant 27.113505 4.535402 0.000037 income -0.413367 -1.327768 0.190407 hvalue -0.043317 -0.586370 0.560318
Obs = 47, x-coordinate= 26.6900, y-coordinate= 24.2500, sige= 0.5555 Variable Coefficient t-statistic t-probability constant 24.205091 3.701915 0.000542 income -0.278691 -0.853556 0.397505 hvalue -0.032273 -0.812649 0.420350
Obs = 48, x-coordinate= 25.7100, y-coordinate= 25.4700, sige= 0.6629 Variable Coefficient t-statistic t-probability constant 24.211353 4.173486 0.000122 income -0.271872 -0.991422 0.326350 hvalue -0.034801 -0.854272 0.397112
Obs = 49, x-coordinate= 26.5800, y-coordinate= 29.0200, sige= 1.4185 Variable Coefficient t-statistic t-probability constant 30.052990 5.675101 0.000001 income -0.431664 -1.644271 0.106522 hvalue -0.081504 -0.934733 0.354510



第四部分:R语言:地理加权回归模型


R语言、地理加权回归spgwr函数

spgwr包需要安装,代码为:

install.packages("spgwr") library(spgwr)

spgwr进行地理加权回归主要有两个非常重要的步骤,分别是带宽选择和回归。

带宽选择使用的是gwr.sel这个函数,语法格式为:

gwr.sel(formula, data=list(), coords, adapt=FALSE, gweight=gwr.Gauss, method = "cv", verbose = TRUE, longlat=NULL, RMSE=FALSE, weights, tol=.Machine$double.eps^0.25, show.error.messages = FALSE)


选项含义为:

formula:regression model formula as in lm,表示R语言里面进行回归的方程式 


data:model data frame as in lm, or may be a SpatialPointsDataFrame or SpatialPolygonsDataFrame object as defined in package sp  指定数据


coords: matrix of coordinates of points representing the spatial positions of the observations指定坐标(主要是经纬度,x对应经度,y对应纬度


gweight: 指定空间权重函数(代码中使用的gauss函数,以前默认的是gwr.bisquare()函数。 


verbose:是否汇打印计算带宽的过程 


method:指定带宽的计算方法


二、GW回归

GW 回归是探索因变量和自变量之间的空间变化关系,基本的GW回归是将通常的回归方法用于空间当中,最重要的是所有回归系数的估计都要加权,加权用到前面提到的核函数。

1、传统回归分析

------------------------------------------------------------------------------------# 计量经济学服务中心1、导入安装包library(spgwr)
2、导入数据data(columbus, package="spData")
3、查看数据View(columbus)

进行回归,代码为:

4、进行传统OLS回归col.lm <- lm(CRIME ~ INC + HOVAL, data=columbus)summary(col.lm)

结果为:


2、带宽选择

------------------------------------------------------------------------------------# 计量经济学服务中心#5、带宽选择col.bw <- gwr.sel(CRIME ~ INC + HOVAL, data=columbus, coords=cbind(columbus$X, columbus$Y))

结果为:

> #5、带宽选择> col.bw <- gwr.sel(CRIME ~ INC + HOVAL, data=columbus,+ coords=cbind(columbus$X, columbus$Y))Bandwidth: 12.65221 CV score: 7432.209 Bandwidth: 20.45127 CV score: 7462.704 Bandwidth: 7.83213 CV score: 7323.545 Bandwidth: 4.853154 CV score: 7307.57 Bandwidth: 5.125504 CV score: 7322.796 Bandwidth: 3.012046 CV score: 6461.764 Bandwidth: 1.874179 CV score: 6473.378 Bandwidth: 2.475485 CV score: 6109.995 Bandwidth: 2.447721 CV score: 6098.372 Bandwidth: 2.228647 CV score: 6064.1 Bandwidth: 2.264538 CV score: 6060.774 Bandwidth: 2.280666 CV score: 6060.649 Bandwidth: 2.274969 CV score: 6060.601 Bandwidth: 2.2751 CV score: 6060.601 Bandwidth: 2.27506 CV score: 6060.601 Bandwidth: 2.275019 CV score: 6060.601 Bandwidth: 2.27506 CV score: 6060.601 

3、地理加权回归

使用gwr进行地理加权回归,代码为:

#6、地理加权回归操作(高斯函数)col.gauss <- gwr(CRIME ~ INC + HOVAL, data=columbus, coords=cbind(columbus$X, columbus$Y), bandwidth=col.bw, hatmatrix=TRUE)col.gauss

结果为:


> #6、地理加权回归操作(高斯函数)> col.gauss <- gwr(CRIME ~ INC + HOVAL, data=columbus,+ coords=cbind(columbus$X, columbus$Y),+ bandwidth=col.bw, hatmatrix=TRUE)> col.gaussCall:gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(columbus$X, columbus$Y), bandwidth = col.bw, hatmatrix = TRUE)Kernel function: gwr.Gauss Fixed bandwidth: 2.27506 Summary of GWR coefficient estimates at data points: Min. 1st Qu. Median 3rd Qu. Max. GlobalX.Intercept. 23.233234 54.124872 63.902588 68.756460 80.900619 68.6190INC -3.130714 -1.912908 -0.984380 -0.368564 1.291075 -1.5973HOVAL -1.052811 -0.376735 -0.097394 0.030049 0.794577 -0.2739Number of data points: 49 Effective number of parameters (residual: 2traceS - traceS'S): 29.6163 Effective degrees of freedom (residual: 2traceS - traceS'S): 19.3837 Sigma (residual: 2traceS - traceS'S): 8.027502 Effective number of parameters (model: traceS): 23.92796 Effective degrees of freedom (model: traceS): 25.07204 Sigma (model: traceS): 7.058361 Sigma (ML): 5.048946 AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 403.6187 AIC (GWR p. 96, eq. 4.22): 321.6635 Residual sum of squares: 1249.101 Quasi-global R2: 0.9070486




选择函数,代码和结果为:

> #7、gwr.bisquare进行地理加权回归操作> col.d <- gwr.sel(CRIME ~ INC + HOVAL, data=columbus,+ coords=cbind(columbus$X, columbus$Y), + gweight=gwr.bisquare)Bandwidth: 12.65221 CV score: 8180.619 Bandwidth: 20.45127 CV score: 7552.85 Bandwidth: 25.27136 CV score: 7508.227 Bandwidth: 23.68132 CV score: 7519.864 Bandwidth: 28.25033 CV score: 7491.85 Bandwidth: 30.09144 CV score: 7486.673 Bandwidth: 31.69353 CV score: 7483.663 Bandwidth: 31.08159 CV score: 7484.706 Bandwidth: 32.21945 CV score: 7482.846 Bandwidth: 32.54449 CV score: 7482.371 Bandwidth: 32.74538 CV score: 7482.088 Bandwidth: 32.86953 CV score: 7481.916 Bandwidth: 32.94626 CV score: 7481.812 Bandwidth: 32.99368 CV score: 7481.748 Bandwidth: 33.02299 CV score: 7481.708 Bandwidth: 33.04111 CV score: 7481.684 Bandwidth: 33.0523 CV score: 7481.669 Bandwidth: 33.05922 CV score: 7481.659 Bandwidth: 33.0635 CV score: 7481.654 Bandwidth: 33.06614 CV score: 7481.65 Bandwidth: 33.06777 CV score: 7481.648 Bandwidth: 33.06878 CV score: 7481.647 Bandwidth: 33.06941 CV score: 7481.646 Bandwidth: 33.06979 CV score: 7481.645 Bandwidth: 33.07003 CV score: 7481.645 Bandwidth: 33.07018 CV score: 7481.645 Bandwidth: 33.07027 CV score: 7481.645 Bandwidth: 33.07032 CV score: 7481.645 Bandwidth: 33.07037 CV score: 7481.645 Bandwidth: 33.07037 CV score: 7481.645 Warning message:In gwr.sel(CRIME ~ INC + HOVAL, data = columbus, coords = cbind(columbus$X, : Bandwidth converged to upper bound:33.0704149683672> > col.bisq <- gwr(CRIME ~ INC + HOVAL, data=columbus,+ coords=cbind(columbus$X, columbus$Y),+ bandwidth=col.d,gweight=gwr.bisquare, hatmatrix=TRUE)> col.bisqCall:gwr(formula = CRIME ~ INC + HOVAL, data = columbus, coords = cbind(columbus$X, columbus$Y), bandwidth = col.d, gweight = gwr.bisquare, hatmatrix = TRUE)Kernel function: gwr.bisquare Fixed bandwidth: 33.07037 Summary of GWR coefficient estimates at data points: Min. 1st Qu. Median 3rd Qu. Max. GlobalX.Intercept. 68.30731 68.96074 69.26155 69.57103 71.64487 68.6190INC -1.72665 -1.64754 -1.61378 -1.56908 -1.46759 -1.5973HOVAL -0.33314 -0.30522 -0.28063 -0.25355 -0.18834 -0.2739Number of data points: 49 Effective number of parameters (residual: 2traceS - traceS'S): 4.604881 Effective degrees of freedom (residual: 2traceS - traceS'S): 44.39512 Sigma (residual: 2traceS - traceS'S): 11.3765 Effective number of parameters (model: traceS): 3.917653 Effective degrees of freedom (model: traceS): 45.08235 Sigma (model: traceS): 11.28946 Sigma (ML): 10.82875 AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 383.6983 AIC (GWR p. 96, eq. 4.22): 376.4297 Residual sum of squares: 5745.83 Quasi-global R2: 0.5724262




◆◆◆◆



您可能也对以下帖子感兴趣

文章有问题?点此查看未经处理的缓存