
Difference-in-Differences Propensity Score Matching (PSM-DID)

The following article is from 计量经济学 (Econometrics), by 小计量

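Introduction to PSM-DID

The PSM-DID model was proposed by Heckman et al. (1997, 1998). Suppose we have a two-period panel, with the pre-treatment period denoted t' and the post-treatment period denoted t. In period t' nobody has been treated yet, so the potential outcome for both the control group and the treatment group is y_{0t'}. In period t there are two potential outcomes: y_{0t} if untreated (the control group) and y_{1t} if treated (the treatment group).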

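PSM-DID rests on a mean-independence assumption: conditional on the covariates x, the expected change in the untreated outcome, y_{0t} - y_{0t'}, is the same for the treatment group and the control group.

If this assumption holds, the ATT can be estimated consistently.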
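Written out, the assumption and the estimator referred to above are usually stated as follows. This is the standard textbook form of the matching DID estimator following Heckman et al., given here as a sketch; the symbols D (treatment dummy), x (covariates), I_1 and I_0 (the sets of treated and control units), S_p (the common-support sample), N_1 (the number of treated units in I_1 ∩ S_p) and w(i,j) (the matching weight that treated unit i places on control unit j) are notation assumed for this display.

```latex
% Mean-independence assumption on the untreated change
E\left(y_{0t} - y_{0t'} \mid x, D = 1\right)
  = E\left(y_{0t} - y_{0t'} \mid x, D = 0\right)

% Matching difference-in-differences estimator of the ATT
\widehat{\mathrm{ATT}} = \frac{1}{N_1} \sum_{i \in I_1 \cap S_p}
  \left[ \left(y_{1ti} - y_{0t'i}\right)
  - \sum_{j \in I_0 \cap S_p} w(i,j)\left(y_{0tj} - y_{0t'j}\right) \right]
```

The first term inside the brackets is the before-after change of a treated unit; the second is the weighted before-after change of its matched controls.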

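Steps

The estimation steps for PSM-DID are roughly as follows:

(1) Estimate the propensity score from the treatment variable D and the covariates X.

(2) For each individual i in the treatment group, determine all the control-group individuals matched to it (that is, determine the set S_p).

(3) For each individual i in the treatment group, compute the before-after change in its outcome variable, y_{1ti} - y_{0t'i}.

(4) For each individual i in the treatment group, compute the before-after change, y_{0tj} - y_{0t'j}, of every control individual j matched to it.

(5) Combine the changes from (3) and (4), weighting the controls by propensity-score kernel matching or local linear regression matching, to obtain the ATT.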
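The steps above can be sketched in a few lines of Python. This is only an illustration of the kernel-matching logic, not the code behind Stata's diff command: the function name psm_did_att, the min-max definition of the common support, the Epanechnikov kernel, the default bandwidth of 0.06, and the rule of dropping treated units that have no control inside the bandwidth are all assumptions made for the example, and the propensity scores from step (1) are taken as given.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def psm_did_att(treated, pscore, y_pre, y_post, bandwidth=0.06):
    """Kernel-matching PSM-DID estimate of the ATT (illustrative sketch).

    treated : 0/1 treatment flags, one per unit
    pscore  : propensity scores from step (1), assumed already estimated
    y_pre   : outcomes in the pre-treatment period t'
    y_post  : outcomes in the post-treatment period t
    """
    units = list(zip(treated, pscore, y_pre, y_post))
    # Store (score, before-after change) for each group.
    treat = [(p, post - pre) for d, p, pre, post in units if d == 1]
    ctrl = [(p, post - pre) for d, p, pre, post in units if d == 0]

    # Common support (assumed here to be the overlap of the score ranges).
    lo = max(min(p for p, _ in treat), min(p for p, _ in ctrl))
    hi = min(max(p for p, _ in treat), max(p for p, _ in ctrl))
    treat = [(p, dy) for p, dy in treat if lo <= p <= hi]
    ctrl = [(p, dy) for p, dy in ctrl if lo <= p <= hi]

    gaps = []
    for p_i, dy_i in treat:
        # Step (2): kernel weight of every control relative to unit i.
        k = [epanechnikov((p_j - p_i) / bandwidth) for p_j, _ in ctrl]
        total = sum(k)
        if total == 0.0:
            continue  # assumption: skip treated units with no nearby control
        # Step (4): weighted before-after change of the matched controls.
        dy_match = sum(w * dy_j for w, (_, dy_j) in zip(k, ctrl)) / total
        # Step (3) minus step (4) for this treated unit.
        gaps.append(dy_i - dy_match)

    if not gaps:
        raise ValueError("no treated unit has a matched control")
    # Step (5): average the gaps over the matched treated units.
    return sum(gaps) / len(gaps)


# Toy data: treated units change by 5 and 6, their nearest controls by 2 and 3.
att = psm_did_att(
    treated=[1, 1, 0, 0],
    pscore=[0.5, 0.6, 0.5, 0.6],
    y_pre=[10, 12, 9, 11],
    y_post=[15, 18, 11, 14],
)
print(att)  # 3.0
```

With the narrow default bandwidth each treated unit is matched only to the control with the same score, so the estimate is the average of 5 - 2 and 6 - 3. A wider bandwidth spreads the weight over more controls, which lowers variance at the cost of less exact matches.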

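Advantage:

PSM-DID controls for between-group differences that are unobservable but do not change over time, for example when the treatment group and the control group come from two different regions, or when the two groups were surveyed with two different questionnaires.

Usage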

***PSM_DID

ssc install diff

help diff

Syntax of the diff command for PSM-DID


diff outcome_var, treat(varname) period(varname) id(varname) ///
    kernel ktype(kernel) cov(varlist) report logit support test

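Explanation of the syntax

"outcome_var" is the outcome variable.

"treat(varname)" is required and specifies the treatment variable.

"period(varname)" is also required and specifies the treatment-period dummy (1 = treatment period, 0 = baseline period).

"id(varname)" specifies the individual identifier, which is a prerequisite for matching.

"kernel" requests kernel matching (the diff command does not offer any other matching method).

"ktype(kernel)" specifies the type of kernel function.

"cov(varlist)" specifies the covariates used to estimate the propensity score.

"report" displays the estimation results of the propensity score model.

"logit" estimates the propensity score with a logit model; the default is probit.

"support" restricts matching to observations on the common support.

"test" checks whether, after propensity score matching, the variables are balanced between the treatment group and the control group.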


Demonstration



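* Load the Card and Krueger (1994) example data
use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear

browse      // inspect the raw data
describe    // variable names, storage types and labels
summarize   // summary statistics

* Kernel PSM-DID with a logit propensity score on the common support
diff fte, t(treated) p(t) kernel id(id) logit cov(bk kfc roys) ///
    report support

* Same specification, plus a balancing test of the matched variables
diff fte, t(treated) p(t) kernel id(id) logit cov(bk kfc roys) ///
    report support test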

The results are:

. use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta
(Dataset from Card&Krueger (1994))

. bro




. des

Contains data from http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta
 Observations:           820                  Dataset from Card&Krueger (1994)
    Variables:             8                  27 May 2011 20:36
-------------------------------------------------------------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------------------------------------------------------------
id              int     %8.0g                 Store ID
t               byte    %8.0g                 Feb. 1992 = 0; Nov. 1992 = 1
treated         long    %8.0g      treated    New Jersey = 1; Pennsylvania = 0
fte             float   %9.0g                 Output: Full Time Employment
bk              byte    %8.0g                 Burger King == 1
kfc             byte    %8.0g                 Kentuky Fried Chiken == 1
roys            byte    %8.0g                 Roy Rogers == 1
wendys          byte    %8.0g                 Wendy's == 1
-------------------------------------------------------------------------------------------------------------------------------------
Sorted by: id  t




. sum

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
          id |        820    246.5073    148.1413          1        522
           t |        820          .5    .5003052          0          1
     treated |        820    .8073171    .3946469          0          1
         fte |        801    17.59457    9.022517          0         80
          bk |        820    .4170732    .4933761          0          1
-------------+---------------------------------------------------------
         kfc |        820     .195122    .3965364          0          1
        roys |        820    .2414634    .4282318          0          1
      wendys |        820    .1463415    .3536639          0          1

. diff fte ,t(treated) p(t) kernel id(id) logit cov(bk kfc roys) report support
KERNEL PROPENSITY SCORE MATCHING DIFFERENCE-IN-DIFFERENCES
    Estimation on common support
    Report - Propensity score estimation with logit command
    Atention: _pscore is estimated at baseline

Iteration 0:   log likelihood = -198.21978
Iteration 1:   log likelihood = -196.77862
Iteration 2:   log likelihood =  -196.7636
Iteration 3:   log likelihood =  -196.7636

Logistic regression                               Number of obs   =        404
                                                  LR chi2(3)      =       2.91
                                                  Prob > chi2     =     0.4053
Log likelihood =  -196.7636                       Pseudo R2       =     0.0073

------------------------------------------------------------------------------
     treated | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          bk |   .3108387   .3561643     0.87   0.383    -.3872306    1.008908
         kfc |   .6814511   .4335455     1.57   0.116    -.1682824    1.531185
        roys |    .520356   .4011747     1.30   0.195     -.265932    1.306644
       _cons |    1.05315   .2998708     3.51   0.000      .465414    1.640886
------------------------------------------------------------------------------
    Matching iterations...
.....................................................................................................................................
> ...................................................................................................................................
> ..............................................................
DIFFERENCE-IN-DIFFERENCES ESTIMATION RESULTS
Number of observations in the DIFF-IN-DIFF: 795
            Before         After    
   Control: 78             76          154
   Treated: 326            315         641
            404            391
--------------------------------------------------------
 Outcome var.   | fte     | S. Err. |   |t|   |  P>|t|
----------------+---------+---------+---------+---------
Before          |         |         |         | 
   Control      | 20.040  |         |         | 
   Treated      | 17.065  |         |         | 
   Diff (T-C)   | -2.975  | 0.943   | -3.16   | 0.002***
After           |         |         |         | 
   Control      | 17.449  |         |         | 
   Treated      | 17.499  |         |         | 
   Diff (T-C)   | 0.050   | 0.955   | 0.05    | 0.958
                |         |         |         | 
Diff-in-Diff    | 3.026   | 1.342   | 2.25    | 0.024**
--------------------------------------------------------
R-square:    0.02
* Means and Standard Errors are estimated by linear regression
**Inference: *** p<0.01; ** p<0.05; * p<0.1
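
As a check on how the table fits together, the Diff-in-Diff row is the After difference (Treated minus Control) minus the Before difference. Using the rounded figures printed above:

```latex
\text{Diff-in-Diff}
  = (17.499 - 17.449) - (17.065 - 20.040)
  = 0.050 - (-2.975)
  = 3.025
```

This agrees with the reported 3.026 up to rounding of the displayed means, and 3.026 / 1.342 ≈ 2.25 reproduces the reported |t|.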




. diff fte ,t(treated) p(t) kernel id(id) logit cov(bk kfc roys)  report support test
    Report - Propensity score estimation with logit command
    Atention: _pscore is estimated at baseline

Iteration 0:   log likelihood = -198.21978
Iteration 1:   log likelihood = -196.77862
Iteration 2:   log likelihood =  -196.7636
Iteration 3:   log likelihood =  -196.7636

Logistic regression                               Number of obs   =        404
                                                  LR chi2(3)      =       2.91
                                                  Prob > chi2     =     0.4053
Log likelihood =  -196.7636                       Pseudo R2       =     0.0073

------------------------------------------------------------------------------
     treated | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          bk |   .3108387   .3561643     0.87   0.383    -.3872306    1.008908
         kfc |   .6814511   .4335455     1.57   0.116    -.1682824    1.531185
        roys |    .520356   .4011747     1.30   0.195     -.265932    1.306644
       _cons |    1.05315   .2998708     3.51   0.000      .465414    1.640886
------------------------------------------------------------------------------
    Matching iterations...
.....................................................................................................................................
> ...................................................................................................................................
> ..............................................................
TWO-SAMPLE T TEST
    Test on common support

Number of observations (baseline): 404
            Before         After    
   Control: 78             -           78
   Treated: 326            -           326
            404            -

t-test at period = 0:
----------------------------------------------------------------------------------------------
Weighted Variable(s) |   Mean Control   | Mean Treated |    Diff.   |   |t|   |  Pr(|T|>|t|)
---------------------+------------------+--------------+------------+---------+---------------
fte                  | 20.040           | 17.065       | -2.975     |  2.89   | 0.0041***
bk                   | 0.468            | 0.408        | -0.060     |  1.21   | 0.2259
kfc                  | 0.144            | 0.209        | 0.064      |  1.69   | 0.0911*
roys                 | 0.272            | 0.252        | -0.020     |  0.46   | 0.6462
----------------------------------------------------------------------------------------------
*** p<0.01; ** p<0.05; * p<0.1
Attention: option kernel weighs variables in cov(varlist)
Means and t-test are estimated by linear regression

