
Propensity Score Matching (PSM): A Worked Example and Stata Implementation



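Policy background:

The National Supported Work (NSW) Demonstration

Research objective:

To test how participating in the program (training), compared with not participating, affects earnings.

Basic idea:

We want to compare the earnings of the trained group (the treatment group) when they receive training with the earnings the same people would have had without training. In practice, only the first of these is observed: what would have happened to the treatment group without training cannot be observed. This unobserved state is called the counterfactual. Matching is a way to approximate it. In propensity score matching (PSM), the sample is split into two groups according to a treatment indicator: the treatment group, which in this example is the group that received training after NSW was implemented, and the comparison group, which in this example is the group that did not receive training after NSW was implemented.

The basic idea of PSM is to match treated and comparison observations so that they are as similar as possible in their observed characteristics, and then to use the difference in earnings between the trained group (treatment group) and the untrained group (comparison group) to judge the causal effect of training on earnings.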
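The idea above can be written compactly in potential-outcomes notation. The symbols below are standard notation introduced here for illustration and do not appear in the original post: T_i is the treatment indicator (the variable t), X_i the covariates, and Y_i(1), Y_i(0) the 1978 earnings (re78) of person i with and without training.

```latex
\begin{aligned}
p(X_i) &= \Pr(T_i = 1 \mid X_i) \\
\mathrm{ATT} &= E\left[\,Y_i(1) - Y_i(0) \mid T_i = 1\,\right] \\
             &= E\left[\,Y_i(1) \mid T_i = 1\,\right]
              - \underbrace{E\left[\,Y_i(0) \mid T_i = 1\,\right]}_{\text{counterfactual, not observed}}
\end{aligned}
```

Matching replaces the unobserved second term with the average outcome of comparison observations whose estimated propensity score p(X) is close to that of the treated observations.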

. desc

Contains data from C:\Users\Metrics\Desktop\计量经济学\高级\A15-psm\data\ldw_exper.dta
  obs:           445                          
 vars:            12                          30 Jan 2013 12:47
 size:        12,015                          
--------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
--------------------------------------------------------------------------------------
t               byte    %8.0g                 participation in job training program
age             byte    %8.0g                 age
educ            byte    %8.0g                 years of education
black           byte    %8.0g                 indicator for African-American
hisp            byte    %8.0g                 indicator for Hispanic
married         byte    %8.0g                 indicator for married
nodegree        byte    %8.0g                 indicator for more than grade school but
                                                less than high-school education
re74            float   %9.0g                 real earnings in 1974 (in thousands of
                                                1978 $)
re75            float   %9.0g                 real earnings in 1975 (in thousands of
                                                1978 $)
re78            float   %9.0g                 real earnings in 1978 (in thousands of
                                                1978 $)
u74             float   %9.0g                 indicator for unemployed in 1974
u75             float   %9.0g                 indicator for unemployed in 1975
--------------------------------------------------------------------------------------
Sorted by: 

Summary statistics by treatment group


bysort t: sum age educ nodegree black hisp married u74 u75

--------------------------------------------------------------------------------------
-> t = 0

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         age |        260    25.05385    7.057745         17         55
        educ |        260    10.08846    1.614325          3         14
    nodegree |        260    .8346154    .3722439          0          1
       black |        260    .8269231    .3790434          0          1
        hisp |        260    .1076923    .3105893          0          1
-------------+---------------------------------------------------------
     married |        260    .1538462    .3614971          0          1
         u74 |        260         .75    .4338478          0          1
         u75 |        260    .6846154    .4655651          0          1

--------------------------------------------------------------------------------------
-> t = 1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         age |        185    25.81622    7.155019         17         48
        educ |        185    10.34595     2.01065          4         16
    nodegree |        185    .7081081    .4558666          0          1
       black |        185    .8432432    .3645579          0          1
        hisp |        185    .0594595    .2371244          0          1
-------------+---------------------------------------------------------
     married |        185    .1891892    .3927217          0          1
         u74 |        185    .7081081    .4558666          0          1
         u75 |        185          .6    .4912274          0          1






Descriptive analysis

tabulate t, summarize(re78) means standard

Output:

. tabulate t, summarize(re78) means standard

participati |     Summary of real
  on in job |  earnings in 1978 (in
   training |  thousands of 1978 $)
    program |        Mean   Std. Dev.
------------+------------------------
          0 |   4.5548023   5.4838368
          1 |   6.3491454   7.8674047
------------+------------------------
      Total |   5.3007651   6.6314934
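From the table, the raw gap in mean 1978 earnings is 6.349 - 4.555 ≈ 1.794 thousand 1978 dollars. As a rough check before any matching, one could test this unadjusted difference with Stata's built-in ttest. This is only a sketch of a naive comparison, assuming ldw_exper.dta is already in memory; it is not part of the original walkthrough.

```stata
* Assumes ldw_exper.dta is already loaded in memory
* Two-sample t test of 1978 earnings by treatment status (equal variances by default)
ttest re78, by(t)
```

Note that ttest reports the difference as group 0 minus group 1, so the sign is the reverse of the treated-minus-control difference computed above.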

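Setting the seed

set seed 20180105     // set the random-number seed

gen u=runiform()      // draw u from the uniform distribution on (0,1)

sort u                        // sort the observations by u, i.e. into random order

These commands generate pseudorandom numbers that are uniformly distributed on (0,1) and sort the observations by them, so the data are in random order before matching. psmatch2 asks for this because the sort order can affect the results when several observations share the same propensity score. Do not use order u for this purpose: order only rearranges the variables (columns) and leaves the order of the observations unchanged.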

Defining macros

local v1 "t"

local v2 "age edu black hisp married re74 re75 u74 u75"

global x "`v1' `v2' "
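To check what the global macro holds before passing it to a command, one can simply print it. The sketch below repeats the definitions so that it runs on its own as a do-file (a local macro only exists within the do-file run, or interactive session, in which it is defined); it spells out educ in full rather than using the abbreviation edu.

```stata
* Define two local macros and combine them into one global macro
local v1 "t"
local v2 "age educ black hisp married re74 re75 u74 u75"
global x "`v1' `v2'"

* Print the contents of the global macro x
display "$x"
```

If the locals were defined in an earlier, separate do-file run, they would expand to empty strings here and $x would not contain the variable list, which is a common source of confusing errors.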

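Propensity score matching

psmatch2 is a user-written command; if it is not installed, it can be obtained with ssc install psmatch2.

psmatch2  $x, out(re78) neighbor(1)  ate   ties logit common   // 1:1 matching

Here $ dereferences the global macro x, so the command above is equivalent to

psmatch2 t age edu black hisp married re74 re75 u74 u75, out(re78) neighbor(1)  ate   ties logit common

(edu is an abbreviation of the variable educ, which Stata accepts by default.) The first variable, t, is the treatment indicator and the remaining variables are the covariates of the propensity score model. out(re78) sets the outcome, neighbor(1) requests one nearest neighbor, ate reports the ATU and ATE in addition to the ATT, ties uses all controls tied for nearest neighbor, logit estimates the propensity score with a logit model, and common restricts the analysis to the region of common support.

Output: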

 psmatch2  $x, out(re78) neighbor(1)  ate   ties logit common  

Logistic regression                             Number of obs     =        445
                                                LR chi2(9)        =      11.70
                                                Prob > chi2       =     0.2308
Log likelihood = -296.25026                     Pseudo R2         =     0.0194

------------------------------------------------------------------------------
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0142619   .0142116     1.00   0.316    -.0135923    .0421162
        educ |   .0499776   .0564116     0.89   0.376     -.060587    .1605423
       black |   -.347664   .3606532    -0.96   0.335    -1.054531    .3592032
        hisp |   -.928485     .50661    -1.83   0.067    -1.921422    .0644523
     married |   .1760431   .2748817     0.64   0.522    -.3627151    .7148012
        re74 |  -.0339278   .0292559    -1.16   0.246    -.0912683    .0234127
        re75 |     .01221   .0471351     0.26   0.796    -.0801731    .1045932
         u74 |  -.1516037   .3716369    -0.41   0.683    -.8799987    .5767913
         u75 |  -.3719486    .317728    -1.17   0.242    -.9946841    .2507869
       _cons |  -.4736308   .8244205    -0.57   0.566    -2.089465    1.142204
------------------------------------------------------------------------------
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
            re78  Unmatched | 6.34914538   4.55480228   1.79434311   .632853552     2.84
                        ATT | 6.40495818   4.99436488    1.4105933   .839875971     1.68
                        ATU | 4.52683013   6.15618973    1.6293596            .        .
                        ATE |                           1.53668776            .        .
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.

 psmatch2: |   psmatch2: Common
 Treatment |        support
assignment | Off suppo  On suppor |     Total
-----------+----------------------+----------
 Untreated |        11        249 |       260 
   Treated |         2        183 |       185 
-----------+----------------------+----------
     Total |        13        432 |       445 
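The note in the output says that the psmatch2 standard errors ignore the fact that the propensity score is estimated. One possible cross-check, not used in the original post, is Stata's official teffects psmatch command (available from Stata 13), whose standard errors do account for the estimated score. The sketch below assumes ldw_exper.dta is in memory. It does not impose the common-support restriction that the common option applies in psmatch2, so its estimate need not coincide with the ATT above.

```stata
* Assumes Stata 13 or later and ldw_exper.dta already loaded in memory
* ATT from 1:1 nearest-neighbor matching on a logit propensity score
teffects psmatch (re78) (t age educ black hisp married re74 re75 u74 u75, logit), ///
    atet nneighbor(1)
```

The /// line continuation only works inside a do-file; when typing interactively, put the command on a single line.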

Next, use pstest to check whether matching has balanced the covariates well

. pstest age edu black hisp married re74 re75 u74 u75, both graph

----------------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test    |  V(T)/
Variable          Matched | Treated Control    %bias  |bias| |    t    p>|t| |  V(C)
--------------------------+----------------------------------+---------------+----------
age                    U  | 25.816   25.054     10.7         |   1.12  0.265 |  1.03
                       M  | 25.781   25.383      5.6    47.7 |   0.52  0.604 |  0.91
                          |                                  |               |
educ                   U  | 10.346   10.088     14.1         |   1.50  0.135 |  1.55*
                       M  | 10.322   10.415     -5.1    63.9 |  -0.49  0.627 |  1.52*
                          |                                  |               |
black                  U  | .84324   .82692      4.4         |   0.45  0.649 |     .
                       M  | .85246   .86339     -2.9    33.0 |  -0.30  0.765 |     .
                          |                                  |               |
hisp                   U  | .05946   .10769    -17.5         |  -1.78  0.076 |     .
                       M  | .06011   .04372      5.9    66.0 |   0.71  0.481 |     .
                          |                                  |               |
married                U  | .18919   .15385      9.4         |   0.98  0.327 |     .
                       M  | .18579   .19126     -1.4    84.5 |  -0.13  0.894 |     .
                          |                                  |               |
re74                   U  | 2.0956    2.107     -0.2         |  -0.02  0.982 |  0.74*
                       M  | 2.0672   1.9222      2.7 -1166.6 |   0.27  0.784 |  0.88
                          |                                  |               |
re75                   U  | 1.5321   1.2669      8.4         |   0.87  0.382 |  1.08
                       M  | 1.5299   1.6446     -3.6    56.7 |  -0.32  0.748 |  0.82
                          |                                  |               |
u74                    U  | .70811      .75     -9.4         |  -0.98  0.326 |     .
                       M  | .71038   .75956    -11.1   -17.4 |  -1.06  0.288 |     .
                          |                                  |               |
u75                    U  |     .6   .68462    -17.7         |  -1.85  0.065 |     .
                       M  | .60656   .63388     -5.7    67.7 |  -0.54  0.591 |     .
                          |                                  |               |
----------------------------------------------------------------------------------------
* if variance ratio outside [0.75; 1.34] for U and [0.75; 1.34] for M

-----------------------------------------------------------------------------------
 Sample    | Ps R2   LR chi2   p>chi2   MeanBias   MedBias      B      R     %Var
-----------+-----------------------------------------------------------------------
 Unmatched | 0.019     11.75    0.227     10.2       9.4      33.1*   0.82     50
 Matched   | 0.008      3.87    0.920      4.9       5.1      20.6    1.09     25
-----------------------------------------------------------------------------------
* if B>25%, R outside [0.5; 2]
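To read the %bias and %reduct |bias| columns, it helps to see how they are formed. The formulas below are a sketch of the usual standardized-bias definition; that the variances in the denominator come from the unmatched sample is an assumption here, and the subscripts T, C, U and M (treated, control, unmatched, matched) are notation introduced for illustration.

```latex
\begin{aligned}
\%\text{bias} &= 100 \cdot \frac{\bar{x}_T - \bar{x}_C}{\sqrt{\left(s_T^2 + s_C^2\right)/2}} \\[4pt]
\%\text{reduct}\,|\text{bias}| &= 100 \cdot \left(1 - \frac{|\text{bias}_M|}{|\text{bias}_U|}\right)
\end{aligned}
```

For age before matching, the summary statistics reported earlier give (25.816 - 25.054) / sqrt((7.155^2 + 7.058^2)/2) ≈ 0.762 / 7.107 ≈ 10.7%, which agrees with the table. The large negative reduction for re74 (-1166.6) arises because the unmatched bias was already close to zero (-0.2), so even a small matched bias of 2.7 is a large relative increase; it does not by itself indicate poor balance.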


psgraph
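psgraph draws the distribution of the propensity score by treatment status and support. As a complementary view, one could overlay kernel densities of the score for the two groups. The sketch below assumes psmatch2 has just been run, so that the variables _pscore and _treated it creates are in the data; the line pattern, labels and titles are illustrative choices.

```stata
* Run after psmatch2, which stores the estimated score in _pscore
* and the treatment indicator in _treated
twoway (kdensity _pscore if _treated == 1)                  ///
       (kdensity _pscore if _treated == 0, lpattern(dash)), ///
       legend(order(1 "Treated" 2 "Untreated"))             ///
       xtitle("Estimated propensity score") ytitle("Density")
```

Substantial overlap between the two curves suggests that the common-support assumption is plausible for most of the sample.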

