How to Release Confidential Data When Publishing Replication Materials

Original · 连享会 · 2022-12-31

Author: 常丁祎 (Shandong University)
Email: changdingyi1126@163.com

1. Introduction
2. Stata example

2.1 Stata steps
2.2 Validating the synthetic data

3. Conclusion
4. References
5. Related posts

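1. Introduction

Most top economics journals abroad (AER, QJE, JPE, the AEJ series, and others) now require authors to submit the raw data and code behind a paper, and they publish what the authors upload. This discourages academic misconduct and also protects the authors' intellectual property. Opening data and code to other researchers helps the field move forward, but it creates a real problem for submitting authors, who often work with data that are confidential or covered by an agreement that forbids redistribution.

In that situation we need to process the raw data before release, for example by building a synthetic dataset that satisfies every privacy constraint while preserving the key structure of the original, so that other researchers can use it to approximately reproduce the paper's main findings. One way to do this is multiple imputation. The steps below follow the blog post How to come public, with private data.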

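2. Stata example

To illustrate the method we use a readily available online dataset: an excerpt from the Swiss Labor Market Survey 1998, distributed as example data with the Stata command oaxaca (Jann, 2008).

Suppose you signed a confidentiality agreement to work with the Swiss survey data and are now ready to submit your paper, but the journal asks for the data and code. Since the agreement forbids releasing the dataset, the suggestion here is to provide 5 synthetic datasets instead. With them, other researchers can run your code and reproduce the paper's empirical results. The Stata steps are as follows:

2.1 Stata steps

First, load the example data oaxaca.dta into Stata. The dataset contains 15 variables and 1,647 observations.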

. clear all

. use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear  // load the oaxaca example data
(Excerpt from the Swiss Labor Market Survey 1998)

. codebook, c

Variable    Obs Unique      Mean       Min       Max  Label
---------------------------------------------------------------------------
lnwage     1434    675  3.357604   .507681  5.259097  log hourly wages
educ       1647     10  11.40134         5      17.5  years of education
exper      1434    563  13.15324         0  49.16667  years of work experience
tenure     1434    323  7.860937         0  44.83333  years of job tenure
isco       1434      9  4.014644         1         9  occupation (ISCO)
female     1647      2  .5391621         0         1  sex of respondent (1=female)
lfp        1647      2   .870674         0         1  labor force participation
age        1647     45  39.25379        18        62  age of respondent
agesq      1647     45  1662.489       324      3844  age squared
single     1647      2   .343048         0         1  single
married    1647      2  .5233758         0         1  married
divorced   1647      2  .1335762         0         1  divorced
kids6      1647      5  .2847602         0         4  number of childern ages 6 and younger
kids714    1647      5  .3290832         0         4  number of children ages 7 to 14
wt         1647      6  1.006181  .5302977  3.181786  sampling weights
---------------------------------------------------------------------------

. misstable summarize  // report missing values in the dataset
                                              Obs<.
                                 +--------------------------
         |                       | Unique                  
Variable |  Obs=.  Obs>.  Obs<.  | values      Min       Max
---------+-----------------------+--------------------------
  lnwage |    213         1,434  |   >500  .507681  5.259097
   exper |    213         1,434  |   >500        0  49.16667
  tenure |    213         1,434  |    323        0  44.83333
    isco |    213         1,434  |      9        1         9
------------------------------------------------------------

. count if lfp==0
  213

. list lnwage exper tenure isco lfp in 1435/1450

      +--------------------------------------+
      | lnwage   exper   tenure   isco   lfp |
      |--------------------------------------|
1435. |      .       .        .      .     0 |
1436. |      .       .        .      .     0 |
1437. |      .       .        .      .     0 |
1438. |      .       .        .      .     0 |
1439. |      .       .        .      .     0 |
      |--------------------------------------|
1440. |      .       .        .      .     0 |
1441. |      .       .        .      .     0 |
1442. |      .       .        .      .     0 |
1443. |      .       .        .      .     0 |
1444. |      .       .        .      .     0 |
      |--------------------------------------|
1445. |      .       .        .      .     0 |
1446. |      .       .        .      .     0 |
1447. |      .       .        .      .     0 |
1448. |      .       .        .      .     0 |
1449. |      .       .        .      .     0 |
      |--------------------------------------|
1450. |      .       .        .      .     0 |
      +--------------------------------------+

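The output shows that four variables in the original data, lnwage, exper, tenure, and isco, contain missing values, and that these values are missing when lfp=0 (213 observations in both counts).

To handle this, we generate a "seed" variable, a uniform random variable on the interval 0-100, that will be used to create the synthetic datasets. We also generate an id variable that numbers the observations.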

. gen id = _n  // number the observations

. set seed 10101

. gen seed = runiform(0,100)

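The next step is to add as many new observations as the original dataset contains. Specifically, expand 1648 in 1 turns the first observation into 1,648 copies, which appends 1,647 new observations, and flags them with tag=1. The variables listed in vlist are then set to missing for every observation with tag=1: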

. expand 1648 in 1, gen(tag)
(1,647 observations created)

. local vlist "lnwage educ exper tenure isco female lfp age single married divorced kids6 kids714 wt"
. foreach i of varlist `vlist' {
        replace `i'=. if tag==1
  }
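
To see what expand does here, the following toy example applies the same two steps on a small scale. The 3 observations and the variable x are made up for illustration and are not part of the original data, so this is only a sketch of the mechanics:

```stata
clear
set obs 3
generate x = _n * 10               // toy data: x = 10, 20, 30
expand 4 in 1, generate(tag)       // obs 1 becomes 4 copies: 3 new rows, tag==1
replace x = . if tag == 1          // blank out the new rows, as in the loop above
list, clean
```

The three appended rows carry tag=1 and a missing x, while the original rows keep tag=0. In the article's data the same logic yields 1,647 + 1,647 = 3,294 rows.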

For the 1,647 new observations with tag=1, the values of seed and lfp have to be generated afresh.

. replace seed = runiform(0,100) if tag==1
(1,647 real changes made)

. replace lfp = runiform()<.87 if tag==1
(1,647 real changes made)
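
The cutoff .87 mirrors the share of labor force participants in the original data (the mean of lfp is .870674 in the codebook output). As a quick sanity check, the sketch below confirms that runiform() < .87 yields a 0/1 variable whose mean is close to .87. It uses simulated data only, and the sample size of 100,000 is an arbitrary choice for illustration:

```stata
clear
set seed 10101
set obs 100000
generate byte lfp = runiform() < .87   // 1 with probability .87, else 0
summarize lfp
display "share with lfp==1: " %5.3f r(mean)
```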

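The next step is to generate the synthetic datasets by multiple imputation, using the mi impute chained command (abbreviated to mi impute chain in the log below). We consider predictive mean matching (pmm) the best choice of imputation method here: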

. mi set wide

. mi register impute lnwage educ exper tenure  ///
                     isco female age single married  ///
                     kids6 kids714 wt

. mi impute chain  ///
     (pmm, knn(100)) educ female age single married kids6 kids714 wt (pmm if lfp==1, knn(100)) ///
     lnwage  exper tenure isco = seed lfp, add(5)
note: missing-value pattern is monotone; no iteration performed

Conditional models (monotone):
              educ: pmm educ seed lfp , knn(100)
            female: pmm female educ seed lfp , knn(100)
               age: pmm age female educ seed lfp , knn(100)
            single: pmm single age female educ seed lfp , knn(100)
           married: pmm married single age female educ seed lfp , knn(100)
             kids6: pmm kids6 married single age female educ seed lfp , knn(100)
           kids714: pmm kids714 kids6 married single age female educ seed lfp , knn(100)
                wt: pmm wt kids714 kids6 married single age female educ seed lfp , knn(100)
            lnwage: pmm lnwage wt kids714 kids6 married single age female educ seed lfp if lfp==1, knn(100)
             exper: pmm exper lnwage wt kids714 kids6 married single age female educ seed lfp if lfp==1, knn(100)
            tenure: pmm tenure exper lnwage wt kids714 kids6 married single age female educ seed lfp if lfp==1, knn(100)
              isco: pmm isco tenure exper lnwage wt kids714 kids6 married single age female educ seed lfp if lfp==1, knn(100)

Performing chained iterations ...

Multivariate imputation              Imputations =        5
Chained equations                          added =        5
Imputed: m=1 through m=5                 updated =        0
                                                             
Initialization: monotone              Iterations =        0
                                         burn-in =        0

      educ: predictive mean matching
    female: predictive mean matching
       age: predictive mean matching
    single: predictive mean matching
   married: predictive mean matching
     kids6: predictive mean matching
   kids714: predictive mean matching
        wt: predictive mean matching
    lnwage: predictive mean matching
     exper: predictive mean matching
    tenure: predictive mean matching
      isco: predictive mean matching

----------------------------------------------------------
           |               Observations per m
           |----------------------------------------------
  Variable |   Complete   Incomplete   Imputed |     Total
-----------+-----------------------------------+----------
      educ |       1647         1647      1647 |      3294
    female |       1647         1647      1647 |      3294
       age |       1647         1647      1647 |      3294
    single |       1647         1647      1647 |      3294
   married |       1647         1647      1647 |      3294
     kids6 |       1647         1647      1647 |      3294
   kids714 |       1647         1647      1647 |      3294
        wt |       1647         1647      1647 |      3294
    lnwage |       1434         1458      1458 |      2892
     exper |       1434         1458      1458 |      2892
    tenure |       1434         1458      1458 |      2892
      isco |       1434         1458      1458 |      2892
----------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
 of the number of filled-in observations.)

. forvalues i = 1/5 {
        preserve
            keep if tag==1
            keep _`i'_* lfp
            ren _`i'_* *
            save fake_oaxaca_`i', replace
        restore
}
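
To make the matching idea concrete, here is a simplified hand-rolled version of predictive mean matching on simulated data. Everything in it is made up for illustration: the variables x and y, the split into 200 donors and 100 recipients, and k = 10. Unlike mi impute pmm, it does not redraw the regression coefficients before predicting, so treat it as a sketch of the donor-matching step only, not as a replacement for the official command:

```stata
clear
set seed 2022
set obs 300
generate id = _n
generate x  = rnormal()
generate y  = 1 + 2*x + rnormal()
replace  y  = . if id > 200            // rows 201-300 play the role of tag==1

quietly regress y x                    // fit on the 200 complete rows (donors)
predict double yhat, xb                // predicted mean for every row

local k = 10                           // number of nearest donors, cf. knn()
generate double y_pmm = y
generate double dist  = .
forvalues i = 201/300 {
    quietly summarize yhat if id == `i', meanonly
    scalar target = r(mean)            // predicted mean of recipient i
    quietly replace dist = abs(yhat - scalar(target)) if id <= 200
    sort dist id                       // donors first; missing dist sorts last
    scalar donor = y[ceil(`k' * runiform())]   // random pick among k closest
    quietly replace y_pmm = scalar(donor) if id == `i'
}
sort id
summarize y y_pmm
```

Every filled-in value of y_pmm is an observed y borrowed from a donor with a similar predicted mean, which is why pmm never produces values outside the range seen in the real data.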

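This produces 5 sets of imputed variables, which are saved as 5 distinct synthetic datasets. They share a similar structure with the original data and can be released publicly and used to replicate the paper's results.

2.2 Validating the synthetic data

We now check the synthetic datasets by estimating a simple linear regression, a quantile regression (at the 10th percentile), and a Heckman two-step model on the original data and on each synthetic dataset: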

frame create test

frame test: {
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    qui: reg lnwage educ exper tenure female
    est sto m1
    qui: qreg lnwage educ exper tenure female, q(10)
    est sto m2
    qui: heckman lnwage educ exper tenure female age, ///
        selec(lfp = educ female age single married kids6 kids714) two
    est sto m3
}

forvalues i = 1/5 {
    frame test: {
        use fake_oaxaca_`i', clear
        qui: reg lnwage educ exper tenure female
        est sto m1`i'
        qui: qreg lnwage educ exper tenure female, q(10)
        est sto m2`i'
        qui: heckman lnwage educ exper tenure female age, ///
            selec(lfp = educ female age single married kids6 kids714) two
        est sto m3`i'
    }
}

. ** OLS
. esttab m1 m11 m12 m13 m14 m15, mtitle(Original Fake1 Fake2 Fake3 Fake4 Fake5)

----------------------------------------------------------------------------------
             (1)         (2)          (3)          (4)          (5)         (6)
        Original       Fake1        Fake2        Fake3        Fake4       Fake5
----------------------------------------------------------------------------------
educ      0.0848***   0.0664***    0.0773***    0.0676***    0.0647***   0.0597***
         (16.34)     (13.73)      (16.11)      (12.51)      (14.00)     (12.73)
                                                                                 
exper     0.0111***  0.00908***   0.00992***    0.0122***   0.00655***  0.00584***
          (7.22)      (6.34)       (6.98)       (7.64)       (4.95)      (3.96)
                                                                                 
tenure   0.00771***  0.00747***    0.0100***   0.00112      0.00626***  0.00860***
          (4.10)      (4.17)       (5.70)       (0.57)       (3.63)      (4.64)
                                                                                  
female   -0.0841***  -0.0508*     -0.0767**    -0.0914***   -0.0931***  -0.0259
         (-3.35)     (-2.18)      (-3.28)      (-3.56)      (-4.05)     (-1.05)
                                                                                 
_cons      2.213***    2.469***     2.308***     2.441***     2.540***    2.558***
         (32.38)     (39.69)      (36.55)      (33.93)      (42.24)     (40.47)
----------------------------------------------------------------------------------
N           1434        1458         1458         1458         1458        1458
----------------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. ** qreg 10
. esttab m2 m21 m22 m23 m24 m25, mtitle(Original Fake1 Fake2 Fake3 Fake4 Fake5)

----------------------------------------------------------------------------------
             (1)          (2)          (3)          (4)         (5)         (6)
        Original        Fake1        Fake2        Fake3       Fake4       Fake5
----------------------------------------------------------------------------------
educ       0.103***    0.0698***    0.0810***    0.0768***   0.0717***   0.0637***
          (6.21)       (6.02)       (7.09)       (4.15)      (6.57)      (6.11)
                                                                                 
exper     0.0200***    0.0111**     0.0111**     0.0135*    0.00645*    0.00341
          (4.06)       (3.23)       (3.29)       (2.45)      (2.06)      (1.04)
                                                                                 
tenure  0.000669      0.00592      0.00987*     0.00240     0.00270     0.00460
          (0.11)       (1.38)       (2.35)       (0.36)      (0.66)      (1.12)
                                                                                 
female    -0.151      -0.0822      -0.0657       -0.166      -0.116*     0.0175
         (-1.87)      (-1.47)      (-1.18)      (-1.88)     (-2.13)      (0.32)
                                                                                 
_cons      1.462***     1.963***     1.791***     1.869***    2.067***    2.104***
          (6.67)      (13.18)      (11.90)       (7.58)     (14.55)     (14.95)
----------------------------------------------------------------------------------
N           1434         1458         1458         1458        1458        1458
----------------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

. ** heckman
. esttab m3 m31 m32 m33 m34 m35, mtitle(Original Fake1 Fake2 Fake3 Fake4 Fake5)

--------------------------------------------------------------------------------
             (1)         (2)         (3)         (4)         (5)          (6)
        Original       Fake1       Fake2       Fake3       Fake4        Fake5
--------------------------------------------------------------------------------
lnwage                                                                               
educ      0.0717***   0.0617***   0.0717***   0.0644***   0.0585***    0.0487***
         (13.13)     (11.83)     (14.16)     (11.32)     (12.20)       (9.84)
                                                                               
exper    0.00179     0.00397*    0.00246     0.00248    -0.00347*    -0.00398*
          (0.94)      (2.18)      (1.40)      (1.24)     (-2.11)      (-2.21)
                                                                             
tenure   0.00200     0.00481*    0.00637*** -0.00295     0.00101      0.00290
          (1.01)      (2.57)      (3.52)     (-1.49)      (0.58)       (1.52)
                                                                               
female    -0.105***  -0.0721*     -0.132***   -0.211***   -0.185***   -0.0583
         (-3.59)     (-2.48)     (-4.10)     (-6.35)     (-6.57)      (-1.80)
                                                                               
age       0.0146***  0.00740***   0.0108***   0.0122***   0.0141***    0.0149***
          (7.92)      (4.50)      (6.81)      (6.79)      (9.00)       (8.99)
                                                                               
_cons      1.991***    2.332***    2.087***    2.168***    2.246***     2.297***
         (27.12)     (33.01)     (29.79)     (27.02)     (34.05)      (33.22)
--------------------------------------------------------------------------------
lfp                                                                                
educ       0.149***    0.210***    0.148***     0.128***    0.136***    0.149***
          (5.37)      (6.87)      (5.72)       (5.19)      (5.59)      (5.80)
                                                                               
female    -1.785***   -1.696***   -1.662***    -1.463***   -1.510***   -1.811***
        (-11.09)    (-10.62)     (-9.66)      (-9.78)    (-10.91)    (-10.17)
                                                                               
age      -0.0388*** -0.00878     -0.0170**    -0.0193***  -0.0271***  -0.0289***
         (-5.77)     (-1.39)     (-2.94)      (-3.55)     (-4.61)     (-4.78)
                                                                               
single   -0.0998      -0.361     -0.0764       -0.192      -0.497*     -0.681***
         (-0.43)     (-1.63)     (-0.37)      (-0.96)     (-2.36)     (-3.31)
                                                                               
married   -0.867***   -0.775***   -0.544***    -0.596***   -0.746***   -0.644***
         (-5.48)     (-4.15)     (-3.43)      (-3.60)     (-4.22)     (-3.93)
                                                                                        
kids6     -0.716***   -0.571***   -0.499***    -0.599***   -0.671***   -0.563***
         (-8.71)     (-7.40)     (-6.84)      (-7.55)     (-8.84)     (-7.03)
                                                                               
kids714   -0.343***   -0.345***   -0.206**     -0.292***   -0.258***   -0.482***
         (-5.26)     (-5.22)     (-2.90)      (-4.55)     (-3.98)     (-6.73)
                                                                               
_cons      3.543***    1.483**     2.199***     2.434***    2.858***    3.112***
          (7.29)      (3.15)      (4.95)       (5.97)      (6.88)      (7.06)
--------------------------------------------------------------------------------
/mills                                                                          
lambda    -0.123     0.00819      0.0933        0.332***    0.185**    -0.0255
         (-1.88)      (0.11)      (1.15)       (4.18)      (2.70)      (-0.31)
--------------------------------------------------------------------------------
N           1647        1647        1647         1647        1647         1647
--------------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

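The regressions on the 5 synthetic datasets do not reproduce the original estimates exactly. Overall, though, the differences are moderate: most coefficients keep their sign and significance, although individual estimates can move noticeably (in the OLS table, for example, tenure in Fake3 and female in Fake5 lose significance). The main conclusions are broadly unaffected.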
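
To put a rough number on the gap, the OLS coefficients on educ reported above can be averaged across the five synthetic datasets and compared with the original estimate. The scalar names below are ours, and the numbers are copied from the OLS table:

```stata
scalar b_orig = 0.0848                                       // Original
scalar b_fake = (0.0664 + 0.0773 + 0.0676 + 0.0647 + 0.0597) / 5  // Fake1-Fake5
display "mean synthetic educ coefficient: " %6.4f b_fake
display "relative difference (%): " %5.1f 100 * (b_fake - b_orig) / b_orig
```

The synthetic average is about 0.067 against 0.0848 in the original data, roughly 21% smaller, while the sign and significance level of the coefficient are the same in all six columns.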

Next, we compare the means and covariance matrices of the variables in the original data with those in two of the synthetic datasets:

. frame test: {
.         use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
(Excerpt from the Swiss Labor Market Survey 1998)
.         mean lnwage exper tenure educ   female   age single married kids6 kids714

Mean estimation                   Number of obs   =      1,434

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      lnwage |   3.357604   .0140235      3.330096    3.385113
       exper |   13.15324   .2632213       12.6369    13.66958
      tenure |   7.860937   .2144401      7.440287    8.281587
        educ |   11.53696   .0639585       11.4115    11.66242
      female |   .4762901   .0131934      .4504096    .5021706
         age |   38.83891   .2915321      38.26704    39.41079
      single |   .3891213   .0128794      .3638568    .4143859
     married |   .4700139   .0131845      .4441509     .495877
       kids6 |   .2182706   .0151344      .1885826    .2479586
     kids714 |   .2782427   .0172008      .2445013     .311984
--------------------------------------------------------------
.
.         corr lnwage exper tenure educ   female   age single married kids6 kids714 , cov
(obs=1,434)

             |   lnwage    exper   tenure     educ   female      age   single  married    kids6  kids714
-------------+------------------------------------------------------------------------------------------
      lnwage |   .28201
       exper |  1.23107  99.3553
      tenure |  1.03799  47.0903  65.9418
        educ |  .469384 -3.24851 -.510834  5.86604
      female | -.043298 -.484036 -.598583  -.14532  .249612
         age |  2.05353  79.3047  54.7529  1.62913  .213554  121.877
      single | -.061535 -1.71735 -1.22853 -.005669 -.001235 -2.87447  .237872
     married |  .044889  1.05484  .938517  .089909 -.027229  1.75406  -.18302  .249275
       kids6 |  .030479 -.557053 -.447353  .118061 -.034249 -.953649 -.077317  .096222  .328459
     kids714 |  .036036  .088118  .006038  .020763  .001368  .469835 -.099274  .100813  .018081  .424272

.
. }

. forvalues i = 1/2 {
    frame test: {
       use fake_oaxaca_`i', clear
       mean lnwage exper tenure educ   female   age single married kids6 kids714
       corr lnwage exper tenure educ   female   age single married kids6 kids714 , cov
    }
}

Mean estimation                 Number of obs  = 1,458

------------------------------------------------------
         |      Mean   Std. Err.  [95% Conf. Interval]
---------+--------------------------------------------
  lnwage |  3.388637   .0126988   3.363727    3.413547
   exper |  12.95556   .2533696   12.45855    13.45257
  tenure |  7.501029    .201992   7.104803    7.897255
    educ |  11.59825   .0627407   11.47518    11.72132
  female |  .4657064   .0130682   .4400719     .491341
     age |  38.77092   .2892844   38.20346    39.33838
  single |  .3737997    .012675   .3489366    .3986628
 married |  .4835391    .013092    .457858    .5092202
   kids6 |  .2256516   .0151994   .1958365    .2554666
 kids714 |  .2921811   .0175129   .2578278    .3265343
------------------------------------------------------
(obs=1,458)

        |   lnwage    exper   tenure     educ   female      age   single  married    kids6  kids714
--------+------------------------------------------------------------------------------------------
 lnwage |  .235118
  exper |  1.08476   93.598
 tenure |  .825123  41.0372  59.4875
   educ |  .370826 -1.40578 -.144862  5.73927
 female | -.023944 -.422867 -.347883 -.073241  .248995
    age |  1.59116  77.3443  48.6604   2.5879  .185001  122.013
 single | -.051301 -1.51003 -1.02824 -.076044   .01523 -2.93971  .234234
married |  .027974   .78726  .844446  .127133  -.03248  1.67021 -.180871    .2499
  kids6 |  .025567 -.887329 -.501605  .238281 -.025544  -1.0073  -.07068  .092598   .33683
kids714 |  .014055 -.016863 -.031873  .029269 -.028408  .498002  -.08527  .109137 -.008324  .447173

(Excerpt from the Swiss Labor Market Survey 1998)

Mean estimation                   Number of obs   =      1,458

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      lnwage |   3.375149   .0131252      3.349403    3.400895
       exper |   13.38134   .2600573      12.87122    13.89147
      tenure |   7.981767   .2083602      7.573049    8.390485
        educ |   11.51749   .0640525      11.39184    11.64313
      female |   .4718793   .0130783      .4462249    .4975337
         age |   39.42593   .2911667      38.85478    39.99708
      single |   .3525377   .0125164      .3279856    .3770899
     married |   .5041152   .0130986      .4784211    .5298094
       kids6 |   .2139918   .0149255       .184714    .2432696
     kids714 |   .2716049   .0165868      .2390683    .3041415
--------------------------------------------------------------
(obs=1,458)

        |   lnwage    exper   tenure     educ   female      age   single  married    kids6  kids714
--------+------------------------------------------------------------------------------------------
 lnwage |   .25117
  exper |  1.18759  98.6042
 tenure |  1.00408  44.5726  63.2976
   educ |   .42614 -3.36842 -1.26497  5.98176
 female | -.035159 -.297665 -.319233 -.127854   .24938
    age |  1.90257  79.0684  49.8293  .911042  .277257  123.606
 single | -.050738 -1.52906 -1.13198 -.024701 -.011356 -2.80091  .228412
married |  .034221  .687831  .813819  .109228 -.007434  1.54698 -.177842  .250155
  kids6 |  .008237 -.775524 -.647205  .137813 -.012509 -1.08846  -.07206  .097952  .324801
kids714 |  .037922 -.072246  .041389  .091506  .002839  .339282 -.087581  .090165 -.012176  .401129

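The means and covariances in the synthetic datasets are close to those in the original data, which indicates that the synthetic data preserve the main structural features of the original.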
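
Such a comparison can also be summarized in a single number. The sketch below rests on our own assumptions: it uses the auto dataset shipped with Stata instead of the survey data, and a bootstrap resample (bsample) merely stands in for a synthetic dataset. It stores the two covariance matrices and reports their largest relative difference with the matrix function mreldif():

```stata
sysuse auto, clear
quietly correlate price mpg weight, covariance
matrix C_orig = r(C)                   // covariance matrix of the "original"

set seed 10101
bsample                                // stand-in for a synthetic dataset
quietly correlate price mpg weight, covariance
matrix C_fake = r(C)

matrix list C_orig
matrix list C_fake
display "max relative difference: " %6.4f mreldif(C_fake, C_orig)
```

With the files from section 2.1, one would load the original data in place of sysuse auto, replace bsample with use fake_oaxaca_1, clear, and list the ten variables used in the comparison above.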

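3. Conclusion

As shown above, the results from the synthetic data are far from a perfect copy of the results from the original data. This is expected: the procedure creates each synthetic dataset by introducing random error, so that others can attempt to reproduce the paper's work. The replication is not exact, but it recovers the original results approximately, which makes the approach worth trying as a way around the problem that confidential data cannot be released.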

4. References

Blog: How to come public, with private data.
Jann, B. (2008). The Blinder-Oaxaca decomposition for linear regression models. The Stata Journal 8(4): 453-479.
Jenkins, S. P., and Rios-Avila, F. (2021). Measurement error in earnings data: replication of Meijer, Rohwedder, and Wansbeek's mixture model approach to combining survey and register data. Journal of Applied Econometrics 36(4): 474-483. https://doi.org/10.1002/jae.2811

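5. Related posts

Note: the list of posts below was produced with the Stata command:
lianxh
To install the latest version of lianxh:
ssc install lianxh, replace

Topic: Courses

Live session: My Beetle, in-depth paper reading and replication

Topic: Paper writing

连享会: A collection of paper replication websites
Stata, replicating a JPE paper: Capital Deepening and Nonbalanced Economic Growth
Reproducible research: how to make sure your results can be replicated?

Topic: Stata resources

Are the results in accounting journal articles replicable?

Topic: Data processing

Replicating Stata results: the dependencies command, version control for user-written commands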
