Stata: Synthetic Control Method, Usage and Applications
* Advanced Econometrics
* 计量经济学服务中心 (Econometrics Service Center)
* ===================================
* Spatial Econometrics and Stata Applications
* ===================================
clear all
ssc install synth, replace all
sysuse smoking
tsset state year
synth cigsale beer(1984(1)1988) lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975), trunit(3) trperiod(1989)
synth cigsale beer lnincome(1980&1985) retprice cigsale(1988) cigsale(1980) cigsale(1975), trunit(3) trperiod(1989) fig
synth cigsale retprice cigsale(1970) cigsale(1979) , trunit(33) counit(1(1)20) trperiod(1980) fig resultsperiod(1970(1)1990)
synth cigsale retprice cigsale(1970) cigsale(1979) , trunit(33) counit(1(1)20) trperiod(1980) resultsperiod(1970(1)1990) keep(resout)
synth cigsale beer lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975) , trunit(3) trperiod(1989) xperiod(1980(1)1988) nested
Example 1 - Construct synthetic control group:
synth cigsale beer(1984(1)1988) lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989)
In this example, the unit affected by the intervention is unit number 3 (California), and the
intervention occurs in 1989. Since no counit() is specified, the donor pool defaults to control units
1,2,4,5,...,39 (i.e., the other 38 states in the dataset). Since no xperiod() is provided, the
predictor variables without a variable-specific time period (retprice, lnincome, and age15to24) are
averaged over the entire pre-intervention period (1970,1971,...,1988). The beer variable has the time
period (1984(1)1988) specified, meaning that it is averaged over the periods 1984,1985,...,1988. The
variable cigsale is used three times as a predictor, with its values from 1988, 1980, and 1975
respectively. The MSPE is minimized over the entire pretreatment period because mspeperiod() is not
provided. By default, results are displayed for the period 1970,1971,...,2000 (the earliest and latest
years in the dataset).
Example 2 - Construct synthetic control group:
synth cigsale beer lnincome(1980&1985) retprice cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) fig
This example is similar to example 1, but now beer is averaged over the entire pretreatment period while
lnincome is only averaged over the periods 1980 and 1985. Since no data is available for beer prior to
1984, synth will inform the user that there is missing data for this variable and that the missing
values are ignored in the averaging. A results figure is also requested using the fig option.
Example 3 - Construct synthetic control group:
synth cigsale retprice cigsale(1970) cigsale(1979) , trunit(33) counit(1(1)20) trperiod(1980) fig ///
    resultsperiod(1970(1)1990)
In this example, the unit affected by the intervention is state number 33, and the donor pool of
potential control units is restricted to states number 1,2,...,20. The intervention occurs in 1980, and
because resultsperiod(1970(1)1990) is specified, results are obtained for the 1970,1971,...,1990 period.
Example 4 - Construct synthetic control group:
synth cigsale retprice cigsale(1970) cigsale(1979) , trunit(33) counit(1(1)20) trperiod(1980) ///
    resultsperiod(1970(1)1990) keep(resout)
This example is similar to example 3, but keep(resout) is specified, so synth saves a dataset named
resout.dta in the current Stata working directory (type pwd to see the path of your working
directory). This dataset contains the results from the current fit and can be used for further
processing. In addition, synth routinely returns all result matrices; they can be displayed by typing
ereturn list after synth has terminated.
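As one possible form of the "further processing" mentioned above, the sketch below loads the saved file
and plots the gap between the treated unit and its synthetic control. It assumes that Example 4 has
already been run in the current working directory and that the saved variables are named _Y_treated,
_Y_synthetic, and _time, as described in the synth help file. The variable name gap and the axis titles
are introduced here for illustration only.

```stata
* Assumes resout.dta was written by keep(resout) in Example 4
preserve
use resout, clear
* gap is a hypothetical name: treated outcome minus synthetic outcome
generate gap = _Y_treated - _Y_synthetic
twoway line gap _time, yline(0) xline(1980) ///
    xtitle("Year") ytitle("Gap in cigsale (treated - synthetic)")
restore
```

The preserve/restore pair puts the smoking data back in memory afterwards, so the remaining examples
can be run without reloading it.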
Example 5 - Construct synthetic control group:
synth cigsale beer lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975) , ///
    trunit(3) trperiod(1989) xperiod(1980(1)1988) nested
This example uses the same treated unit and intervention year as examples 1 and 2, but the nested
option is specified, which typically produces a better fit at the expense of additional computing
time. Alternatively, the user can also specify the allopt option together with nested, which can
improve the fit even further and requires yet more computing time. In addition, xperiod() is specified,
indicating that predictors without a variable-specific time period are averaged over the
1980,1981,...,1988 period.
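For comparison, a minimal sketch of the allopt variant mentioned above: the same specification as
Example 5, with allopt added alongside nested. According to the synth help file, allopt is only
available when nested is specified. Expect a noticeably longer run time.

```stata
* Example 5 with allopt added; allopt is used together with nested
synth cigsale beer lnincome retprice age15to24 cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) xperiod(1980(1)1988) nested allopt
```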
Example 6 - Run placebo in space:
tempname resmat
forvalues i = 1/4 {
    synth cigsale retprice cigsale(1988) cigsale(1980) cigsale(1975) , trunit(`i') trperiod(1989)
    matrix `resmat' = nullmat(`resmat') \ e(RMSPE)
    local names `"`names' `"`i'"'"'
}
mat colnames `resmat' = "RMSPE"
mat rownames `resmat' = `names'
matlist `resmat' , row("Treated Unit")
This is a code example for running placebo studies by iteratively reassigning the intervention in
space to the first four states. To do so, we simply run a loop of four iterations in which the
trunit() setting is incremented in each iteration. Thus, in the first run of synth, state number one is
assigned to the intervention, in the second run state number two, and so on. In each run we store the
RMSPE, and we display the stored values in a matrix at the end.
# Propensity score matching
ssc install psmatch2
findit psmatch2
help psmatch2
help nnmatch
help psmatch
help pscore
findit propensity score
findit matching
ssc install psmatch2, replace
which psmatch2
use "ldw_exper.dta", clear
tabulate t, summarize(re78) means standard
set seed 20180105 // set the random-number seed
gen u=runiform()
sort u // sort the observations into random order
order u
local v1 "t"
local v2 "age edu black hisp married re74 re75 u74 u75"
global x "`v1' `v2' "
psmatch2 $x, out(re78) neighbor(1) ate ties logit common // 1:1 matching
psmatch2 t age edu black hisp married re74 re75 u74 u75, out(re78) neighbor(1) ate ties logit common
pstest `v2', both graph
psmatch2 $x, out(re78) n(4) ate ties logit common
sum _pscore
psmatch2 $x, out(re78) n(4) cal(0.01) ate ties logit common
psmatch2 $x, out(re78) kernel ate ties logit common quietly
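# Regression discontinuity
* The paths below are machine-specific; adjust them to your own setup
adopath + "E:\stata\plus2"
adopath + "E:\stata\plus"
cd "E:\stata\data"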
use rdrobust_senate.dta, clear
* binscatter
binscatter vote margin,rd(0) n(50) linetype(lfit) ///
xtitle("Vote Share in Election at time t") ///
ytitle("Vote Share in Election at time t+2")
binscatter vote margin if margin>-50 & margin<50, ///
rd(0) n(50) linetype(lfit) ///
xtitle("Vote Share in Election at time t") ///
ytitle("Vote Share in Election at time t+2")
cmogram vote margin if margin>-50 & margin<50, ///
scatter cut(0) lineat(0) lfit ci(95) histopts(bin(25))
* twoway graph
tw (scatter vote margin if margin>-50 & margin<50) ///
(lfit vote margin if margin>-50 & margin<0) ///
(lfit vote margin if margin>=0 & margin<50, ///
xline(0,lc(red)) legend(off) ///
xtitle("Vote Share in Election at time t") ///
ytitle("Vote Share in Election at time t+2"))
rdplot vote margin, c(0) p(4) binselect(es) ci(95) ///
    graph_options(title("RD Plot: U.S. Senate Election Data") ///
    ytitle(Vote Share in Election at time t+2) ///
    xtitle(Vote Share in Election at time t))
rdplot vote margin if margin>-50 & margin<50, c(0) p(2) binselect(es) ci(95) ///
    graph_options(title("RD Plot: U.S. Senate Election Data") ///
    ytitle(Vote Share in Election at time t+2) ///
    xtitle(Vote Share in Election at time t))
DCdensity margin,breakpoint(0) generate(Xj Yj r0 fhat se_fhat)
cmogram population margin if margin>-50 & margin<50, ///
scatter cut(0) lineat(0) qfit ci(95) histopts(bin(20))
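g D = margin>0 if !missing(margin) // treatment status; guard keeps missing margin out of D
g marginsq = margin*margin
g D_margin = D*margin
* Global regressions: linear, quadratic, and linear with separate slopes on each side of the cutoff
reg vote D margin
reg vote D margin marginsq
reg vote D margin D_margin
* Local polynomial RD estimate with data-driven bandwidth
rdrobust vote margin
* Local linear regressions within a fixed window of 17 points around the cutoff
reg vote D margin if margin>-17 & margin<17
reg vote D margin D_margin if margin>-17 & margin<17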