
What Is a Bad Control Variable, and What Is a Good One?

Causal Inference Research Group, Econometrics Circle (计量经济圈), 2021-10-23



Written by @Causal Inference Research Group


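Causal identification gets more and more attention these days, which makes the so-called conditional independence assumption (CIA) especially important (the CIA itself can be discussed further in the group). Today our group takes up the choice of control variables: what makes a control variable bad, and what makes one good? Bad controls are best left out of the regression model, because including them introduces the familiar problem of selection bias.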


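So what exactly is a bad control? It is a variable that is itself affected by the explanatory variable of interest. In other words, it is not predetermined: its value was not already fixed before the explanatory variable took its value. Take the effect of education on earnings. Should we control for occupation? Controlling for occupation means studying the effect of education on earnings within a given occupation. Does that introduce selection bias? It does, because education affects both a person's occupation and their earnings. Occupation is not predetermined relative to education, so we classify it as a bad control.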
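The education, occupation, and earnings example can be checked with a small simulation. What follows is only a sketch under assumed numbers, none of which come from the article: a constant true effect of college on wages of 1.0, an unobserved `ability` term, a randomly assigned `college` indicator, and a rule under which a degree lowers the ability bar for white-collar work. The helper names `ols` and `simulate_bad_control` are hypothetical.

```python
import numpy as np


def ols(y, *regressors):
    """Return OLS coefficients for y on a constant plus the given regressors."""
    X = np.column_stack([np.ones_like(y, dtype=float), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate_bad_control(n=200_000, seed=0):
    """Simulate wages where occupation is an outcome of college (a bad control).

    All parameter values are illustrative assumptions, not estimates.
    """
    rng = np.random.default_rng(seed)

    ability = rng.normal(size=n)          # unobserved by the analyst
    college = rng.integers(0, 2, size=n)  # randomly assigned, 0 or 1

    # Assumed rule: a degree lowers the ability bar for white-collar jobs,
    # so occupation is determined AFTER, and partly BY, college.
    threshold = np.where(college == 1, -0.5, 0.5)
    white_collar = (ability > threshold).astype(float)

    # Assumed wage equation: the true causal effect of college is 1.0.
    noise = rng.normal(scale=0.5, size=n)
    wage = 10.0 + 1.0 * college + 2.0 * ability + noise

    # Short regression: wage on college only.
    short_coef = ols(wage, college)[1]
    # Long regression: adds the bad control (occupation).
    long_coef = ols(wage, college, white_collar)[1]

    # Direct within-occupation comparison among white-collar workers.
    wc = white_collar == 1
    within = wage[wc & (college == 1)].mean() - wage[wc & (college == 0)].mean()

    return {
        "true_effect": 1.0,
        "short_regression": short_coef,
        "long_regression": long_coef,
        "within_white_collar": within,
    }


results = simulate_bad_control()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the short regression should land close to the true effect of 1.0, because college is randomly assigned. The within-occupation comparison and the regression that controls for occupation should both come out well below it, and in this particular setup they even turn negative: non-graduates who reach white-collar jobs needed higher ability to get there, so the comparison within an occupation mixes the effect of college with a difference in ability.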


The passage below, from a chapter of Mostly Harmless Econometrics: An Empiricist's Companion, makes the same argument in the authors' own words.


We have made the point that control for covariates can make the conditional independence assumption more plausible. But more control is not always better. Some variables are bad controls and should not be included in a regression model even when their inclusion might be expected to change the short regression coefficients. Bad controls are variables that are themselves outcome variables in the notional experiment at hand. That is, bad controls might just as well be dependent variables too. Good controls are variables that we can think of as having been fixed at the time the regressor of interest was determined.


The essence of the bad control problem is a version of selection bias, albeit somewhat more subtle than the selection bias. To illustrate, suppose we are interested in the effects of a college degree on earnings and that people can work in one of two occupations, white collar and blue collar. A college degree clearly opens the door to higher-paying white collar jobs. Should occupation therefore be seen as an omitted variable in a regression of wages on schooling? After all, occupation is highly correlated with both education and pay. Perhaps it's best to look at the effect of college on wages for those within an occupation, say white collar only. The problem with this argument is that once we acknowledge the fact that college affects occupation, comparisons of wages by college degree status within an occupation are no longer apples-to-apples, even if college degree completion is randomly assigned.
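The reason the within-occupation comparison fails can be written out in potential-outcomes notation. The symbols here are introduced for illustration and do not appear in the excerpt above: $C_i$ is the college indicator, $Y_{1i}, Y_{0i}$ are potential wages with and without a degree, and $W_{1i}, W_{0i}$ are potential white-collar indicators with and without a degree. Assuming $C_i$ is randomly assigned, and so independent of all four potential outcomes, a sketch of the decomposition is:

```latex
\[
\begin{aligned}
&E[Y_i \mid W_i = 1, C_i = 1] - E[Y_i \mid W_i = 1, C_i = 0] \\
&\quad = E[Y_{1i} \mid W_{1i} = 1, C_i = 1] - E[Y_{0i} \mid W_{0i} = 1, C_i = 0] \\
&\quad = E[Y_{1i} \mid W_{1i} = 1] - E[Y_{0i} \mid W_{0i} = 1] \\
&\quad = \underbrace{E[Y_{1i} - Y_{0i} \mid W_{1i} = 1]}_{\text{causal effect for those white collar with a degree}}
  + \underbrace{E[Y_{0i} \mid W_{1i} = 1] - E[Y_{0i} \mid W_{0i} = 1]}_{\text{selection bias}}
\end{aligned}
\]
```

The second term compares the no-degree wages of two different groups: people who would be white collar with a degree and people who would be white collar without one. Unless college has no effect on occupation, these groups differ, so the term is generally nonzero even under random assignment.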


[...] be outcomes in the causal nexus. In many cases, however, the timing is uncertain or unknown. In such cases, clear reasoning about causal channels requires explicit assumptions about what happened first, or the assertion that none of the control variables are themselves caused by the regressor of interest.
