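Cointegration: A Drunk Walking a Dog
Back in graduate school, I heard Professor 周雨田 quote this line, which captures the essence of cointegration:
Cointegration is a drunk walking a dog (A Drunk with his Dog).
Today a student asked me about cointegration again, which reminded me of the line. I looked it up, and it turns out to come from a formally published paper: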
Murray, Michael P. (1994). "A Drunk and Her Dog: An Illustration of Cointegration and Error Correction". The American Statistician. 48 (1): 37–39. doi:10.1080/00031305.1994.10476017.
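Intuition
Two drunks who are strangers. Suppose you see two drunks staggering down the road (in econometric terms, two random walks). They do not know each other (that is, the two series are independent). In this case their paths have no relationship whatsoever.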
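As a rough illustration of the two-strangers case, the sketch below simulates many pairs of independent random walks with NumPy and tracks the gap between the two walkers. The standard-normal steps, the seed, the sample sizes, and the helper name random_walk are all assumptions made for this example; they are not part of the original analogy. Under these assumptions the variance of the gap keeps growing with time, which is one way to see that nothing ties the two paths together.

```python
import numpy as np

def random_walk(rng, n_steps, n_paths):
    """Cumulative sums of i.i.d. standard-normal steps, shape (n_paths, n_steps)."""
    steps = rng.standard_normal((n_paths, n_steps))
    return steps.cumsum(axis=1)

rng = np.random.default_rng(0)       # arbitrary seed, for reproducibility only
n_steps, n_paths = 400, 2000         # assumed sizes for the illustration

drunk_a = random_walk(rng, n_steps, n_paths)
drunk_b = random_walk(rng, n_steps, n_paths)   # drawn independently of drunk_a

# Distance between the two strangers at every step, for every simulated pair.
gap = drunk_a - drunk_b

# Variance of the gap across pairs, early (step 50) and late (step 400).
var_early = gap[:, 49].var()
var_late = gap[:, -1].var()

print(f"gap variance after 50 steps:  {var_early:.1f}")
print(f"gap variance after 400 steps: {var_late:.1f}")
```

With unit-variance steps the theoretical gap variance after t steps is 2t, so the late value should be roughly eight times the early one: the two strangers drift apart without limit.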
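A drunk walking his dog. Viewed separately, the path of the drunk and the path of the dog each look patternless (each is a random walk). Viewed together, two important features appear: (1) the distance between them is constrained, staying between 0 and the length of the leash, which is the long-run relationship; (2) in the short run, the question is who is pulling whom: is the drunk dragging the dog, or is the dog dragging the drunk?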
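The leash idea can be sketched in the same way. In the example below the drunk follows a random walk and the dog's position is the drunk's position plus a stationary "leash" term. Modelling the leash as an AR(1) process with persistence 0.5 is a simplifying assumption (a real leash is bounded by its length, whereas an AR(1) term is only mean-reverting), and the helper ar1_coefficient is a hypothetical name introduced here. The sketch regresses the dog on the drunk and compares how persistent each series is with how persistent the regression residual is.

```python
import numpy as np

def ar1_coefficient(z):
    """Slope from regressing z_t on z_{t-1} after demeaning.

    Values close to 1 are typical of a random walk; values well
    below 1 are typical of a mean-reverting (stationary) series.
    """
    z = z - z.mean()
    return (z[1:] @ z[:-1]) / (z[:-1] @ z[:-1])

rng = np.random.default_rng(1)       # arbitrary seed
n_steps = 2000

# The drunk: a plain random walk.
drunk = rng.standard_normal(n_steps).cumsum()

# The leash: a stationary AR(1) term (persistence 0.5 is an assumption).
leash = np.zeros(n_steps)
shocks = rng.standard_normal(n_steps)
for t in range(1, n_steps):
    leash[t] = 0.5 * leash[t - 1] + shocks[t]

# The dog wanders, but only as far as the leash allows.
dog = drunk + leash

# Long-run relationship: OLS of dog on drunk (np.polyfit returns slope first).
slope, intercept = np.polyfit(drunk, dog, 1)
residual = dog - (intercept + slope * drunk)

print(f"estimated slope:           {slope:.3f}")
print(f"AR(1) coef. of drunk:      {ar1_coefficient(drunk):.3f}")
print(f"AR(1) coef. of dog:        {ar1_coefficient(dog):.3f}")
print(f"AR(1) coef. of residual:   {ar1_coefficient(residual):.3f}")
```

Each series on its own is highly persistent, while the residual is clearly mean-reverting. This is only an informal check: a proper cointegration test on the residual needs its own critical values, as in the Engle-Granger procedure, for which statsmodels offers statsmodels.tsa.stattools.coint.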
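An extension: housing prices in Guangzhou and Beijing. Viewed separately, housing prices in Guangzhou or in Beijing show no obvious pattern. Viewed together, the two turn out to be inseparable, moving in tandem over the long haul. Their long-run relationship is determined by China's land policy and its fiscal and tax policy. The short-run relationship calls for analysis from two angles:
First, do Guangzhou prices hold back Beijing prices? Some migrants to Beijing, unable to bear the capital's high housing prices any longer, move to Guangzhou instead. This nudges Guangzhou prices up and indirectly slows the rise in Beijing prices.
Second, or do Guangzhou prices follow Beijing prices? Investors see prices in the capital soaring but, because of various restrictions, cannot get in on the action, so they buy property in Guangzhou instead and push Guangzhou prices up.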
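The question of who pulls whom can be made concrete with a toy error-correction setup. The sketch below uses simulated data, not actual housing prices. It assumes the long-run relationship is simply "the two series stay close" (a gap of y minus x), and it lets each series react to last period's gap with its own adjustment speed. The names simulate_pair, adjustment_speed, c_true, and d_true are hypothetical, and the values 0.0 and 0.3 are chosen so that only the second series does the adjusting (the dog follows the drunk, or Guangzhou follows Beijing in the second story above).

```python
import numpy as np

def simulate_pair(rng, n_steps, c, d):
    """Two series tied by error correction.

    x moves toward y at speed c, y moves toward x at speed d;
    both also receive independent standard-normal shocks.
    """
    x = np.zeros(n_steps)
    y = np.zeros(n_steps)
    for t in range(1, n_steps):
        gap = y[t - 1] - x[t - 1]
        x[t] = x[t - 1] + c * gap + rng.standard_normal()
        y[t] = y[t - 1] - d * gap + rng.standard_normal()
    return x, y

def adjustment_speed(series, other):
    """OLS slope of the change in `series` on last period's (other - series)."""
    delta = np.diff(series)
    lagged_gap = (other - series)[:-1]
    slope, _intercept = np.polyfit(lagged_gap, delta, 1)
    return slope

rng = np.random.default_rng(2)       # arbitrary seed
c_true, d_true = 0.0, 0.3            # assumed: x ignores the gap, y closes it

drunk, dog = simulate_pair(rng, 5000, c_true, d_true)

c_hat = adjustment_speed(drunk, dog)   # how strongly the drunk chases the dog
d_hat = adjustment_speed(dog, drunk)   # how strongly the dog chases the drunk

print(f"estimated adjustment of drunk toward dog: {c_hat:.3f}")
print(f"estimated adjustment of dog toward drunk: {d_hat:.3f}")
```

An adjustment speed near zero for one series and clearly positive for the other suggests that the second series is the one being pulled. With real data the long-run relationship would have to be estimated rather than assumed, and lagged changes would usually be added to the regressions.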
Appendix: A Brief History of Cointegration
The first to introduce and analyse the concept of spurious (or nonsense) correlations was Udny Yule in 1926. Before the 1980s, many economists used linear regressions on (de-trended) non-stationary time series data, which Nobel laureate Clive Granger and Paul Newbold showed to be a dangerous approach that could produce spurious correlation, since standard detrending techniques can result in data that are still non-stationary. Granger's 1987 paper with Robert Engle formalized the cointegrating vector approach and coined the term.
References:
What is cointegration of time series data in statistics?
Wikipedia-Cointegration
The Stata Blog » Cointegration or spurious regression?
Further reading:
Panel-data cointegration tests | New in Stata 15
The Stata Blog » time series
The Stata Blog » nonstationary
The Stata Blog » Unit-root tests in Stata
Stata example: Unit Roots and Cointegration - Econometrics at Illinois
Cointegration – Johansen Test with Stata
Finally, here is a great photo. (Can you guess who is in the middle?)