Stata: Building Table 1 for Your Paper with table1_mc

连享会, 2023-10-24

Author: 姜昊 (East China Normal University)
Email: HaoJiang0204@outlook.com

Contents

  • 1. The command

  • 2. Worked example

  • 3. Related posts


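1. The command

table1_mc is a user-written command by Mark Chatfield that extends Phil Clayton's table1. It builds the "Table 1" of a paper: a table of descriptive characteristics, usually compared across groups.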

* Install the command
ssc install table1_mc, replace
* Syntax
table1_mc [if] [in] [weight], vars(var_spec) [options]
var_spec = varname vartype [%fmt1 [%fmt2]] [ \ varname vartype [%fmt1 [%fmt2]] \ ...]

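By default, table1_mc reports baseline characteristics for the variables you list. var_spec specifies that set of variables, where:

  • varname: a single variable; separate multiple variables with a backslash \;
  • vartype: the type of the variable. It is required, and omitting it raises an error. There are 7 types:
    • contn: continuous, normally distributed; reports the mean and standard deviation (SD);
    • contln: continuous, log-normally distributed; reports the geometric mean and geometric SD;
    • conts: continuous, neither normally nor log-normally distributed; reports the median and the lower and upper quartiles (IQR);
    • cat: categorical; group differences are tested with Pearson's chi-squared test;
    • cate: categorical; group differences are tested with Fisher's exact test;
    • bin: binary; group differences are tested with Pearson's chi-squared test;
    • bine: binary; group differences are tested with Fisher's exact test;
  • %fmt1: display format for the variable's results, following the syntax of format;
  • %fmt2: display format for the variable's remaining results, following the syntax of format.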
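As a rough illustration of what the three continuous types summarize, the sketch below recomputes the statistics by hand for the domestic cars in auto.dta. It assumes the geometric mean and geometric SD are the exponentiated mean and SD of the logged variable; ln_price is a helper name introduced here, and the quartiles from summarize may not match table1_mc's own calculation in every case.

```stata
sysuse auto, clear

* contn: mean and SD
summarize weight if foreign == 0
display "mean = " %5.0f r(mean) "   SD = " %5.0f r(sd)

* contln: geometric mean and geometric SD, computed on the log scale
* (ln_price is a hypothetical helper variable)
generate double ln_price = ln(price)
summarize ln_price if foreign == 0
display "geometric mean = " %5.0f exp(r(mean)) "   geometric SD = " %4.2f exp(r(sd))

* conts: median with lower and upper quartiles
summarize mpg if foreign == 0, detail
display "median = " r(p50) "   quartiles = " r(p25) "-" r(p75)
```

The formats passed to display play the same role as %fmt1 and %fmt2 in var_spec.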

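The options are:

  • by(varname): the grouping variable. varname must be either a string variable or a numeric variable containing only non-negative integers, with or without value labels;
  • missing: for cat and cate variables, treats missing values as a category of their own;
  • test: adds a column naming the significance test used;
  • statistic: adds a column with the value of the test statistic;
  • percent: for binary and categorical variables, reports only the within-group percentage, in place of n (%);
  • percent_n: for binary and categorical variables, reports the within-group percentage and count in the form % (n);
  • slashN: for binary and categorical variables, reports n/N in place of n;
  • catrowperc: for categorical variables, reports row percentages across groups;
  • pdp(#): sets the number of decimal places shown for p-values;
  • saving(filename [, export_excel_options]): saves the table to an Excel file, with optional export excel options;
  • clear: replaces the dataset in Stata's memory with the table produced by table1_mc.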
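A short sketch of how some of these options might be combined, assuming table1_mc is installed from SSC and that the options behave as described in the list above. The choice of variables and formats is purely illustrative.

```stata
sysuse auto, clear

* by() expects a string variable, or a numeric one holding only non-negative integers.
* This check passes for foreign (0 = Domestic, 1 = Foreign).
assert foreign >= 0 & foreign == floor(foreign) if !missing(foreign)

* percent_n: show categorical results as % (n)
* pdp(2):    two decimal places for p-values
table1_mc, by(foreign) vars(rep78 cat \ mpg conts %5.1f) percent_n pdp(2)
```

Running the assert first gives a clearer error than a failed table1_mc call when a grouping variable has negative or fractional values.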

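2. Worked example

To show what each option does, the examples below use the auto dataset (auto.dta). Cars are grouped by origin (the foreign variable, domestic versus foreign), and the table summarizes weight (treated as normally distributed), price (treated as log-normally distributed), mpg (treated as neither normal nor log-normal), the multi-category variable rep78, and the binary variable much_headroom.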

. sysuse auto, clear
. generate much_headroom = (headroom>=3)
. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
> \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    48    21     4     1 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
  N_ ... #records used below,  m_ ... #records not used

  +--------------------------------------------------------------+
  |                     Domestic        Foreign         p-value  |
  |--------------------------------------------------------------|
  |                     N=52            N=22                     |
  |--------------------------------------------------------------|
  | Weight (lbs.)       3317 (695)      2316 (433)      <0.001   |
  |--------------------------------------------------------------|
  | Price               5534 (×/1.50)   5959 (×/1.44)   0.46     |
  |--------------------------------------------------------------|
  | Mileage (mpg)       19 (17-22)      25 (21-28)      0.002    |
  |--------------------------------------------------------------|
  | Repair record 1978                                  <0.001   |
  |   1                 2 (4%)          0 (0%)                   |
  |   2                 8 (17%)         0 (0%)                   |
  |   3                 27 (56%)        3 (14%)                  |
  |   4                 9 (19%)         9 (43%)                  |
  |   5                 2 (4%)          9 (43%)                  |
  |--------------------------------------------------------------|
  | much_headroom       35 (67%)        8 (36%)         0.014    |
  +--------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) or
median (IQR) for continuous measures, and n (%) for categorical measures.

Adding the missing option makes the missing values of rep78 a category of their own.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
> \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace missing

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    52    22     0     0 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
  N_ ... #records used below,  m_ ... #records not used

  +--------------------------------------------------------------+
  |                     Domestic        Foreign         p-value  |
  |--------------------------------------------------------------|
  |                     N=52            N=22                     |
  |--------------------------------------------------------------|
  | Weight (lbs.)       3317 (695)      2316 (433)      <0.001   |
  |--------------------------------------------------------------|
  | Price               5534 (×/1.50)   5959 (×/1.44)   0.46     |
  |--------------------------------------------------------------|
  | Mileage (mpg)       19 (17-22)      25 (21-28)      0.002    |
  |--------------------------------------------------------------|
  | Repair record 1978                                  <0.001   |
  |   1                 2 (4%)          0 (0%)                   |
  |   2                 8 (15%)         0 (0%)                   |
  |   3                 27 (52%)        3 (14%)                  |
  |   4                 9 (17%)         9 (41%)                  |
  |   5                 2 (4%)          9 (41%)                  |
  |   Missing           4 (8%)          1 (5%)                   |
  |--------------------------------------------------------------|
  | much_headroom       35 (67%)        8 (36%)         0.014    |
  +--------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD)
or median (IQR) for continuous measures, and n (%) for categorical measures.
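The Missing row can be cross-checked with built-in commands. The sketch below only uses count and tabulate on auto.dta and is meant as a sanity check, not as a description of how table1_mc works internally.

```stata
sysuse auto, clear

* Count the missing repair records in each group
count if missing(rep78) & foreign == 0
count if missing(rep78) & foreign == 1

* Cross-tabulate with missing values kept as their own row,
* showing column (within-group) percentages
tabulate rep78 foreign, missing column
```

The two counts should line up with the m_0 and m_1 columns reported for rep78 when the missing option is not used.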

Adding the test option appends a column naming the significance test used for each row.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
> \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace missing test

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    52    22     0     0 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
  N_ ... #records used below,  m_ ... #records not used

  +----------------------------------------------------------------------------------------+
  |                     Domestic        Foreign         Test                      p-value  |
  |----------------------------------------------------------------------------------------|
  |                     N=52            N=22                                               |
  |----------------------------------------------------------------------------------------|
  | Weight (lbs.)       3317 (695)      2316 (433)      Ind. t test               <0.001   |
  |----------------------------------------------------------------------------------------|
  | Price               5534 (×/1.50)   5959 (×/1.44)   Ind. t test, logged data  0.46     |
  |----------------------------------------------------------------------------------------|
  | Mileage (mpg)       19 (17-22)      25 (21-28)      Wilcoxon rank-sum         0.002    |
  |----------------------------------------------------------------------------------------|
  | Repair record 1978                                  Fisher's exact            <0.001   |
  |   1                 2 (4%)          0 (0%)                                             |
  |   2                 8 (15%)         0 (0%)                                             |
  |   3                 27 (52%)        3 (14%)                                            |
  |   4                 9 (17%)         9 (41%)                                            |
  |   5                 2 (4%)          9 (41%)                                            |
  |   Missing           4 (8%)          1 (5%)                                             |
  |----------------------------------------------------------------------------------------|
  | much_headroom       35 (67%)        8 (36%)         Chi-square                0.014    |
  +----------------------------------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) or median (IQR) for
continuous measures, and n (%) for categorical measures.

Adding the statistic option appends a column with the value of each test statistic.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
> \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace missing test statistic

  +--------------------------------------------+
  | factor               N_0   N_1   m_0   m_1 |
  |--------------------------------------------|
  | Weight (lbs.)         52    22     0     0 |
  |--------------------------------------------|
  | Price                 52    22     0     0 |
  |--------------------------------------------|
  | Mileage (mpg)         52    22     0     0 |
  |--------------------------------------------|
  | Repair record 1978    52    22     0     0 |
  |--------------------------------------------|
  | much_headroom         52    22     0     0 |
  +--------------------------------------------+
  N_ ... #records used below,  m_ ... #records not used

  +--------------------------------------------------------------------------------------------------------+
  |                     Domestic        Foreign         Test                      Statistic       p-value  |
  |--------------------------------------------------------------------------------------------------------|
  |                     N=52            N=22                                                               |
  |--------------------------------------------------------------------------------------------------------|
  | Weight (lbs.)       3317 (695)      2316 (433)      Ind. t test               t(72)= 6.25     <0.001   |
  |--------------------------------------------------------------------------------------------------------|
  | Price               5534 (×/1.50)   5959 (×/1.44)   Ind. t test, logged data  t(72)= -0.74    0.46     |
  |--------------------------------------------------------------------------------------------------------|
  | Mileage (mpg)       19 (17-22)      25 (21-28)      Wilcoxon rank-sum         Z= -3.10        0.002    |
  |--------------------------------------------------------------------------------------------------------|
  | Repair record 1978                                  Fisher's exact            N/A             <0.001   |
  |   1                 2 (4%)          0 (0%)                                                             |
  |   2                 8 (15%)         0 (0%)                                                             |
  |   3                 27 (52%)        3 (14%)                                                            |
  |   4                 9 (17%)         9 (41%)                                                            |
  |   5                 2 (4%)          9 (41%)                                                            |
  |   Missing           4 (8%)          1 (5%)                                                             |
  |--------------------------------------------------------------------------------------------------------|
  | much_headroom       35 (67%)        8 (36%)         Chi-square                Chi2(1)= 6.08   0.014    |
  +--------------------------------------------------------------------------------------------------------+
Data are presented as mean (SD) or geometric mean (×/geometric SD) or median (IQR) for continuous
measures, and n (%) for categorical measures.
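The tests named in the Test column correspond to standard Stata commands, so the statistics can be reproduced one at a time. The following is a sketch under the assumption that table1_mc uses the equal-variance t test (the 72 degrees of freedom in the table suggest so); ln_price is a helper name introduced here.

```stata
sysuse auto, clear
generate much_headroom = (headroom >= 3)
generate double ln_price = ln(price)   // hypothetical helper variable

* contn: two-sample t test with equal variances
ttest weight, by(foreign)
display "t(" r(df_t) ") = " %5.2f r(t)

* contln: the same t test on the log scale
ttest ln_price, by(foreign)

* conts: Wilcoxon rank-sum test
ranksum mpg, by(foreign)

* cate: Fisher's exact test, with missing values kept as a category
tabulate rep78 foreign, missing exact

* bin: Pearson chi-squared test
tabulate much_headroom foreign, chi2
display "chi2(1) = " %5.2f r(chi2)
```

Comparing these results with the Statistic and p-value columns above is a quick way to confirm that each variable was assigned the intended vartype.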

The saving() option writes the table to the specified file, and the clear option replaces the data in Stata's memory with the resulting table.

. table1_mc, by(foreign) vars(weight contn %5.0f \ price contln %5.0f %4.2f ///
> \ mpg conts %5.0f \ rep78 cate \ much_headroom bin) onecol nospace ///
> missing test statistic saving("Table 1.xlsx", replace) clear
file Table 1.xlsx saved
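To see what saving() and clear leave behind, one can inspect the data in memory and read the workbook back in. This is a minimal sketch with a shortened variable list, assuming table1_mc is installed and the working directory is writable.

```stata
sysuse auto, clear

* Build a small table, export it, and keep it in memory
table1_mc, by(foreign) vars(weight contn %5.0f \ mpg conts %5.0f) ///
    saving("Table 1.xlsx", replace) clear

* With clear, the table itself is now the dataset in memory
describe
list, noobs clean

* Read the exported workbook back in to check what was written
import excel using "Table 1.xlsx", clear
list, noobs clean
```

Because clear discards the original data, reload the dataset (or wrap the call in preserve and restore) before continuing the analysis.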

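3. Related posts

Note: the list of posts below was generated with the Stata command:
lianxh 统计, m
To install the latest version of lianxh:
ssc install lianxh, replace

  • Topic: Getting started with Stata
    • 25 common misconceptions: p-values, confidence intervals, and statistical power
  • Topic: Stata commands
    • Stata: dstat, a new command for descriptive statistics
  • Topic: Stata resources
    • Sharing an online statistics textbook: online-statistics-book
  • Topic: Data processing
    • Stata data processing: cleaning the China Urban Construction Statistical Yearbook
    • Stata: counting distinct values of a variable with distinct
    • Stata: mtab2, storing a two-way table as a matrix
    • Rolling statistics! Stata data processing
    • Stata data processing: counting distinct values within groups
  • Topic: Stata graphics
    • An overview of common statistical plotting tools for research
  • Topic: Output of results
    • Stata output: addest, customizing the statistics you report
    • Stata output with addest: adding your own statistics
    • The baselinetable command: exporting a table of basic statistics to Excel and Word
    • sumup: quick grouped summary statistics
    • Stata: Table 1 for your paper in one post, the basic summary statistics table
  • Topic: Regression analysis
    • Abandon the p-value? Economic versus statistical significance
  • Topic: Machine learning
    • Lasso: how to do statistical inference with the lasso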
