The report package: automated statistical reports for your analyses!
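Introduction
Making plots and fitting models can be quite enjoyable, but writing up the statistical results is tedious, and it is easy to copy a number incorrectly along the way. It would be great if someone could take care of that part for us!
Here is a little robot that can help: the report package [1]. It takes over this grunt work, leaving us humans more time for "goofing off".
Installation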
# install.packages("remotes")
# remotes::install_github("easystats/report")
# install.packages("palmerpenguins")
library("report")
library("palmerpenguins") # for data
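Examples
1. Descriptive statistics
The examples below use the "penguins" dataset from the palmerpenguins package. We first drop the rows with missing values and save the result as mydata:
# Drop rows with missing values
mydata <- na.omit(penguins)
# Descriptive statistics
report(mydata)
# The output is shown below (the same applies to the later examples)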
The data contains 333 observations of the following 8 variables:
- species: 3 levels, namely Adelie (n = 146, 43.84%), Chinstrap (n = 68, 20.42%) and Gentoo (n = 119, 35.74%)
- island: 3 levels, namely Biscoe (n = 163, 48.95%), Dream (n = 123, 36.94%) and Torgersen (n = 47, 14.11%)
- bill_length_mm: n = 333, Mean = 43.99, SD = 5.47, Median = 44.50, MAD = 6.97, range: [32.10, 59.60], Skewness = 0.05, Kurtosis = -0.88, 0% missing
- bill_depth_mm: n = 333, Mean = 17.16, SD = 1.97, Median = 17.30, MAD = 2.22, range: [13.10, 21.50], Skewness = -0.15, Kurtosis = -0.89, 0% missing
- flipper_length_mm: n = 333, Mean = 200.97, SD = 14.02, Median = 197.00, MAD = 16.31, range: [172, 231], Skewness = 0.36, Kurtosis = -0.96, 0% missing
- body_mass_g: n = 333, Mean = 4207.06, SD = 805.22, Median = 4050.00, MAD = 889.56, range: [2700, 6300], Skewness = 0.47, Kurtosis = -0.73, 0% missing
- sex: 2 levels, namely female (n = 165, 49.55%) and male (n = 168, 50.45%)
- year: n = 333, Mean = 2008.04, SD = 0.81, Median = 2008.00, MAD = 1.48, range: [2007, 2009], Skewness = -0.08, Kurtosis = -1.48, 0% missing
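For categorical variables, the report gives the sample size of each level and the corresponding percentage.
For continuous variables, it reports n (sample size), Mean, SD (standard deviation), Median, MAD (median absolute deviation, a measure of dispersion), range, Skewness, Kurtosis, and the percentage of missing values.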
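To make a few of these summaries concrete, here is a minimal base-R sketch on a made-up vector `x` (hypothetical data, not the penguins dataset). The moment-based skewness and excess-kurtosis formulas are one common definition and are an assumption here; report may use a variant with small-sample corrections, so its values can differ slightly. The negative kurtosis values in the output above suggest that excess kurtosis (normal distribution = 0) is what gets printed.

```r
# Made-up example data (hypothetical, not from the penguins dataset)
x <- c(2, 4, 4, 5, 7, 9, 30)

# Raw median absolute deviation: median of |x - median(x)|
raw_mad <- median(abs(x - median(x)))

# stats::mad() rescales by 1.4826 by default, which makes it comparable
# to the standard deviation for normally distributed data
scaled_mad <- mad(x)

# Moment-based skewness and excess kurtosis (one common definition)
central_moment <- function(v, k) mean((v - mean(v))^k)
skewness <- function(v) central_moment(v, 3) / central_moment(v, 2)^1.5
excess_kurtosis <- function(v) central_moment(v, 4) / central_moment(v, 2)^2 - 3

raw_mad
scaled_mad
skewness(x)
excess_kurtosis(x)
```

The single large value (30) barely moves the MAD but pulls the skewness well above zero, which is why the MAD is a useful companion to the SD.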
2. Correlation analysis
cor_test <- cor.test(mydata$bill_length_mm, mydata$body_mass_g)
report(cor_test)
# Effect sizes were labelled following Funder's (2019) recommendations.
# The Pearson's product-moment correlation between mydata$bill_length_mm and mydata$body_mass_g is positive, statistically significant, and very large (r = 0.59, 95% CI [0.51, 0.66], t(331) = 13.28, p < .001)
The output even notes that the effect size was labelled following Funder (2019). You can copy the text straight into your write-up.
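If you want to sanity-check the numbers in such a sentence, the t statistic and confidence interval of a Pearson correlation can be recomputed by hand. The sketch below uses the built-in mtcars data purely for illustration (an assumption, so that it runs without extra packages). Plugging the rounded values from the output above (r = 0.59, n = 333) into the same t formula gives roughly 13.3 with 331 degrees of freedom, in line with the reported t(331) = 13.28.

```r
# Built-in mtcars data, used here only for illustration
x <- mtcars$mpg
y <- mtcars$wt
ct <- cor.test(x, y)

r <- unname(ct$estimate)
n <- length(x)

# t statistic of a Pearson correlation, with n - 2 degrees of freedom
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)

# 95% confidence interval via the Fisher z-transformation
z <- atanh(r)
half_width <- qnorm(0.975) / sqrt(n - 3)
ci_manual <- tanh(z + c(-1, 1) * half_width)

c(t = t_manual, df = n - 2)
ci_manual
```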
3. t-test
Next, a report for a t-test:
t_test <- t.test(mydata$bill_length_mm ~ mydata$sex)
report(t_test)
# Effect sizes were labelled following Cohen's (1988) recommendations.
# The Welch Two Sample t-test testing the difference of mydata$bill_length_mm by mydata$sex (mean in group female = 42.10, mean in group male = 45.85) suggests that the effect is positive, statistically significant, and medium (difference = 3.76, 95% CI [-4.87, -2.65], t(329.29) = -6.67, p < .001; Cohen's d = -0.73, 95% CI [-0.95, -0.51])
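A note on the signs: with the formula interface, t.test() subtracts the mean of the second factor level from the mean of the first (female minus male here), which is why the t statistic, the confidence interval and Cohen's d are all negative. As a rough sketch of where a Cohen's d comes from, the base-R code below computes it with a pooled standard deviation on the built-in mtcars data. The pooled-SD definition is an assumption; report may compute d differently for a Welch test, so do not expect the sketch to reproduce its value exactly.

```r
# Two groups from the built-in mtcars data (illustration only)
a <- mtcars$mpg[mtcars$am == 0]
b <- mtcars$mpg[mtcars$am == 1]
n_a <- length(a)
n_b <- length(b)

# Pooled standard deviation of the two groups
pooled_sd <- sqrt(((n_a - 1) * var(a) + (n_b - 1) * var(b)) / (n_a + n_b - 2))

# Cohen's d: standardized mean difference (first group minus second)
d <- (mean(a) - mean(b)) / pooled_sd

# Welch test (R's default) and the classical equal-variance test
welch <- t.test(mpg ~ am, data = mtcars)
student <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

d
welch$statistic
```

With equal variances assumed, d and t are linked by t = d / sqrt(1/n_a + 1/n_b), which is a quick way to check one against the other.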
4. ANOVA
anova_test <- aov(bill_length_mm ~ species + flipper_length_mm, data = mydata)
report(anova_test)
# The ANOVA (formula: bill_length_mm ~ species + flipper_length_mm) suggests that:
# - The main effect of species is statistically significant and large (F(2, 329) = 519.86, p < .001; Eta2 (partial) = 0.76, 90% CI [0.73, 0.79])
# - The main effect of flipper_length_mm is statistically significant and large (F(1, 329) = 102.80, p < .001; Eta2 (partial) = 0.24, 90% CI [0.18, 0.30])
# Effect sizes were labelled following Field's (2013) recommendations.
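The partial eta squared values can be cross-checked from the F statistics alone: F * df1 / (F * df1 + df2). For species this gives 519.86 * 2 / (519.86 * 2 + 329), about 0.76, and for flipper_length_mm 102.80 / (102.80 + 329), about 0.24, matching the output above. The sketch below shows the same relation on the built-in iris data (chosen only for illustration), using the sequential sums of squares that aov() prints; how report handles other sum-of-squares types is not covered here.

```r
# Illustration on built-in data: one factor plus one numeric covariate
fit <- aov(Sepal.Length ~ Species + Petal.Width, data = iris)
tab <- summary(fit)[[1]]

k <- nrow(tab)           # the residual row comes last
ss <- tab[["Sum Sq"]]
df <- tab[["Df"]]
f <- tab[["F value"]]

# Partial eta squared from sums of squares: SS_effect / (SS_effect + SS_residual)
eta_from_ss <- ss[-k] / (ss[-k] + ss[k])

# The same quantity recovered from F and the degrees of freedom
eta_from_f <- f[-k] * df[-k] / (f[-k] * df[-k] + df[k])

round(eta_from_ss, 3)
round(eta_from_f, 3)
```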
5. Linear regression
A report for a linear regression:
linear_model <- lm(bill_length_mm ~ sex + flipper_length_mm, data = mydata)
report(linear_model)
# We fitted a linear model (estimated using OLS) to predict bill_length_mm with sex and flipper_length_mm (formula: bill_length_mm ~ sex + flipper_length_mm). The model explains a statistically significant and substantial proportion of variance (R2 = 0.46, F(2, 330) = 140.67, p < .001, adj. R2 = 0.46). The model's intercept, corresponding to sex = female and flipper_length_mm = 0, is at -4.47 (95% CI [-10.83, 1.90], t(330) = -1.38, p = 0.168). Within this model:
# - The effect of sex [male] is statistically significant and positive (beta = 2.07, 95% CI [1.17, 2.97], t(330) = 4.54, p < .001; Std. beta = 0.38, 95% CI [0.21, 0.54])
# - The effect of flipper_length_mm is statistically significant and positive (beta = 0.24, 95% CI [0.20, 0.27], t(330) = 14.46, p < .001; Std. beta = 0.60, 95% CI [0.52, 0.69])
# Standardized parameters were obtained by fitting the model on a standardized version of the dataset.
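The last line of the output says the standardized parameters come from refitting the model on a standardized dataset. A minimal base-R sketch of that idea, on the built-in mtcars data with numeric predictors only (an illustrative assumption, not the package's internal code): z-score every variable, refit, and compare with the shortcut beta * sd(x) / sd(y). For a factor predictor such as sex, only the outcome is rescaled, which is consistent with the output above: 2.07 / 5.47 is about 0.38.

```r
# Ordinary fit on the raw data (built-in mtcars, illustration only)
vars <- c("mpg", "wt", "hp")
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Refit after z-scoring the outcome and the numeric predictors
z_data <- as.data.frame(scale(mtcars[, vars]))
fit_z <- lm(mpg ~ wt + hp, data = z_data)
std_refit <- unname(coef(fit_z)[c("wt", "hp")])

# Shortcut for numeric predictors: beta * sd(x) / sd(y)
std_manual <- unname(
  coef(fit)[c("wt", "hp")] * sapply(mtcars[c("wt", "hp")], sd) / sd(mtcars$mpg)
)

rbind(refit = std_refit, manual = std_manual)
```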
6. Logistic regression
# This example uses the built-in mtcars dataset rather than mydata
logit_model <- glm(vs ~ mpg + cyl, data = mtcars, family = "binomial")
report(logit_model)
# We fitted a logistic model (estimated using ML) to predict vs with mpg and cyl (formula: vs ~ mpg + cyl). The model's explanatory power is substantial (Tjur's R2 = 0.67). The model's intercept, corresponding to mpg = 0 and cyl = 0, is at 15.97 (95% CI [-2.71, 44.69], p = 0.147). Within this model:
# - The effect of mpg is statistically non-significant and negative (beta = -0.16, 95% CI [-0.71, 0.34], p = 0.496; Std. beta = -0.98, 95% CI [-4.28, 2.03])
# - The effect of cyl is statistically significant and negative (beta = -2.15, 95% CI [-5.19, -0.54], p < .05; Std. beta = -3.84, 95% CI [-9.26, -0.97])
# Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using
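Two quantities in this output are easy to recompute in base R. Tjur's R2 is the difference between the mean predicted probability among the observed 1s and among the observed 0s, and the beta coefficients are on the log-odds scale, so exponentiating them gives odds ratios. The sketch below refits the same model; it is a hand calculation for illustration and not the code report runs internally.

```r
# Same model as above, on the built-in mtcars data
logit_model <- glm(vs ~ mpg + cyl, data = mtcars, family = "binomial")

# Predicted probabilities of vs = 1 for every car
p_hat <- fitted(logit_model)

# Tjur's R2: mean predicted probability for vs = 1 minus that for vs = 0
tjur_r2 <- mean(p_hat[mtcars$vs == 1]) - mean(p_hat[mtcars$vs == 0])

# Coefficients are log-odds; exponentiate to get odds ratios
odds_ratios <- exp(coef(logit_model))

round(tjur_r2, 2)
round(odds_ratios, 3)
```

An odds ratio below 1 for cyl means that each additional cylinder lowers the odds of vs = 1, which matches the negative sign of its beta.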
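A note from the editor
Happy May Day, everyone! In May I will be visiting my advisor in Hangzhou to do research, probably for about a month. The account will keep running, though: I will keep learning and keep sharing. I wish you all smooth research and good health this May, and:
May everyone you meet be kind; may all you imagine come true; may all you ask for be granted; may all you hope for arrive.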
References
[1] report: https://easystats.github.io/report/articles/report.html