Python Pingouin: Handle All Kinds of Hypothesis Tests and Statistical Models!
Original author: Giannis Tolios
Source: 我得学城
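1. Basic steps of hypothesis testing
Common hypothesis tests include the t-test, analysis of variance (ANOVA), the chi-square test, the Kruskal-Wallis test, and so on.
2. The Pingouin library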
3. The seeds dataset
The seeds dataset is available from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/seeds).
4. ANOVA case study
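The case study needs the seeds data in a pandas DataFrame. Below is a minimal loading sketch, assuming a local whitespace-separated copy of the dataset file with no header row. The helper name load_seeds, the short column names, and the mapping of class codes 1/2/3 to Kama/Rosa/Canadian are assumptions for illustration and should be checked against the dataset description. The three rows are made up so that the example runs offline.

```python
import io
import pandas as pd

# Column order follows the UCI description; the short names are my own choice.
COLUMNS = [
    "area", "perimeter", "compactness", "kernel_length",
    "kernel_width", "asymmetry", "groove_length", "variety_code",
]
# Assumed coding of the class label; verify against the dataset page.
VARIETIES = {1: "Kama", 2: "Rosa", 3: "Canadian"}


def load_seeds(source):
    """Read a whitespace-separated seeds file into a DataFrame (hypothetical helper)."""
    df = pd.read_csv(source, sep=r"\s+", header=None, names=COLUMNS)
    df["variety"] = df["variety_code"].map(VARIETIES)
    return df


# Made-up rows in the same layout, used only so the sketch runs without the real file.
sample = io.StringIO(
    "15.0 14.5 0.897 5.6 3.3 2.2 5.2 1\n"
    "18.0 16.0 0.884 6.1 3.7 3.0 6.0 2\n"
    "12.0 13.3 0.853 5.2 2.9 4.5 5.1 3\n"
)
seeds = load_seeds(sample)
print(seeds[["compactness", "variety"]])
```

With the real file, the same function could be called with a file path instead of the in-memory sample.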
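This section works through a practical hypothesis-testing case using the Pingouin library and the seeds dataset. The null and alternative hypotheses are:
H0: the mean compactness is the same for all wheat varieties.
H1: the mean compactness differs between wheat varieties.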
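The groupby() function groups the dataset rows by wheat variety and computes the mean of each column.
The boxplot() function creates a box plot of the compactness variable.
The kdeplot() function draws a KDE plot for each wheat variety, so that normality can be assessed visually.
The normality() function runs the Shapiro-Wilk normality test, which indicates that all samples are consistent with a normal distribution.
The qqplot() function makes it easy to create Q-Q plots against various theoretical distributions. The plot also includes a best-fit line based on a linear regression model.
The homoscedasticity() function makes it easy to check equality of variances with Levene's test, a typical method for this purpose.
5. Conclusion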
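This article used the Pingouin library and the seeds dataset to introduce the basic concepts of statistical hypothesis testing. I hope it helps you understand these concepts.
References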
[1] Biau, David Jean, Brigitte M. Jolles, and Raphaël Porcher. “P value and the theory of hypothesis testing: an explanation for new researchers.” Clinical Orthopaedics and Related Research® 468.3 (2010): 885–892.
[2] Lenhard, Johannes. “Models and statistical inference: The controversy between Fisher and Neyman–Pearson.” The British journal for the philosophy of science (2020).
[3] Vallat, Raphael. “Pingouin: statistics in Python.” J. Open Source Softw. 3.31 (2018): 1026.
[4] Charytanowicz, Małgorzata, et al. “Complete gradient clustering algorithm for features analysis of x-ray images.” Information technologies in biomedicine (2010): 15–24.
[5] Scheffe, Henry. The analysis of variance. Vol. 72. John Wiley & Sons, 1999.
[6] Shapiro, Samuel Sanford, and Martin B. Wilk. “An analysis of variance test for normality (complete samples).” Biometrika 52.3/4 (1965): 591–611.
[7] Schmider, Emanuel, et al. “Is it really robust? Reinvestigating the robustness of ANOVA against violations of the normal distribution assumption.” Methodology: European Journal of Research Methods for the Behavioral and Social Sciences 6.4 (2010): 147.
[8] Levene, Howard. “Robust tests for equality of variances.” Contributions to probability and statistics. Essays in honor of Harold Hotelling (1961): 279–292.
[9] Liu, Hangcheng. “Comparing Welch ANOVA, a Kruskal-Wallis test, and traditional ANOVA in case of heterogeneity of variance.” (2015).
[10] Games, Paul A., and John F. Howell. “Pairwise multiple comparison procedures with unequal n’s and/or variances: a Monte Carlo study.” Journal of Educational Statistics 1.2 (1976): 113–125.