mlr3: Model Comparison
The previous post covered performance evaluation; today we look at how to compare multiple models.
benchmark
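Benchmarking is used to compare multiple models: for example, several models on a single task, several models across multiple tasks, or several models combined with different preprocessing steps.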
First, create a design
mlr3 compares multiple models through a design, which is a table of Task, Learner, and Resampling combinations.
library(mlr3verse)
# Create the design with benchmark_grid(), which crosses every task with every learner and resampling
design <- benchmark_grid(
  tasks = tsks(c("spam", "german_credit", "sonar")),
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"), predict_type = "prob"),
  resamplings = rsmps(c("holdout", "cv"))
)
print(design)
## task learner resampling
## 1: <TaskClassif[49]> <LearnerClassifRanger[37]> <ResamplingHoldout[19]>
## 2: <TaskClassif[49]> <LearnerClassifRanger[37]> <ResamplingCV[19]>
## 3: <TaskClassif[49]> <LearnerClassifRpart[37]> <ResamplingHoldout[19]>
## 4: <TaskClassif[49]> <LearnerClassifRpart[37]> <ResamplingCV[19]>
## 5: <TaskClassif[49]> <LearnerClassifFeatureless[37]> <ResamplingHoldout[19]>
## 6: <TaskClassif[49]> <LearnerClassifFeatureless[37]> <ResamplingCV[19]>
## 7: <TaskClassif[49]> <LearnerClassifRanger[37]> <ResamplingHoldout[19]>
## 8: <TaskClassif[49]> <LearnerClassifRanger[37]> <ResamplingCV[19]>
## 9: <TaskClassif[49]> <LearnerClassifRpart[37]> <ResamplingHoldout[19]>
## 10: <TaskClassif[49]> <LearnerClassifRpart[37]> <ResamplingCV[19]>
## 11: <TaskClassif[49]> <LearnerClassifFeatureless[37]> <ResamplingHoldout[19]>
## 12: <TaskClassif[49]> <LearnerClassifFeatureless[37]> <ResamplingCV[19]>
## 13: <TaskClassif[49]> <LearnerClassifRanger[37]> <ResamplingHoldout[19]>
## 14: <TaskClassif[49]> <LearnerClassifRanger[37]> <ResamplingCV[19]>
## 15: <TaskClassif[49]> <LearnerClassifRpart[37]> <ResamplingHoldout[19]>
## 16: <TaskClassif[49]> <LearnerClassifRpart[37]> <ResamplingCV[19]>
## 17: <TaskClassif[49]> <LearnerClassifFeatureless[37]> <ResamplingHoldout[19]>
## 18: <TaskClassif[49]> <LearnerClassifFeatureless[37]> <ResamplingCV[19]>
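To see where the 18 rows come from, here is a minimal base-R sketch that mimics the crossing with `expand.grid()` instead of calling mlr3. The iteration counts (1 split for holdout, 10 folds for cv) are assumptions based on the defaults visible in the output of this post, and `grid` and `iters_per_resampling` are names made up for illustration.

```r
# Plain character vectors standing in for the mlr3 objects
tasks <- c("spam", "german_credit", "sonar")
learners <- c("classif.ranger", "classif.rpart", "classif.featureless")

# Assumed iterations per resampling: holdout = 1 split, cv = 10 folds
iters_per_resampling <- c(holdout = 1, cv = 10)

# expand.grid() varies the first argument fastest, so the resampling
# changes on every row and the task changes slowest, as in the design above
grid <- expand.grid(
  resampling = names(iters_per_resampling),
  learner = learners,
  task = tasks,
  stringsAsFactors = FALSE
)
grid$iters <- unname(iters_per_resampling[grid$resampling])

nrow(grid)       # number of task/learner/resampling combinations
sum(grid$iters)  # total number of train/test iterations to run
```

With 3 tasks, 3 learners, and 2 resamplings this gives 3 x 3 x 2 = 18 combinations and 9 x 1 + 9 x 10 = 99 iterations in total.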
Then run the comparison, which again takes just one line of code:
bmr <- benchmark(design, store_models = TRUE)
## INFO [20:47:16.049] [mlr3] Running benchmark with 99 resampling iterations
## INFO [20:47:16.053] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 1/10)
## INFO [20:47:16.070] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 10/10)
## INFO [20:47:16.280] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 3/10)
## INFO [20:47:16.290] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 6/10)
## INFO [20:47:16.300] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 9/10)
## INFO [20:47:16.309] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 2/10)
## INFO [20:47:16.506] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 8/10)
## INFO [20:47:18.070] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 8/10)
## INFO [20:47:18.149] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 1/10)
## INFO [20:47:18.159] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 7/10)
## INFO [20:47:18.176] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 3/10)
## INFO [20:47:18.193] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 1/1)
## INFO [20:47:18.203] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 8/10)
## INFO [20:47:18.400] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 4/10)
## INFO [20:47:18.410] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 4/10)
## INFO [20:47:18.486] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 5/10)
## INFO [20:47:19.873] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 6/10)
## INFO [20:47:19.950] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 5/10)
## INFO [20:47:19.967] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 10/10)
## INFO [20:47:19.976] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 1/10)
## INFO [20:47:19.994] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 8/10)
## INFO [20:47:20.002] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 10/10)
## INFO [20:47:20.019] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 4/10)
## INFO [20:47:20.027] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 9/10)
## INFO [20:47:20.103] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 8/10)
## INFO [20:47:20.113] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 3/10)
## INFO [20:47:20.189] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 1/10)
## INFO [20:47:20.379] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 4/10)
## INFO [20:47:20.397] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 6/10)
## INFO [20:47:20.423] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 7/10)
## INFO [20:47:20.440] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 5/10)
## INFO [20:47:20.448] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 10/10)
## INFO [20:47:20.456] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 6/10)
## INFO [20:47:20.473] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 3/10)
## INFO [20:47:20.703] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 3/10)
## INFO [20:47:20.714] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 6/10)
## INFO [20:47:20.731] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 1/1)
## INFO [20:47:20.738] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 7/10)
## INFO [20:47:20.748] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 9/10)
## INFO [20:47:20.794] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 5/10)
## INFO [20:47:20.989] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 1/1)
## INFO [20:47:21.006] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 2/10)
## INFO [20:47:21.024] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 4/10)
## INFO [20:47:21.225] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 1/10)
## INFO [20:47:21.234] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 9/10)
## INFO [20:47:22.618] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 1/10)
## INFO [20:47:22.695] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 9/10)
## INFO [20:47:22.704] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 1/10)
## INFO [20:47:24.109] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 4/10)
## INFO [20:47:24.117] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 2/10)
## INFO [20:47:25.675] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 8/10)
## INFO [20:47:25.726] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 3/10)
## INFO [20:47:27.115] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 1/1)
## INFO [20:47:28.155] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 5/10)
## INFO [20:47:28.165] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 3/10)
## INFO [20:47:28.186] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 6/10)
## INFO [20:47:28.233] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 10/10)
## INFO [20:47:28.458] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 7/10)
## INFO [20:47:29.832] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 6/10)
## INFO [20:47:29.841] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 5/10)
## INFO [20:47:29.859] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 3/10)
## INFO [20:47:29.878] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 2/10)
## INFO [20:47:29.898] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 7/10)
## INFO [20:47:29.950] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 10/10)
## INFO [20:47:31.332] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 9/10)
## INFO [20:47:31.342] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 8/10)
## INFO [20:47:31.360] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 10/10)
## INFO [20:47:31.439] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 2/10)
## INFO [20:47:31.513] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 4/10)
## INFO [20:47:32.917] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 7/10)
## INFO [20:47:32.994] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 8/10)
## INFO [20:47:33.003] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 3/10)
## INFO [20:47:33.194] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 1/10)
## INFO [20:47:33.212] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 2/10)
## INFO [20:47:33.221] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 10/10)
## INFO [20:47:33.495] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 8/10)
## INFO [20:47:33.512] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 9/10)
## INFO [20:47:33.704] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 4/10)
## INFO [20:47:33.753] [mlr3] Applying learner 'classif.ranger' on task 'spam' (iter 6/10)
## INFO [20:47:35.136] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 10/10)
## INFO [20:47:35.147] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 6/10)
## INFO [20:47:35.332] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 5/10)
## INFO [20:47:35.380] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 7/10)
## INFO [20:47:35.581] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 1/1)
## INFO [20:47:35.643] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 2/10)
## INFO [20:47:35.653] [mlr3] Applying learner 'classif.ranger' on task 'german_credit' (iter 1/1)
## INFO [20:47:35.826] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 7/10)
## INFO [20:47:35.835] [mlr3] Applying learner 'classif.ranger' on task 'sonar' (iter 5/10)
## INFO [20:47:35.910] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 1/1)
## INFO [20:47:35.951] [mlr3] Applying learner 'classif.rpart' on task 'german_credit' (iter 9/10)
## INFO [20:47:35.969] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 5/10)
## INFO [20:47:35.980] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 1/1)
## INFO [20:47:35.997] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 4/10)
## INFO [20:47:36.257] [mlr3] Applying learner 'classif.featureless' on task 'sonar' (iter 1/1)
## INFO [20:47:36.264] [mlr3] Applying learner 'classif.featureless' on task 'spam' (iter 2/10)
## INFO [20:47:36.274] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 1/10)
## INFO [20:47:36.322] [mlr3] Applying learner 'classif.rpart' on task 'spam' (iter 2/10)
## INFO [20:47:36.366] [mlr3] Applying learner 'classif.featureless' on task 'german_credit' (iter 7/10)
## INFO [20:47:36.375] [mlr3] Applying learner 'classif.rpart' on task 'sonar' (iter 9/10)
## INFO [20:47:36.414] [mlr3] Finished benchmark
Check the performance of the models using several measures:
measures <- msrs(c("classif.acc", "classif.mcc"))
tab <- bmr$aggregate(measures)
print(tab)
## nr resample_result task_id learner_id resampling_id
## 1: 1 <ResampleResult[22]> spam classif.ranger holdout
## 2: 2 <ResampleResult[22]> spam classif.ranger cv
## 3: 3 <ResampleResult[22]> spam classif.rpart holdout
## 4: 4 <ResampleResult[22]> spam classif.rpart cv
## 5: 5 <ResampleResult[22]> spam classif.featureless holdout
## 6: 6 <ResampleResult[22]> spam classif.featureless cv
## 7: 7 <ResampleResult[22]> german_credit classif.ranger holdout
## 8: 8 <ResampleResult[22]> german_credit classif.ranger cv
## 9: 9 <ResampleResult[22]> german_credit classif.rpart holdout
## 10: 10 <ResampleResult[22]> german_credit classif.rpart cv
## 11: 11 <ResampleResult[22]> german_credit classif.featureless holdout
## 12: 12 <ResampleResult[22]> german_credit classif.featureless cv
## 13: 13 <ResampleResult[22]> sonar classif.ranger holdout
## 14: 14 <ResampleResult[22]> sonar classif.ranger cv
## 15: 15 <ResampleResult[22]> sonar classif.rpart holdout
## 16: 16 <ResampleResult[22]> sonar classif.rpart cv
## 17: 17 <ResampleResult[22]> sonar classif.featureless holdout
## 18: 18 <ResampleResult[22]> sonar classif.featureless cv
## iters classif.acc classif.mcc
## 1: 1 0.9445893 0.8835453
## 2: 10 0.9495723 0.8943582
## 3: 1 0.8917862 0.7725102
## 4: 10 0.8934967 0.7765629
## 5: 1 0.6069100 0.0000000
## 6: 10 0.6059511 0.0000000
## 7: 1 0.7567568 0.4358851
## 8: 10 0.7670000 0.3927548
## 9: 1 0.6996997 0.2847394
## 10: 10 0.7290000 0.2984376
## 11: 1 0.6516517 0.0000000
## 12: 10 0.7000000 0.0000000
## 13: 1 0.7971014 0.6247458
## 14: 10 0.8221429 0.6390361
## 15: 1 0.6956522 0.3981439
## 16: 10 0.6545238 0.3098052
## 17: 1 0.4782609 0.0000000
## 18: 10 0.5340476 0.0000000
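One way to read this table is to compare each learner against the featureless baseline on the same task. The sketch below is only an illustration in base R: it retypes the 10-fold CV accuracies printed above into a data frame (in a live session you could work on `tab` directly, since it is a data.table), and `cv_acc`, `cmp`, and `best` are hypothetical names.

```r
# 10-fold CV accuracies copied from the aggregate table above
cv_acc <- data.frame(
  task_id = rep(c("spam", "german_credit", "sonar"), each = 3),
  learner_id = rep(c("classif.ranger", "classif.rpart", "classif.featureless"), times = 3),
  classif.acc = c(0.9495723, 0.8934967, 0.6059511,
                  0.7670000, 0.7290000, 0.7000000,
                  0.8221429, 0.6545238, 0.5340476),
  stringsAsFactors = FALSE
)

# Accuracy of the featureless baseline on each task
baseline <- cv_acc[cv_acc$learner_id == "classif.featureless", c("task_id", "classif.acc")]
names(baseline)[2] <- "baseline_acc"

# Gain of every learner over the baseline of the same task
cmp <- merge(cv_acc, baseline, by = "task_id")
cmp$gain <- cmp$classif.acc - cmp$baseline_acc

# Learner with the highest accuracy on each task
best <- do.call(rbind, lapply(split(cmp, cmp$task_id), function(d) d[which.max(d$classif.acc), ]))
best[, c("task_id", "learner_id", "classif.acc", "gain")]
```

The gain column matters more than the raw accuracy: on german_credit the baseline already reaches 0.70 by always predicting the majority class, so an accuracy of 0.767 is a much smaller improvement than it first appears.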
Visualize the results
library(ggplot2)
autoplot(bmr) + theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
The plot above shows how the models perform on the different datasets. We can also look at how the models perform on one specific dataset:
bmr_german <- bmr$clone(deep = TRUE)$filter(task_ids = "german_credit", resampling_ids = "holdout")
autoplot(bmr_german, type = "roc")
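You can also extract just a single result:
tab <- bmr$aggregate(measures)
# Two rows match this filter (holdout and cv); [[1]] picks the first one, the holdout result
rr <- tab[task_id == "german_credit" & learner_id == "classif.ranger"]$resample_result[[1]]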
print(rr)
## <ResampleResult> of 1 iterations
## * Task: german_credit
## * Learner: classif.ranger
## * Warnings: 0 in 0 iterations
## * Errors: 0 in 0 iterations
Check the performance of that single result:
rr$aggregate(msr("classif.auc"))
## classif.auc
## 0.8085969
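Multiple BenchmarkResult objects can be combined. For example, if you ran two different benchmarks on two machines, you can merge them directly into one larger object: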
task <- tsk("iris")
resampling <- rsmp("holdout")$instantiate(task)
rr1 <- resample(task, lrn("classif.rpart"), resampling)
## INFO [20:47:40.585] [mlr3] Applying learner 'classif.rpart' on task 'iris' (iter 1/1)
rr2 <- resample(task, lrn("classif.featureless"), resampling)
## INFO [20:47:40.606] [mlr3] Applying learner 'classif.featureless' on task 'iris' (iter 1/1)
# Combine the results with the following code
bmr1 <- as_benchmark_result(rr1)
bmr2 <- as_benchmark_result(rr2)
bmr1$combine(bmr2)
bmr1
## <BenchmarkResult> of 2 rows with 2 resampling runs
## nr task_id learner_id resampling_id iters warnings errors
## 1 iris classif.rpart holdout 1 0 0
## 2 iris classif.featureless holdout 1 0 0
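Note that `$combine()` changes the object it is called on, which is why `bmr1` itself grows to two rows above. If the original result should stay as it was, one option is to combine into a deep clone, as in this minimal sketch. It assumes only the core mlr3 package is installed; `bmr_a`, `bmr_b`, and `merged` are names made up for illustration.

```r
library(mlr3)

task <- tsk("iris")
resampling <- rsmp("holdout")$instantiate(task)

# Two single-learner results, as if they came from separate runs
bmr_a <- as_benchmark_result(resample(task, lrn("classif.rpart"), resampling))
bmr_b <- as_benchmark_result(resample(task, lrn("classif.featureless"), resampling))

# Combine into a deep clone so that bmr_a is left untouched
merged <- bmr_a$clone(deep = TRUE)$combine(bmr_b)

bmr_a$n_resample_results   # still the single rpart result
merged$n_resample_results  # both results
merged$aggregate(msr("classif.acc"))
```

Because both results were produced on the same instantiated resampling, the two learners are scored on the same train/test split, which keeps the comparison fair.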
That's all for today; I hope you find it helpful!
This post comes from the WeChat public account 医学和生信笔记.