
CC · October 1, 2023

K-fold

* For full question details, see the question stem.

NO.PZ201512020300000507

Question:

Assuming a Classification and Regression Tree (CART) model is used to accomplish Step 3, which of the following is most likely to result in model overfitting?

Options:

A. Using the k-fold cross validation method.

B. Including an overfitting penalty (i.e., regularization term).

C. Using a fitting curve to select a model with low bias error and high variance error.

Explanation:

C is correct. A fitting curve shows the trade-off between bias error and variance error for various potential models. A model with low bias error and high variance error is, by definition, overfitted.

A is incorrect. One of the two common methods for reducing overfitting is proper data sampling and cross validation; k-fold cross validation is such a method, estimating out-of-sample error directly by measuring the error in the validation samples.

B is incorrect. The other common method for reducing overfitting is preventing the algorithm from becoming too complex during selection and training, which requires estimating an overfitting penalty (the regularization term).
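Both methods named in A and B can be demonstrated in a few lines. The sketch below is illustrative only: it assumes scikit-learn's DecisionTreeClassifier as the CART implementation, and the synthetic data set and the ccp_alpha value are assumptions for demonstration, not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data; any labeled data set would do.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Method 1 (option A): k-fold cross validation estimates out-of-sample
# error directly by scoring the model on each held-out validation fold.
unpruned = DecisionTreeClassifier(random_state=0)
print("5-fold CV accuracy, unpruned:", cross_val_score(unpruned, X, y, cv=5).mean())

# Method 2 (option B): an overfitting penalty (cost-complexity pruning,
# CART's regularization term) keeps the tree from growing too complex.
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)  # alpha chosen for illustration
print("5-fold CV accuracy, pruned:  ", cross_val_score(pruned, X, y, cv=5).mean())
```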

Teacher, isn't K-fold validation meant to address insufficient data, in other words underfitting? Wouldn't it cause overfitting?

1 answer

星星_品职助教 · October 1, 2023

Hi,

K-fold cross validation is, in essence, a form of cross validation. The purpose of cross validation is to combat overfitting, that is, to avoid a model that fits the training set very well but fits other data sets poorly and therefore has no predictive power.
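To make this concrete, here is a minimal sketch, assuming scikit-learn, of how k-fold cross validation exposes overfitting: every fold serves once as the validation set, and a fully grown CART tree typically shows near-perfect training accuracy but noticeably lower validation accuracy. The data set and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=1)

train_acc, valid_acc = [], []
# Each of the 5 folds serves once as the validation set.
for train_idx, valid_idx in KFold(n_splits=5, shuffle=True, random_state=1).split(X):
    tree = DecisionTreeClassifier(random_state=1)  # fully grown tree, prone to overfit
    tree.fit(X[train_idx], y[train_idx])
    train_acc.append(tree.score(X[train_idx], y[train_idx]))
    valid_acc.append(tree.score(X[valid_idx], y[valid_idx]))

# Training accuracy near 1.0 paired with much lower validation accuracy
# is the classic overfitting signature that cross validation makes visible.
print("mean training accuracy:  ", np.mean(train_acc))
print("mean validation accuracy:", np.mean(valid_acc))
```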