Cross validation Vs. Train Validate TestCross validation Vs. Train Validate Test cont'dWhy k-fold cross validation (CV) overfits? Or why discrepancy occurs between CV and test set?K-Fold Cross validation confusion?k-fold cross-validation: model selection or variation in models when using k-fold cross validationhow to prepare data for cross validation in mnist dataset?Terminology - cross-validation, testing and validation set for classification taskSome confusions on Model selection using cross-validation approachTarget encoding with cross validationConfusion in applying k-fold cross validation to datasetShould I perform cross validation only on the training set?Cross validation test and train errors

Limit of sequence (by definiton)

C# Toy Robot Simulator

Will I be allowed to enter the US after living there illegally then legally in the past?

Would a spacecraft carry arc welding supplies?

Minimum number of turns to capture all pieces in Checkers

Grade changes with auto grader

Does Turkey make the "structural steel frame" for the F-35 fighter?

when to use がつ or げつ readings for 月？

Conveying the idea of "tricky"

Most optimal hallways with random gravity inside?

Why do military jets sometimes have elevators in a depressed position when parked?

Did I Traumatize My Puppy?

The Immortal Jellyfish

What is the following style of typography called?

Matrix class in C#

Is it allowed to let the engine of an aircraft idle without a pilot in the plane. (For both helicopters and aeroplanes)

Who inspired the character Geordi La Forge?

When was the famous "sudo warning" introduced? Under what background? By whom?

Moving through the space of an invisible enemy creature in combat

Inconsistency of PE to KE conversion in moving reference frames

How to pronounce correctly [b] and [p]? As well [t]/[d] and [k]/[g]

Linux Commands in Python

Cutting a 4.5m long 2x6 in half with a circular saw

Why were germanium diodes so fast and germanium transistors so slow?

Cross validation Vs. Train Validate Test

Cross validation Vs. Train Validate Test cont'dWhy k-fold cross validation (CV) overfits? Or why discrepancy occurs between CV and test set?K-Fold Cross validation confusion?k-fold cross-validation: model selection or variation in models when using k-fold cross validationhow to prepare data for cross validation in mnist dataset?Terminology - cross-validation, testing and validation set for classification taskSome confusions on Model selection using cross-validation approachTarget encoding with cross validationConfusion in applying k-fold cross validation to datasetShould I perform cross validation only on the training set?Cross validation test and train errors

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;

I have a doubt regarding the cross validation approach and train-validation-test approach.

I was told that I can split a dataset into 3 parts:

Train: we train the model.

Validation: we validate and adjust model parameters.

Test: never seen before data. We get an unbiased final estimate.

So far, we have split into three subsets. Until here everything is okay. Attached is a picture:

enter image description here

Then I came across the K-fold cross validation approach and what I don’t understand is how I can relate the Test subset from the above approach. Meaning, in 5-fold cross validation we split the data into 5 and in each iteration the non-validation subset is used as the train subset and the validation is used as test set. But, in terms of the above mentioned example, where is the validation part in k-fold cross validation? We either have validation or test subset.

When I refer myself to train/validation/test, that “test” is the scoring:

Model development is generally a two-stage process. The first stage is training and validation, during which you apply algorithms to data for which you know the outcomes to uncover patterns between its features and the target variable. The second stage is scoring, in which you apply the trained model to a new dataset. Then, it returns outcomes in the form of probability scores for classification problems and estimated averages for regression problems. Finally, you deploy the trained model into a production application or use the insights it uncovers to improve business processes.

As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:

enter image description here

When doing the splitting, you can see that the algorithm that they give you, only takes care of the training part of the original dataset. So, in the end, we are not able to perform the Final evaluation process as you can see in the attached picture.

Thank you!

scikitpage

edited May 26 at 10:06

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

add a comment
|

I have a doubt regarding the cross validation approach and train-validation-test approach.

I was told that I can split a dataset into 3 parts:

Train: we train the model.

Validation: we validate and adjust model parameters.

Test: never seen before data. We get an unbiased final estimate.

So far, we have split into three subsets. Until here everything is okay. Attached is a picture:

enter image description here

When I refer myself to train/validation/test, that “test” is the scoring:

As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:

enter image description here

Thank you!

scikitpage

edited May 26 at 10:06

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

add a comment
|

I have a doubt regarding the cross validation approach and train-validation-test approach.

I was told that I can split a dataset into 3 parts:

Train: we train the model.

Validation: we validate and adjust model parameters.

Test: never seen before data. We get an unbiased final estimate.

So far, we have split into three subsets. Until here everything is okay. Attached is a picture:

enter image description here

When I refer myself to train/validation/test, that “test” is the scoring:

As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:

enter image description here

Thank you!

scikitpage

edited May 26 at 10:06

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

I have a doubt regarding the cross validation approach and train-validation-test approach.

I was told that I can split a dataset into 3 parts:

Train: we train the model.

Validation: we validate and adjust model parameters.

Test: never seen before data. We get an unbiased final estimate.

So far, we have split into three subsets. Until here everything is okay. Attached is a picture:

enter image description here

When I refer myself to train/validation/test, that “test” is the scoring:

As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:

enter image description here

Thank you!

scikitpage

machine-learning cross-validation

edited May 26 at 10:06

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

edited May 26 at 10:06

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

edited May 26 at 10:06

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

asked May 26 at 6:15

NaveganTeX

1235 bronze badges

add a comment
|

4 Answers
4

active

oldest

votes

If k-fold cross-validation is used to optimize the model paremeters, the training set is split into k parts. Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen. The test set has not been used so far.

Only at the very end the test set is used to test the performance of the (optimized) model.

# example: k-fold cross validation for hyperparameter optimization (k=3)

original data split into training and test set:

|---------------- train ---------------------| |--- test ---|

cross-validation: test set is not used, error is calculated from
validation set (k-times) and averaged:

|---- train ------------------|- validation -| |--- test ---|
|---- train ---|- validation -|---- train ---| |--- test ---|
|- validation -|----------- train -----------| |--- test ---|

final measure of model performance: model is trained on all training data
and the error is calculated from test set:

|---------------- train ---------------------|--- test ---|

In some cases, k-fold cross-validation is used on the entire data set if no parameter optimization is needed (this is rare, but it happens). In this case there would not be a validation set and the k parts are used as a test set one by one. The error of each of these k tests is typically averaged.

# example: k-fold cross validation

|----- test -----|------------ train --------------|
|----- train ----|----- test -----|----- train ----|
|------------ train --------------|----- test -----|

edited May 26 at 9:58

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

$begingroup$
Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
$endgroup$
– NaveganTeX
May 26 at 10:19

1

$begingroup$
In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
$endgroup$
– louic
May 26 at 17:56

$begingroup$
It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
$endgroup$
– NaveganTeX
May 26 at 21:54

add a comment
|

@louic's answer is correct: You split your data in two parts: training and test, and then you use k-fold cross-validation on the training dataset to tune the parameters. This is useful if you have little training data, because you don't have to exclude the validation data from the training dataset.

But I find this comment confusing: "In some cases, k-fold cross-validation is used on the entire dataset ... if no parameter optimization is needed". It's correct that if you don't need any optimization of the model after running it for the first time, the performance on the validation data from your k-fold cross-validation runs gives you an unbiased estimate of the model performance. But this is a strange case indeed. It's much more comment to use k-fold cross validation on the entire dataset, and tune your algorithm. This means you lose the unbiased estimate of the model performance, but this is not always needed.

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

1

$begingroup$
Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
$endgroup$
– louic
May 26 at 11:04

$begingroup$
We could use Nested Cross Validation as an alternative to cross validation + test set
$endgroup$
– NaveganTeX
May 26 at 11:17

2

$begingroup$
Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
$endgroup$
– Paul
May 26 at 11:23

add a comment
|

Excellent question!

I find this train/test/validation confusing (I've been doing ML for 5 years).

Who says your image is correct? Let's go to an ML authority (Sk-Learn)

In general, we do k-Fold on train/test (see Sk-Learn image below).

Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...

Good luck!

enter image description here

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

1

$begingroup$
We will figure it out, someday!!!
$endgroup$
– NaveganTeX
May 26 at 9:19

add a comment
|

Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.

I have one further question relating to @louic answer above:

"Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen."

When we say "the model with the lowest error is chosen" which of the following does this mean:

we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

Thanks

Iain

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

$begingroup$
I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
$endgroup$
– Solomonoff's Secret
Sep 27 at 14:20

add a comment
|

Your Answer

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "557"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f52632%2fcross-validation-vs-train-validate-test%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

Only at the very end the test set is used to test the performance of the (optimized) model.

# example: k-fold cross validation for hyperparameter optimization (k=3)

original data split into training and test set:

|---------------- train ---------------------| |--- test ---|

cross-validation: test set is not used, error is calculated from
validation set (k-times) and averaged:

|---- train ------------------|- validation -| |--- test ---|
|---- train ---|- validation -|---- train ---| |--- test ---|
|- validation -|----------- train -----------| |--- test ---|

final measure of model performance: model is trained on all training data
and the error is calculated from test set:

|---------------- train ---------------------|--- test ---|

# example: k-fold cross validation

|----- test -----|------------ train --------------|
|----- train ----|----- test -----|----- train ----|
|------------ train --------------|----- test -----|

edited May 26 at 9:58

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

$begingroup$
Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
$endgroup$
– NaveganTeX
May 26 at 10:19

1

$begingroup$
In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
$endgroup$
– louic
May 26 at 17:56

$begingroup$
It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
$endgroup$
– NaveganTeX
May 26 at 21:54

add a comment
|

Only at the very end the test set is used to test the performance of the (optimized) model.

# example: k-fold cross validation for hyperparameter optimization (k=3)

original data split into training and test set:

|---------------- train ---------------------| |--- test ---|

cross-validation: test set is not used, error is calculated from
validation set (k-times) and averaged:

|---- train ------------------|- validation -| |--- test ---|
|---- train ---|- validation -|---- train ---| |--- test ---|
|- validation -|----------- train -----------| |--- test ---|

final measure of model performance: model is trained on all training data
and the error is calculated from test set:

|---------------- train ---------------------|--- test ---|

# example: k-fold cross validation

|----- test -----|------------ train --------------|
|----- train ----|----- test -----|----- train ----|
|------------ train --------------|----- test -----|

edited May 26 at 9:58

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

$begingroup$
Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
$endgroup$
– NaveganTeX
May 26 at 10:19

1

$begingroup$
In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
$endgroup$
– louic
May 26 at 17:56

$begingroup$
It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
$endgroup$
– NaveganTeX
May 26 at 21:54

add a comment
|

Only at the very end the test set is used to test the performance of the (optimized) model.

# example: k-fold cross validation for hyperparameter optimization (k=3)

original data split into training and test set:

|---------------- train ---------------------| |--- test ---|

cross-validation: test set is not used, error is calculated from
validation set (k-times) and averaged:

|---- train ------------------|- validation -| |--- test ---|
|---- train ---|- validation -|---- train ---| |--- test ---|
|- validation -|----------- train -----------| |--- test ---|

final measure of model performance: model is trained on all training data
and the error is calculated from test set:

|---------------- train ---------------------|--- test ---|

# example: k-fold cross validation

|----- test -----|------------ train --------------|
|----- train ----|----- test -----|----- train ----|
|------------ train --------------|----- test -----|

edited May 26 at 9:58

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

Only at the very end the test set is used to test the performance of the (optimized) model.

# example: k-fold cross validation for hyperparameter optimization (k=3)

original data split into training and test set:

|---------------- train ---------------------| |--- test ---|

cross-validation: test set is not used, error is calculated from
validation set (k-times) and averaged:

|---- train ------------------|- validation -| |--- test ---|
|---- train ---|- validation -|---- train ---| |--- test ---|
|- validation -|----------- train -----------| |--- test ---|

final measure of model performance: model is trained on all training data
and the error is calculated from test set:

|---------------- train ---------------------|--- test ---|

# example: k-fold cross validation

|----- test -----|------------ train --------------|
|----- train ----|----- test -----|----- train ----|
|------------ train --------------|----- test -----|

edited May 26 at 9:58

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

edited May 26 at 9:58

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

answered May 26 at 9:50

louic

2531 silver badge7 bronze badges

$begingroup$
Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
$endgroup$
– NaveganTeX
May 26 at 10:19

1

$begingroup$
In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
$endgroup$
– louic
May 26 at 17:56

$begingroup$
It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
$endgroup$
– NaveganTeX
May 26 at 21:54

add a comment
|

$begingroup$
Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
$endgroup$
– NaveganTeX
May 26 at 10:19

1

$begingroup$
In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
$endgroup$
– louic
May 26 at 17:56

$begingroup$
It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
$endgroup$
– NaveganTeX
May 26 at 21:54

Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.

– NaveganTeX
May 26 at 10:19

In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?

– louic
May 26 at 17:56

It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!

– NaveganTeX
May 26 at 21:54

add a comment
|

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

1

$begingroup$
Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
$endgroup$
– louic
May 26 at 11:04

$begingroup$
We could use Nested Cross Validation as an alternative to cross validation + test set
$endgroup$
– NaveganTeX
May 26 at 11:17

2

$begingroup$
Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
$endgroup$
– Paul
May 26 at 11:23

add a comment
|

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

1

$begingroup$
Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
$endgroup$
– louic
May 26 at 11:04

$begingroup$
We could use Nested Cross Validation as an alternative to cross validation + test set
$endgroup$
– NaveganTeX
May 26 at 11:17

2

$begingroup$
Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
$endgroup$
– Paul
May 26 at 11:23

add a comment
|

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

answered May 26 at 10:20

Paul

6733 silver badges10 bronze badges

1

$begingroup$
Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
$endgroup$
– louic
May 26 at 11:04

$begingroup$
We could use Nested Cross Validation as an alternative to cross validation + test set
$endgroup$
– NaveganTeX
May 26 at 11:17

2

$begingroup$
Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
$endgroup$
– Paul
May 26 at 11:23

add a comment
|

1

$begingroup$
Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
$endgroup$
– louic
May 26 at 11:04

$begingroup$
We could use Nested Cross Validation as an alternative to cross validation + test set
$endgroup$
– NaveganTeX
May 26 at 11:17

2

$begingroup$
Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
$endgroup$
– Paul
May 26 at 11:23

Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.

– louic
May 26 at 11:04

We could use Nested Cross Validation as an alternative to cross validation + test set

– NaveganTeX
May 26 at 11:17

Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.

– Paul
May 26 at 11:23

add a comment
|

Excellent question!

I find this train/test/validation confusing (I've been doing ML for 5 years).

Who says your image is correct? Let's go to an ML authority (Sk-Learn)

In general, we do k-Fold on train/test (see Sk-Learn image below).

Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...

Good luck!

enter image description here

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

1

$begingroup$
We will figure it out, someday!!!
$endgroup$
– NaveganTeX
May 26 at 9:19

add a comment
|

Excellent question!

I find this train/test/validation confusing (I've been doing ML for 5 years).

Who says your image is correct? Let's go to an ML authority (Sk-Learn)

In general, we do k-Fold on train/test (see Sk-Learn image below).

Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...

Good luck!

enter image description here

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

1

$begingroup$
We will figure it out, someday!!!
$endgroup$
– NaveganTeX
May 26 at 9:19

add a comment
|

Excellent question!

I find this train/test/validation confusing (I've been doing ML for 5 years).

Who says your image is correct? Let's go to an ML authority (Sk-Learn)

In general, we do k-Fold on train/test (see Sk-Learn image below).

Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...

Good luck!

enter image description here

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

Excellent question!

I find this train/test/validation confusing (I've been doing ML for 5 years).

Who says your image is correct? Let's go to an ML authority (Sk-Learn)

In general, we do k-Fold on train/test (see Sk-Learn image below).

Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...

Good luck!

enter image description here

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

answered May 26 at 9:09

FrancoSwiss

4393 silver badges7 bronze badges

1

$begingroup$
We will figure it out, someday!!!
$endgroup$
– NaveganTeX
May 26 at 9:19

add a comment
|

1

$begingroup$
We will figure it out, someday!!!
$endgroup$
– NaveganTeX
May 26 at 9:19

We will figure it out, someday!!!

– NaveganTeX
May 26 at 9:19

add a comment
|

Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.

I have one further question relating to @louic answer above:

When we say "the model with the lowest error is chosen" which of the following does this mean:

we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

Thanks

Iain

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

$begingroup$
I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
$endgroup$
– Solomonoff's Secret
Sep 27 at 14:20

add a comment
|

Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.

I have one further question relating to @louic answer above:

When we say "the model with the lowest error is chosen" which of the following does this mean:

we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

Thanks

Iain

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

$begingroup$
I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
$endgroup$
– Solomonoff's Secret
Sep 27 at 14:20

add a comment
|

Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.

I have one further question relating to @louic answer above:

When we say "the model with the lowest error is chosen" which of the following does this mean:

we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

Thanks

Iain

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.

I have one further question relating to @louic answer above:

When we say "the model with the lowest error is chosen" which of the following does this mean:

we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

Thanks

Iain

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

answered Sep 26 at 8:43

Iain MacCormick

165 bronze badges

$begingroup$
I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
$endgroup$
– Solomonoff's Secret
Sep 27 at 14:20

add a comment
|

$begingroup$
I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
$endgroup$
– Solomonoff's Secret
Sep 27 at 14:20

I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…

– Solomonoff's Secret
Sep 27 at 14:20

add a comment
|

draft saved

draft discarded

Thanks for contributing an answer to Data Science Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

vQIhlA oaE73QVRnfjIqUET u,r GOuK,et XLXKrRz Hrv

搜尋此網誌

Bsrgvty

4 Answers
4

Your Answer

Post as a guest

4 Answers
4

4 Answers
4

Post as a guest

Popular posts from this blog

Tamil (spriik) Luke uk diar | Nawigatjuun

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

4 Answers 4

4 Answers 4

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Tamil (spriik) Luke uk diar | Nawigatjuun

4 Answers
4

4 Answers
4

4 Answers
4