Cross validation Vs. Train Validate TestCross validation Vs. Train Validate Test cont'dWhy k-fold cross validation (CV) overfits? Or why discrepancy occurs between CV and test set?K-Fold Cross validation confusion?k-fold cross-validation: model selection or variation in models when using k-fold cross validationhow to prepare data for cross validation in mnist dataset?Terminology - cross-validation, testing and validation set for classification taskSome confusions on Model selection using cross-validation approachTarget encoding with cross validationConfusion in applying k-fold cross validation to datasetShould I perform cross validation only on the training set?Cross validation test and train errors

Limit of sequence (by definiton)

C# Toy Robot Simulator

Will I be allowed to enter the US after living there illegally then legally in the past?

Would a spacecraft carry arc welding supplies?

Minimum number of turns to capture all pieces in Checkers

Grade changes with auto grader

Does Turkey make the "structural steel frame" for the F-35 fighter?

when to use がつ or げつ readings for 月?

Conveying the idea of "tricky"

Most optimal hallways with random gravity inside?

Why do military jets sometimes have elevators in a depressed position when parked?

Did I Traumatize My Puppy?

The Immortal Jellyfish

What is the following style of typography called?

Matrix class in C#

Is it allowed to let the engine of an aircraft idle without a pilot in the plane. (For both helicopters and aeroplanes)

Who inspired the character Geordi La Forge?

When was the famous "sudo warning" introduced? Under what background? By whom?

Moving through the space of an invisible enemy creature in combat

Inconsistency of PE to KE conversion in moving reference frames

How to pronounce correctly [b] and [p]? As well [t]/[d] and [k]/[g]

Linux Commands in Python

Cutting a 4.5m long 2x6 in half with a circular saw

Why were germanium diodes so fast and germanium transistors so slow?



Cross validation Vs. Train Validate Test


Cross validation Vs. Train Validate Test cont'dWhy k-fold cross validation (CV) overfits? Or why discrepancy occurs between CV and test set?K-Fold Cross validation confusion?k-fold cross-validation: model selection or variation in models when using k-fold cross validationhow to prepare data for cross validation in mnist dataset?Terminology - cross-validation, testing and validation set for classification taskSome confusions on Model selection using cross-validation approachTarget encoding with cross validationConfusion in applying k-fold cross validation to datasetShould I perform cross validation only on the training set?Cross validation test and train errors






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;









2














$begingroup$


I have a doubt regarding the cross validation approach and train-validation-test approach.



I was told that I can split a dataset into 3 parts:



  1. Train: we train the model.

  2. Validation: we validate and adjust model parameters.

  3. Test: never seen before data. We get an unbiased final estimate.

So far, we have split into three subsets. Until here everything is okay. Attached is a picture:



enter image description here



Then I came across the K-fold cross validation approach and what I don’t understand is how I can relate the Test subset from the above approach. Meaning, in 5-fold cross validation we split the data into 5 and in each iteration the non-validation subset is used as the train subset and the validation is used as test set. But, in terms of the above mentioned example, where is the validation part in k-fold cross validation? We either have validation or test subset.



When I refer myself to train/validation/test, that “test” is the scoring:



Model development is generally a two-stage process. The first stage is training and validation, during which you apply algorithms to data for which you know the outcomes to uncover patterns between its features and the target variable. The second stage is scoring, in which you apply the trained model to a new dataset. Then, it returns outcomes in the form of probability scores for classification problems and estimated averages for regression problems. Finally, you deploy the trained model into a production application or use the insights it uncovers to improve business processes.



As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:



enter image description here



When doing the splitting, you can see that the algorithm that they give you, only takes care of the training part of the original dataset. So, in the end, we are not able to perform the Final evaluation process as you can see in the attached picture.



Thank you!



scikitpage










share|improve this question












$endgroup$





















    2














    $begingroup$


    I have a doubt regarding the cross validation approach and train-validation-test approach.



    I was told that I can split a dataset into 3 parts:



    1. Train: we train the model.

    2. Validation: we validate and adjust model parameters.

    3. Test: never seen before data. We get an unbiased final estimate.

    So far, we have split into three subsets. Until here everything is okay. Attached is a picture:



    enter image description here



    Then I came across the K-fold cross validation approach and what I don’t understand is how I can relate the Test subset from the above approach. Meaning, in 5-fold cross validation we split the data into 5 and in each iteration the non-validation subset is used as the train subset and the validation is used as test set. But, in terms of the above mentioned example, where is the validation part in k-fold cross validation? We either have validation or test subset.



    When I refer myself to train/validation/test, that “test” is the scoring:



    Model development is generally a two-stage process. The first stage is training and validation, during which you apply algorithms to data for which you know the outcomes to uncover patterns between its features and the target variable. The second stage is scoring, in which you apply the trained model to a new dataset. Then, it returns outcomes in the form of probability scores for classification problems and estimated averages for regression problems. Finally, you deploy the trained model into a production application or use the insights it uncovers to improve business processes.



    As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:



    enter image description here



    When doing the splitting, you can see that the algorithm that they give you, only takes care of the training part of the original dataset. So, in the end, we are not able to perform the Final evaluation process as you can see in the attached picture.



    Thank you!



    scikitpage










    share|improve this question












    $endgroup$

















      2












      2








      2





      $begingroup$


      I have a doubt regarding the cross validation approach and train-validation-test approach.



      I was told that I can split a dataset into 3 parts:



      1. Train: we train the model.

      2. Validation: we validate and adjust model parameters.

      3. Test: never seen before data. We get an unbiased final estimate.

      So far, we have split into three subsets. Until here everything is okay. Attached is a picture:



      enter image description here



      Then I came across the K-fold cross validation approach and what I don’t understand is how I can relate the Test subset from the above approach. Meaning, in 5-fold cross validation we split the data into 5 and in each iteration the non-validation subset is used as the train subset and the validation is used as test set. But, in terms of the above mentioned example, where is the validation part in k-fold cross validation? We either have validation or test subset.



      When I refer myself to train/validation/test, that “test” is the scoring:



      Model development is generally a two-stage process. The first stage is training and validation, during which you apply algorithms to data for which you know the outcomes to uncover patterns between its features and the target variable. The second stage is scoring, in which you apply the trained model to a new dataset. Then, it returns outcomes in the form of probability scores for classification problems and estimated averages for regression problems. Finally, you deploy the trained model into a production application or use the insights it uncovers to improve business processes.



      As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:



      enter image description here



      When doing the splitting, you can see that the algorithm that they give you, only takes care of the training part of the original dataset. So, in the end, we are not able to perform the Final evaluation process as you can see in the attached picture.



      Thank you!



      scikitpage










      share|improve this question












      $endgroup$




      I have a doubt regarding the cross validation approach and train-validation-test approach.



      I was told that I can split a dataset into 3 parts:



      1. Train: we train the model.

      2. Validation: we validate and adjust model parameters.

      3. Test: never seen before data. We get an unbiased final estimate.

      So far, we have split into three subsets. Until here everything is okay. Attached is a picture:



      enter image description here



      Then I came across the K-fold cross validation approach and what I don’t understand is how I can relate the Test subset from the above approach. Meaning, in 5-fold cross validation we split the data into 5 and in each iteration the non-validation subset is used as the train subset and the validation is used as test set. But, in terms of the above mentioned example, where is the validation part in k-fold cross validation? We either have validation or test subset.



      When I refer myself to train/validation/test, that “test” is the scoring:



      Model development is generally a two-stage process. The first stage is training and validation, during which you apply algorithms to data for which you know the outcomes to uncover patterns between its features and the target variable. The second stage is scoring, in which you apply the trained model to a new dataset. Then, it returns outcomes in the form of probability scores for classification problems and estimated averages for regression problems. Finally, you deploy the trained model into a production application or use the insights it uncovers to improve business processes.



      As an example, I found the Sci-Kit learn cross validation version as you can see in the following picture:



      enter image description here



      When doing the splitting, you can see that the algorithm that they give you, only takes care of the training part of the original dataset. So, in the end, we are not able to perform the Final evaluation process as you can see in the attached picture.



      Thank you!



      scikitpage







      machine-learning cross-validation






      share|improve this question
















      share|improve this question













      share|improve this question




      share|improve this question








      edited May 26 at 10:06







      NaveganTeX

















      asked May 26 at 6:15









      NaveganTeXNaveganTeX

      1235 bronze badges




      1235 bronze badges























          4 Answers
          4






          active

          oldest

          votes


















          2
















          $begingroup$

          If k-fold cross-validation is used to optimize the model paremeters, the training set is split into k parts. Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen. The test set has not been used so far.



          Only at the very end the test set is used to test the performance of the (optimized) model.



          # example: k-fold cross validation for hyperparameter optimization (k=3)

          original data split into training and test set:

          |---------------- train ---------------------| |--- test ---|

          cross-validation: test set is not used, error is calculated from
          validation set (k-times) and averaged:

          |---- train ------------------|- validation -| |--- test ---|
          |---- train ---|- validation -|---- train ---| |--- test ---|
          |- validation -|----------- train -----------| |--- test ---|

          final measure of model performance: model is trained on all training data
          and the error is calculated from test set:

          |---------------- train ---------------------|--- test ---|


          In some cases, k-fold cross-validation is used on the entire data set if no parameter optimization is needed (this is rare, but it happens). In this case there would not be a validation set and the k parts are used as a test set one by one. The error of each of these k tests is typically averaged.



          # example: k-fold cross validation

          |----- test -----|------------ train --------------|
          |----- train ----|----- test -----|----- train ----|
          |------------ train --------------|----- test -----|





          share|improve this answer












          $endgroup$














          • $begingroup$
            Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
            $endgroup$
            – NaveganTeX
            May 26 at 10:19






          • 1




            $begingroup$
            In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
            $endgroup$
            – louic
            May 26 at 17:56











          • $begingroup$
            It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
            $endgroup$
            – NaveganTeX
            May 26 at 21:54


















          1
















          $begingroup$

          @louic's answer is correct: You split your data in two parts: training and test, and then you use k-fold cross-validation on the training dataset to tune the parameters. This is useful if you have little training data, because you don't have to exclude the validation data from the training dataset.



          But I find this comment confusing: "In some cases, k-fold cross-validation is used on the entire dataset ... if no parameter optimization is needed". It's correct that if you don't need any optimization of the model after running it for the first time, the performance on the validation data from your k-fold cross-validation runs gives you an unbiased estimate of the model performance. But this is a strange case indeed. It's much more comment to use k-fold cross validation on the entire dataset, and tune your algorithm. This means you lose the unbiased estimate of the model performance, but this is not always needed.






          share|improve this answer










          $endgroup$










          • 1




            $begingroup$
            Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
            $endgroup$
            – louic
            May 26 at 11:04










          • $begingroup$
            We could use Nested Cross Validation as an alternative to cross validation + test set
            $endgroup$
            – NaveganTeX
            May 26 at 11:17






          • 2




            $begingroup$
            Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
            $endgroup$
            – Paul
            May 26 at 11:23


















          0
















          $begingroup$

          Excellent question!




          I find this train/test/validation confusing (I've been doing ML for 5 years).



          Who says your image is correct? Let's go to an ML authority (Sk-Learn)



          In general, we do k-Fold on train/test (see Sk-Learn image below).




          Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...



          Good luck!



          enter image description here






          share|improve this answer










          $endgroup$










          • 1




            $begingroup$
            We will figure it out, someday!!!
            $endgroup$
            – NaveganTeX
            May 26 at 9:19


















          0
















          $begingroup$

          Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.



          I have one further question relating to @louic answer above:



          "Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen."



          When we say "the model with the lowest error is chosen" which of the following does this mean:



          1. we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

          2. we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

          3. we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

          Thanks



          Iain






          share|improve this answer










          $endgroup$














          • $begingroup$
            I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
            $endgroup$
            – Solomonoff's Secret
            Sep 27 at 14:20












          Your Answer








          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "557"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );














          draft saved

          draft discarded
















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f52632%2fcross-validation-vs-train-validate-test%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown


























          4 Answers
          4






          active

          oldest

          votes








          4 Answers
          4






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2
















          $begingroup$

          If k-fold cross-validation is used to optimize the model paremeters, the training set is split into k parts. Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen. The test set has not been used so far.



          Only at the very end the test set is used to test the performance of the (optimized) model.



          # example: k-fold cross validation for hyperparameter optimization (k=3)

          original data split into training and test set:

          |---------------- train ---------------------| |--- test ---|

          cross-validation: test set is not used, error is calculated from
          validation set (k-times) and averaged:

          |---- train ------------------|- validation -| |--- test ---|
          |---- train ---|- validation -|---- train ---| |--- test ---|
          |- validation -|----------- train -----------| |--- test ---|

          final measure of model performance: model is trained on all training data
          and the error is calculated from test set:

          |---------------- train ---------------------|--- test ---|


          In some cases, k-fold cross-validation is used on the entire data set if no parameter optimization is needed (this is rare, but it happens). In this case there would not be a validation set and the k parts are used as a test set one by one. The error of each of these k tests is typically averaged.



          # example: k-fold cross validation

          |----- test -----|------------ train --------------|
          |----- train ----|----- test -----|----- train ----|
          |------------ train --------------|----- test -----|





          share|improve this answer












          $endgroup$














          • $begingroup$
            Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
            $endgroup$
            – NaveganTeX
            May 26 at 10:19






          • 1




            $begingroup$
            In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
            $endgroup$
            – louic
            May 26 at 17:56











          • $begingroup$
            It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
            $endgroup$
            – NaveganTeX
            May 26 at 21:54















          2
















          $begingroup$

          If k-fold cross-validation is used to optimize the model paremeters, the training set is split into k parts. Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen. The test set has not been used so far.



          Only at the very end the test set is used to test the performance of the (optimized) model.



          # example: k-fold cross validation for hyperparameter optimization (k=3)

          original data split into training and test set:

          |---------------- train ---------------------| |--- test ---|

          cross-validation: test set is not used, error is calculated from
          validation set (k-times) and averaged:

          |---- train ------------------|- validation -| |--- test ---|
          |---- train ---|- validation -|---- train ---| |--- test ---|
          |- validation -|----------- train -----------| |--- test ---|

          final measure of model performance: model is trained on all training data
          and the error is calculated from test set:

          |---------------- train ---------------------|--- test ---|


          In some cases, k-fold cross-validation is used on the entire data set if no parameter optimization is needed (this is rare, but it happens). In this case there would not be a validation set and the k parts are used as a test set one by one. The error of each of these k tests is typically averaged.



          # example: k-fold cross validation

          |----- test -----|------------ train --------------|
          |----- train ----|----- test -----|----- train ----|
          |------------ train --------------|----- test -----|





          share|improve this answer












          $endgroup$














          • $begingroup$
            Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
            $endgroup$
            – NaveganTeX
            May 26 at 10:19






          • 1




            $begingroup$
            In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
            $endgroup$
            – louic
            May 26 at 17:56











          • $begingroup$
            It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
            $endgroup$
            – NaveganTeX
            May 26 at 21:54













          2














          2










          2







          $begingroup$

          If k-fold cross-validation is used to optimize the model paremeters, the training set is split into k parts. Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen. The test set has not been used so far.



          Only at the very end the test set is used to test the performance of the (optimized) model.



          # example: k-fold cross validation for hyperparameter optimization (k=3)

          original data split into training and test set:

          |---------------- train ---------------------| |--- test ---|

          cross-validation: test set is not used, error is calculated from
          validation set (k-times) and averaged:

          |---- train ------------------|- validation -| |--- test ---|
          |---- train ---|- validation -|---- train ---| |--- test ---|
          |- validation -|----------- train -----------| |--- test ---|

          final measure of model performance: model is trained on all training data
          and the error is calculated from test set:

          |---------------- train ---------------------|--- test ---|


          In some cases, k-fold cross-validation is used on the entire data set if no parameter optimization is needed (this is rare, but it happens). In this case there would not be a validation set and the k parts are used as a test set one by one. The error of each of these k tests is typically averaged.



          # example: k-fold cross validation

          |----- test -----|------------ train --------------|
          |----- train ----|----- test -----|----- train ----|
          |------------ train --------------|----- test -----|





          share|improve this answer












          $endgroup$



          If k-fold cross-validation is used to optimize the model paremeters, the training set is split into k parts. Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen. The test set has not been used so far.



          Only at the very end the test set is used to test the performance of the (optimized) model.



          # example: k-fold cross validation for hyperparameter optimization (k=3)

          original data split into training and test set:

          |---------------- train ---------------------| |--- test ---|

          cross-validation: test set is not used, error is calculated from
          validation set (k-times) and averaged:

          |---- train ------------------|- validation -| |--- test ---|
          |---- train ---|- validation -|---- train ---| |--- test ---|
          |- validation -|----------- train -----------| |--- test ---|

          final measure of model performance: model is trained on all training data
          and the error is calculated from test set:

          |---------------- train ---------------------|--- test ---|


          In some cases, k-fold cross-validation is used on the entire data set if no parameter optimization is needed (this is rare, but it happens). In this case there would not be a validation set and the k parts are used as a test set one by one. The error of each of these k tests is typically averaged.



          # example: k-fold cross validation

          |----- test -----|------------ train --------------|
          |----- train ----|----- test -----|----- train ----|
          |------------ train --------------|----- test -----|






          share|improve this answer















          share|improve this answer




          share|improve this answer








          edited May 26 at 9:58

























          answered May 26 at 9:50









          louiclouic

          2531 silver badge7 bronze badges




          2531 silver badge7 bronze badges














          • $begingroup$
            Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
            $endgroup$
            – NaveganTeX
            May 26 at 10:19






          • 1




            $begingroup$
            In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
            $endgroup$
            – louic
            May 26 at 17:56











          • $begingroup$
            It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
            $endgroup$
            – NaveganTeX
            May 26 at 21:54
















          • $begingroup$
            Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
            $endgroup$
            – NaveganTeX
            May 26 at 10:19






          • 1




            $begingroup$
            In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
            $endgroup$
            – louic
            May 26 at 17:56











          • $begingroup$
            It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
            $endgroup$
            – NaveganTeX
            May 26 at 21:54















          $begingroup$
          Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
          $endgroup$
          – NaveganTeX
          May 26 at 10:19




          $begingroup$
          Thank you very much for the example, Louic! This is what I was able to catch from your answer. So, initially you split the dataset into train and test. Then you put away the TEST set. Then you run CV on the training data and calculate the error based on the validation folds. That’s awesome..., here we adjust the model etc. But, when it comes to the final measure of model performance, how do you implement that? All the libraries I have been working with only provide the cross validation part but they don’t take into account the TEST split.
          $endgroup$
          – NaveganTeX
          May 26 at 10:19




          1




          1




          $begingroup$
          In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
          $endgroup$
          – louic
          May 26 at 17:56





          $begingroup$
          In scikit-learn you can use train_test_split to split into a training and test set. After that a GridSearchCV on the training set takes care of parameter/model optimisation. Finally, you can calculate an error using the predictions on the test set (eg. using roc_auc_score, f1_score, or another appropriate error measure.Does that answer your question?
          $endgroup$
          – louic
          May 26 at 17:56













          $begingroup$
          It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
          $endgroup$
          – NaveganTeX
          May 26 at 21:54




          $begingroup$
          It does indeed! I was even more confused because I’m working with recommender systems and it’s a little bit different but now I have a more clear idea. Thank you very much, Louic!
          $endgroup$
          – NaveganTeX
          May 26 at 21:54













          1
















          $begingroup$

          @louic's answer is correct: You split your data in two parts: training and test, and then you use k-fold cross-validation on the training dataset to tune the parameters. This is useful if you have little training data, because you don't have to exclude the validation data from the training dataset.



          But I find this comment confusing: "In some cases, k-fold cross-validation is used on the entire dataset ... if no parameter optimization is needed". It's correct that if you don't need any optimization of the model after running it for the first time, the performance on the validation data from your k-fold cross-validation runs gives you an unbiased estimate of the model performance. But this is a strange case indeed. It's much more comment to use k-fold cross validation on the entire dataset, and tune your algorithm. This means you lose the unbiased estimate of the model performance, but this is not always needed.






          share|improve this answer










          $endgroup$










          • 1




            $begingroup$
            Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
            $endgroup$
            – louic
            May 26 at 11:04










          • $begingroup$
            We could use Nested Cross Validation as an alternative to cross validation + test set
            $endgroup$
            – NaveganTeX
            May 26 at 11:17






          • 2




            $begingroup$
            Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
            $endgroup$
            – Paul
            May 26 at 11:23















          1
















          $begingroup$

          @louic's answer is correct: You split your data in two parts: training and test, and then you use k-fold cross-validation on the training dataset to tune the parameters. This is useful if you have little training data, because you don't have to exclude the validation data from the training dataset.



          But I find this comment confusing: "In some cases, k-fold cross-validation is used on the entire dataset ... if no parameter optimization is needed". It's correct that if you don't need any optimization of the model after running it for the first time, the performance on the validation data from your k-fold cross-validation runs gives you an unbiased estimate of the model performance. But this is a strange case indeed. It's much more comment to use k-fold cross validation on the entire dataset, and tune your algorithm. This means you lose the unbiased estimate of the model performance, but this is not always needed.






          share|improve this answer










          $endgroup$










          • 1




            $begingroup$
            Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
            $endgroup$
            – louic
            May 26 at 11:04










          • $begingroup$
            We could use Nested Cross Validation as an alternative to cross validation + test set
            $endgroup$
            – NaveganTeX
            May 26 at 11:17






          • 2




            $begingroup$
            Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
            $endgroup$
            – Paul
            May 26 at 11:23













          1














          1










          1







          $begingroup$

          @louic's answer is correct: You split your data in two parts: training and test, and then you use k-fold cross-validation on the training dataset to tune the parameters. This is useful if you have little training data, because you don't have to exclude the validation data from the training dataset.



          But I find this comment confusing: "In some cases, k-fold cross-validation is used on the entire dataset ... if no parameter optimization is needed". It's correct that if you don't need any optimization of the model after running it for the first time, the performance on the validation data from your k-fold cross-validation runs gives you an unbiased estimate of the model performance. But this is a strange case indeed. It's much more comment to use k-fold cross validation on the entire dataset, and tune your algorithm. This means you lose the unbiased estimate of the model performance, but this is not always needed.






          share|improve this answer










          $endgroup$



          @louic's answer is correct: You split your data in two parts: training and test, and then you use k-fold cross-validation on the training dataset to tune the parameters. This is useful if you have little training data, because you don't have to exclude the validation data from the training dataset.



          But I find this comment confusing: "In some cases, k-fold cross-validation is used on the entire dataset ... if no parameter optimization is needed". It's correct that if you don't need any optimization of the model after running it for the first time, the performance on the validation data from your k-fold cross-validation runs gives you an unbiased estimate of the model performance. But this is a strange case indeed. It's much more comment to use k-fold cross validation on the entire dataset, and tune your algorithm. This means you lose the unbiased estimate of the model performance, but this is not always needed.







          share|improve this answer













          share|improve this answer




          share|improve this answer










          answered May 26 at 10:20









          PaulPaul

          6733 silver badges10 bronze badges




          6733 silver badges10 bronze badges










          • 1




            $begingroup$
            Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
            $endgroup$
            – louic
            May 26 at 11:04










          • $begingroup$
            We could use Nested Cross Validation as an alternative to cross validation + test set
            $endgroup$
            – NaveganTeX
            May 26 at 11:17






          • 2




            $begingroup$
            Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
            $endgroup$
            – Paul
            May 26 at 11:23












          • 1




            $begingroup$
            Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
            $endgroup$
            – louic
            May 26 at 11:04










          • $begingroup$
            We could use Nested Cross Validation as an alternative to cross validation + test set
            $endgroup$
            – NaveganTeX
            May 26 at 11:17






          • 2




            $begingroup$
            Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
            $endgroup$
            – Paul
            May 26 at 11:23







          1




          1




          $begingroup$
          Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
          $endgroup$
          – louic
          May 26 at 11:04




          $begingroup$
          Agreed, thanks. I mentioned that second example because I wanted to make it clear that cross-validation is a method (as opposed to an application of a method). To me it seemed that OP had trouble distinguishing the method and its application.
          $endgroup$
          – louic
          May 26 at 11:04












          $begingroup$
          We could use Nested Cross Validation as an alternative to cross validation + test set
          $endgroup$
          – NaveganTeX
          May 26 at 11:17




          $begingroup$
          We could use Nested Cross Validation as an alternative to cross validation + test set
          $endgroup$
          – NaveganTeX
          May 26 at 11:17




          2




          2




          $begingroup$
          Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
          $endgroup$
          – Paul
          May 26 at 11:23




          $begingroup$
          Remember that the point of k-fold cross-validation is to be able to use a good amount of test or validation data, without reducing your training dataset too much. If you have sufficient data to have a good size training data set, plus reasonable amounts of test and validation data, then I would suggest not to bother with k-fold cross validation, let alone nested solutions.
          $endgroup$
          – Paul
          May 26 at 11:23











          0
















          $begingroup$

          Excellent question!




          I find this train/test/validation confusing (I've been doing ML for 5 years).



          Who says your image is correct? Let's go to an ML authority (Sk-Learn)



          In general, we do k-Fold on train/test (see Sk-Learn image below).




          Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...



          Good luck!



          enter image description here






          share|improve this answer










          $endgroup$










          • 1




            $begingroup$
            We will figure it out, someday!!!
            $endgroup$
            – NaveganTeX
            May 26 at 9:19















          0
















          $begingroup$

          Excellent question!




          I find this train/test/validation confusing (I've been doing ML for 5 years).



          Who says your image is correct? Let's go to an ML authority (Sk-Learn)



          In general, we do k-Fold on train/test (see Sk-Learn image below).




          Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...



          Good luck!



          enter image description here






          share|improve this answer










          $endgroup$










          • 1




            $begingroup$
            We will figure it out, someday!!!
            $endgroup$
            – NaveganTeX
            May 26 at 9:19













          0














          0










          0







          $begingroup$

          Excellent question!




          I find this train/test/validation confusing (I've been doing ML for 5 years).



          Who says your image is correct? Let's go to an ML authority (Sk-Learn)



          In general, we do k-Fold on train/test (see Sk-Learn image below).




          Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...



          Good luck!



          enter image description here






          share|improve this answer










          $endgroup$



          Excellent question!




          I find this train/test/validation confusing (I've been doing ML for 5 years).



          Who says your image is correct? Let's go to an ML authority (Sk-Learn)



          In general, we do k-Fold on train/test (see Sk-Learn image below).




          Technically, you could go one step further and do a Cross Validation on everything (train/test/validation). I've never done it though ...



          Good luck!



          enter image description here







          share|improve this answer













          share|improve this answer




          share|improve this answer










          answered May 26 at 9:09









          FrancoSwissFrancoSwiss

          4393 silver badges7 bronze badges




          4393 silver badges7 bronze badges










          • 1




            $begingroup$
            We will figure it out, someday!!!
            $endgroup$
            – NaveganTeX
            May 26 at 9:19












          • 1




            $begingroup$
            We will figure it out, someday!!!
            $endgroup$
            – NaveganTeX
            May 26 at 9:19







          1




          1




          $begingroup$
          We will figure it out, someday!!!
          $endgroup$
          – NaveganTeX
          May 26 at 9:19




          $begingroup$
          We will figure it out, someday!!!
          $endgroup$
          – NaveganTeX
          May 26 at 9:19











          0
















          $begingroup$

          Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.



          I have one further question relating to @louic answer above:



          "Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen."



          When we say "the model with the lowest error is chosen" which of the following does this mean:



          1. we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

          2. we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

          3. we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

          Thanks



          Iain






          share|improve this answer










          $endgroup$














          • $begingroup$
            I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
            $endgroup$
            – Solomonoff's Secret
            Sep 27 at 14:20















          0
















          $begingroup$

          Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.



          I have one further question relating to @louic answer above:



          "Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen."



          When we say "the model with the lowest error is chosen" which of the following does this mean:



          1. we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

          2. we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

          3. we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

          Thanks



          Iain






          share|improve this answer










          $endgroup$














          • $begingroup$
            I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
            $endgroup$
            – Solomonoff's Secret
            Sep 27 at 14:20













          0














          0










          0







          $begingroup$

          Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.



          I have one further question relating to @louic answer above:



          "Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen."



          When we say "the model with the lowest error is chosen" which of the following does this mean:



          1. we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

          2. we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

          3. we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

          Thanks



          Iain






          share|improve this answer










          $endgroup$



          Thanks for this post! I was having the same problem understanding K-Fold CV and this has really cleared things up.



          I have one further question relating to @louic answer above:



          "Training happens k times, each time leaving out a different part of the training set. Typically, the error of these k-models is averaged. This is done for each of the model parameters to be tested, and the model with the lowest error is chosen."



          When we say "the model with the lowest error is chosen" which of the following does this mean:



          1. we choose the parameters which has the lowest averaged error and then find the the model with the absolute lowest error from within the K models

          2. we choose the parameters which has the lowest averaged error and then train a new model - on the entire training set, using these parameters?

          3. we choose the parameters which has the lowest averaged error and then take the average weights of the K-models trained

          Thanks



          Iain







          share|improve this answer













          share|improve this answer




          share|improve this answer










          answered Sep 26 at 8:43









          Iain MacCormickIain MacCormick

          165 bronze badges




          165 bronze badges














          • $begingroup$
            I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
            $endgroup$
            – Solomonoff's Secret
            Sep 27 at 14:20
















          • $begingroup$
            I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
            $endgroup$
            – Solomonoff's Secret
            Sep 27 at 14:20















          $begingroup$
          I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
          $endgroup$
          – Solomonoff's Secret
          Sep 27 at 14:20




          $begingroup$
          I downvoted because this is a question, not an answer. I see you posted the question as a question here. I suggest you delete this answer. datascience.stackexchange.com/questions/60861/…
          $endgroup$
          – Solomonoff's Secret
          Sep 27 at 14:20


















          draft saved

          draft discarded















































          Thanks for contributing an answer to Data Science Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f52632%2fcross-validation-vs-train-validate-test%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown









          Popular posts from this blog

          Tamil (spriik) Luke uk diar | Nawigatjuun

          Align equal signs while including text over equalitiesAMS align: left aligned text/math plus multicolumn alignmentMultiple alignmentsAligning equations in multiple placesNumbering and aligning an equation with multiple columnsHow to align one equation with another multline equationUsing \ in environments inside the begintabularxNumber equations and preserving alignment of equal signsHow can I align equations to the left and to the right?Double equation alignment problem within align enviromentAligned within align: Why are they right-aligned?

          Training a classifier when some of the features are unknownWhy does Gradient Boosting regression predict negative values when there are no negative y-values in my training set?How to improve an existing (trained) classifier?What is effect when I set up some self defined predisctor variables?Why Matlab neural network classification returns decimal values on prediction dataset?Fitting and transforming text data in training, testing, and validation setsHow to quantify the performance of the classifier (multi-class SVM) using the test data?How do I control for some patients providing multiple samples in my training data?Training and Test setTraining a convolutional neural network for image denoising in MatlabShouldn't an autoencoder with #(neurons in hidden layer) = #(neurons in input layer) be “perfect”?