If I want an interpretable model, are there methods other than Linear Regression?











16















$begingroup$


I have encountered some statisticians who never use models other than linear regression for prediction, because they believe that "ML models" such as random forests or gradient boosting are hard to explain or "not interpretable".



In linear regression, provided the model assumptions hold (normality of errors, homoskedasticity, no multicollinearity), t-tests provide a way to test the significance of individual variables. To my knowledge, such tests are not available for random forests or gradient boosting models.



Therefore, my question is: if I want to model a dependent variable with a set of independent variables, should I always use linear regression for the sake of interpretability?
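To make the t-test mentioned above concrete, here is a minimal sketch for simple linear regression with one predictor. It computes the slope, its standard error, and the t statistic from the textbook formulas. The toy data and the function name `slope_t_statistic` are assumptions made for illustration only, and the result is only meaningful if the assumptions listed above hold.

```python
import math

def slope_t_statistic(x, y):
    """Return (slope, standard error, t statistic) for y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    se = math.sqrt(sigma2 / sxx)
    return slope, se, slope / se

# Hypothetical toy data, roughly y = 2x plus small noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
slope, se, t = slope_t_statistic(x, y)
print(f"slope={slope:.3f}, se={se:.4f}, t={t:.1f}")
```

The t statistic would then be compared against a t distribution with n - 2 degrees of freedom to obtain a p-value.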





















$endgroup$










  • 6




    $begingroup$
    Depends on what you still consider linear. Generalized linear models and generalized additive models still work on the basis of a linear component being estimated, but can model a wide variety of relationships.
    $endgroup$
    – Frans Rodenburg
    Sep 12 at 4:40






  • 2




    $begingroup$
    Also depends what you mean by interpretable. Various ways of 'peering into the black box' have been proposed for machine learning models, but may or may not be appropriate for your goals.
    $endgroup$
    – user20160
    Sep 12 at 5:11






  • 5




    $begingroup$
    I don't quite see what inferential statistics and t tests have to do with interpretability, which IMO mainly is about coefficient estimates.
    $endgroup$
    – S. Kolassa - Reinstate Monica
    Sep 12 at 6:52






  • 3




    $begingroup$
    @StephanKolassa "Interpretability" can also pertain to functional form. For example, the coefficient estimates produced by algorithmic fractional polynomial curve fitting in regression models (whether linear regression, GLM, or something else), while obtaining a nice fit, are almost certainly anti-intuitive: can you call to mind the array of shapes produced by models of the form $y_i = \beta_0 + \beta_1 x_i^{-3/5} + \beta_2 x_i^{1/3} + \beta_3 x_i^{3} + \varepsilon_i$, and therefore interpret the relationship between $y$ and $x$ implied by your coefficient estimates?
    $endgroup$
    – Alexis
    Sep 12 at 22:16







  • 2




    $begingroup$
    @UserX What you describe is still linear regression (i.e. is linear in the parameters). Contrast $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i$ with $y_i = \beta_0 + \beta_1 x_i + x_i^{\beta_2} + \varepsilon_i$: the former is a linear regression model, while the latter cannot be estimated using linear regression.
    $endgroup$
    – Alexis
    Sep 13 at 17:45
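The "linear in the parameters" point in the last comment can be illustrated with a small sketch. A quadratic model is fitted by ordinary least squares simply by adding $x^2$ as an extra column of the design matrix. The helper names, the noiseless toy data, and the coefficients 1, 2, 0.5 are assumptions chosen for illustration. The normal equations are solved directly here for brevity, which is not how a numerically careful library would do it.

```python
def solve_linear_system(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no zero pivots after pivoting).
    """
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - s) / m[i][i]
    return coef

def fit_least_squares(design, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Noiseless data from y = 1 + 2x + 0.5x^2 (hypothetical coefficients).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 0.5 * x ** 2 for x in xs]

# The model is nonlinear in x but linear in beta: x^2 is just another column.
design = [[1.0, x, x ** 2] for x in xs]
beta = fit_least_squares(design, ys)
print(beta)
```

The second model in the comment, with $x_i^{\beta_2}$, has no such design-matrix form, because the unknown parameter sits in the exponent. It would need an iterative nonlinear least squares routine instead.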

























regression machine-learning multiple-regression model-interpretation














edited Sep 14 at 12:08 by Tim

asked Sep 12 at 4:08 by Victor





















4 Answers


















28

















$begingroup$

It is hard for me to believe that you heard people say this, because it would be a dumb thing to say. It's like saying that you use only a hammer for everything (including drilling holes and changing lightbulbs) because it's straightforward to use and gives predictable results.



Second, linear regression is not always "interpretable". If you have a linear regression model with many polynomial terms, or just a lot of features, it will be hard to interpret. For example, say that you used the raw values of each of the 784 pixels from MNIST† as features. Would knowing that pixel 237 has a weight equal to -2311.67 tell you anything about the model? For image data, the activation maps of a convolutional neural network would be much easier to understand.



Finally, there are models that are equally interpretable, e.g. logistic regression, decision trees, naive Bayes algorithm, and many more.
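As one illustration of why a decision tree counts as interpretable, here is a minimal sketch of a one-split regression tree (a "stump"). The fitted model can be read off directly as an if/else rule. The function name `fit_stump` and the toy data are assumptions for illustration. Real tree implementations add multiple features, recursive splitting, and stopping rules.

```python
def fit_stump(xs, ys):
    """Find the single threshold on x that minimizes the squared error
    of a two-leaf tree; return (threshold, left_mean, right_mean)."""
    best = None
    candidates = sorted(set(xs))
    # Try the midpoint between each pair of neighbouring distinct x values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

# Hypothetical toy data with an obvious jump between x = 3 and x = 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 11.0, 10.5, 20.0, 21.0, 19.5]
threshold, left_mean, right_mean = fit_stump(xs, ys)
print(f"if x <= {threshold}: predict {left_mean:.2f}, else: predict {right_mean:.2f}")
```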



† - As noticed by Ingolifs in the comments, and as discussed in this thread, MNIST may not be the best example, since it is a very simple dataset. For most realistic image datasets, logistic regression would not work, and looking at the weights would not give any straightforward answers. If you look more closely at the weights in the linked thread, their interpretation is also not straightforward: for example, the weights for predicting "5" or "9" do not show any obvious pattern (see image below, copied from the other thread).




















$endgroup$










  • 2




    $begingroup$
    This answer I think does a good job showing how clearly logistic regression on MNIST can be explained.
    $endgroup$
    – Ingolifs
    Sep 12 at 23:08







  • 1




    $begingroup$
    @Ingolifs Agreed, but this is an activation map; you could do the same for a neural network.
    $endgroup$
    – Tim
    Sep 13 at 5:37










  • $begingroup$
    Regardless of what it's called, it gives a clear explanation of what the logistic regression is using to make its decisions in a way you don't really get for activation maps of neural networks.
    $endgroup$
    – Ingolifs
    Sep 13 at 9:40






  • 1




    $begingroup$
    @Ingolifs MNIST is maybe not the best example because it's very simple, but the point is that you'd use the same method for a neural network.
    $endgroup$
    – Tim
    Sep 13 at 11:59


















11

















$begingroup$

A decision tree would be another choice, or lasso regression, which produces a sparse model.



See this figure from the book An Introduction to Statistical Learning:
http://www.sr-sv.com/wp-content/uploads/2015/09/STAT01.png
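To show what "sparse" means for the lasso, here is a minimal sketch of the special case with an orthonormal design matrix. In that case the lasso solution for the objective $\tfrac{1}{2}\lVert y - X\beta\rVert^2 + \lambda\lVert\beta\rVert_1$ is the soft-thresholded least squares estimate. The coefficient values and the penalty `lam` are hypothetical. For a general design matrix an iterative solver such as coordinate descent would be needed instead.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam, and set it to exactly zero if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

# Hypothetical least squares coefficients for five features.
ols_coefficients = [3.0, -0.4, 0.1, -2.5, 0.3]
lam = 0.5  # hypothetical penalty strength

lasso_coefficients = [soft_threshold(b, lam) for b in ols_coefficients]
print(lasso_coefficients)
```

Small coefficients are set to exactly zero, so the fitted model only mentions the remaining features, which is what makes it easier to read.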


















$endgroup$














  • $begingroup$
    What is the "ISL" book?
    $endgroup$
    – donlan
    Sep 13 at 16:00






  • 1




    $begingroup$
    @donlan amazon.com/… Thanks for the suggestion; answer revised.
    $endgroup$
    – Haitao Du
    Sep 13 at 17:13










  • $begingroup$
    no problem! was going to look it up after reading this thread
    $endgroup$
    – donlan
    Sep 13 at 18:18


















6

















$begingroup$

No, that is needlessly restrictive. There is a large range of interpretable models, including not just (as Frans Rodenburg says) linear models, generalized linear models and generalized additive models, but also machine learning methods used for regression, among which I include random forests, gradient boosted machines, neural networks, and more. Just because machine learning models do not give you coefficients like those from linear regression does not mean that their workings cannot be understood. It just takes a bit more work.



To understand why, I'd recommend reading this question: "Obtaining knowledge from a random forest". It shows how you can approach making almost any machine learning model interpretable.
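One commonly used model-agnostic approach is permutation importance: shuffle one feature at a time and measure how much the prediction error grows. The sketch below is an illustration under stated assumptions and is not necessarily the method discussed in the linked question. `black_box_model` is a hypothetical stand-in for any fitted model's prediction function, and the data are simulated.

```python
import random

def black_box_model(row):
    """Stand-in for any fitted model's predict function (hypothetical)."""
    x0, x1, x2 = row
    return 3.0 * x0 + x1 ** 2  # deliberately ignores x2

def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, feature, rng):
    """Increase in MSE after shuffling one feature column across rows."""
    baseline = mean_squared_error(model, rows, targets)
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:feature] + (v,) + r[feature + 1:] for r, v in zip(rows, shuffled)]
    return mean_squared_error(model, permuted, targets) - baseline

rng = random.Random(0)  # fixed seed so the run is reproducible
rows = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
targets = [black_box_model(r) for r in rows]  # simulated targets, no noise

importances = [permutation_importance(black_box_model, rows, targets, j, rng)
               for j in range(3)]
print(importances)
```

A feature the model does not use gets an importance of zero, while features the model relies on heavily get large values. No coefficients are needed to read this off.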
















$endgroup$






















    6

















    $begingroup$

    I would agree with Tim's and mkt's answers: ML models are not necessarily uninterpretable. I would direct you to the DALEX (Descriptive mAchine Learning EXplanations) R package, which is devoted to making ML models interpretable.
















    $endgroup$














    • $begingroup$
      The DALEX package is indeed very interesting. Do you know if something similar exists for Python?
      $endgroup$
      – Victor
      Sep 15 at 4:36










    • $begingroup$
      @Victor I don't know of a Python version of DALEX, but you could try calling R from Python using rpy2.readthedocs.io/en/version_2.8.x/introduction.html for example.
      $endgroup$
      – babelproofreader
      Sep 16 at 9:53












    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "65"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );














    draft saved

    draft discarded
















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f426900%2fif-i-want-an-interpretable-model-are-there-methods-other-than-linear-regression%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown


























    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    28

















    $begingroup$

    It is hard for me to believe that you heard people saying this, because it would be a dumb thing to say. It's like saying that you use only the hammer (including drilling holes and for changing the lightbulbs), because it's straightforward to use and gives predictable results.



    Second, linear regression is not always "interpretable". If you have linear regression model with many polynomial terms, or just a lot of features, it would be hard to interpret. For example, say that you used the raw values of each of the 784 pixels from MNIST† as features. Would knowing that pixel 237 has weight equal to -2311.67 tell you anything about the model? For image data, looking at activation maps of the convolutional neural network would be much easier to understand.



    Finally, there are models that are equally interpretable, e.g. logistic regression, decision trees, naive Bayes algorithm, and many more.



    † - As noticed by Ingolifs in the comment, and as discussed in this thread, MNIST may be not the best example, since this is a very simple dataset. For most of the realistic image datasets, logistic regression would not work and looking at the weights would not give any straightforward answers. If you look closer at the weights in the linked thread, then their interpretation is also not straightforward, for example weights for predicting "5" or "9" do not show any obvious pattern (see image below, copied from the other thread).








    share|cite|improve this answer












    $endgroup$










    • 2




      $begingroup$
      This answer I think does a good job showing how clearly logistic regression on MNIST can be explained.
      $endgroup$
      – Ingolifs
      Sep 12 at 23:08







    • 1




      $begingroup$
      @Ingolifs agree, but this is an activation map, you could do the same for neural network.
      $endgroup$
      – Tim
      Sep 13 at 5:37










    • $begingroup$
      Regardless of what it's called, it gives a clear explanation of what the logistic regression is using to make its decisions in a way you don't really get for activation maps of neural networks.
      $endgroup$
      – Ingolifs
      Sep 13 at 9:40






    • 1




      $begingroup$
      @Ingolifs MNIST is maybe not the best example because it's very simple, but the point is that you'd use same method for neural network.
      $endgroup$
      – Tim
      Sep 13 at 11:59















    28

















    $begingroup$

    It is hard for me to believe that you heard people saying this, because it would be a dumb thing to say. It's like saying that you use only the hammer (including drilling holes and for changing the lightbulbs), because it's straightforward to use and gives predictable results.



    Second, linear regression is not always "interpretable". If you have linear regression model with many polynomial terms, or just a lot of features, it would be hard to interpret. For example, say that you used the raw values of each of the 784 pixels from MNIST† as features. Would knowing that pixel 237 has weight equal to -2311.67 tell you anything about the model? For image data, looking at activation maps of the convolutional neural network would be much easier to understand.



    Finally, there are models that are equally interpretable, e.g. logistic regression, decision trees, naive Bayes algorithm, and many more.



    † - As noticed by Ingolifs in the comment, and as discussed in this thread, MNIST may be not the best example, since this is a very simple dataset. For most of the realistic image datasets, logistic regression would not work and looking at the weights would not give any straightforward answers. If you look closer at the weights in the linked thread, then their interpretation is also not straightforward, for example weights for predicting "5" or "9" do not show any obvious pattern (see image below, copied from the other thread).








    share|cite|improve this answer












    $endgroup$










    • 2




      $begingroup$
      This answer I think does a good job showing how clearly logistic regression on MNIST can be explained.
      $endgroup$
      – Ingolifs
      Sep 12 at 23:08







    • 1




      $begingroup$
      @Ingolifs agree, but this is an activation map, you could do the same for neural network.
      $endgroup$
      – Tim
      Sep 13 at 5:37










    • $begingroup$
      Regardless of what it's called, it gives a clear explanation of what the logistic regression is using to make its decisions in a way you don't really get for activation maps of neural networks.
      $endgroup$
      – Ingolifs
      Sep 13 at 9:40






    • 1




      $begingroup$
      @Ingolifs MNIST is maybe not the best example because it's very simple, but the point is that you'd use same method for neural network.
      $endgroup$
      – Tim
      Sep 13 at 11:59













    28















    28











    28







    $begingroup$

    It is hard for me to believe that you heard people saying this, because it would be a dumb thing to say. It's like saying that you use only the hammer (including drilling holes and for changing the lightbulbs), because it's straightforward to use and gives predictable results.



    Second, linear regression is not always "interpretable". If you have linear regression model with many polynomial terms, or just a lot of features, it would be hard to interpret. For example, say that you used the raw values of each of the 784 pixels from MNIST† as features. Would knowing that pixel 237 has weight equal to -2311.67 tell you anything about the model? For image data, looking at activation maps of the convolutional neural network would be much easier to understand.



    Finally, there are models that are equally interpretable, e.g. logistic regression, decision trees, naive Bayes algorithm, and many more.



    † - As noticed by Ingolifs in the comment, and as discussed in this thread, MNIST may be not the best example, since this is a very simple dataset. For most of the realistic image datasets, logistic regression would not work and looking at the weights would not give any straightforward answers. If you look closer at the weights in the linked thread, then their interpretation is also not straightforward, for example weights for predicting "5" or "9" do not show any obvious pattern (see image below, copied from the other thread).








    share|cite|improve this answer












    $endgroup$



    It is hard for me to believe that you heard people saying this, because it would be a dumb thing to say. It's like saying that you use only the hammer (including drilling holes and for changing the lightbulbs), because it's straightforward to use and gives predictable results.



    Second, linear regression is not always "interpretable". If you have linear regression model with many polynomial terms, or just a lot of features, it would be hard to interpret. For example, say that you used the raw values of each of the 784 pixels from MNIST† as features. Would knowing that pixel 237 has weight equal to -2311.67 tell you anything about the model? For image data, looking at activation maps of the convolutional neural network would be much easier to understand.



    Finally, there are models that are equally interpretable, e.g. logistic regression, decision trees, naive Bayes algorithm, and many more.



    † - As noticed by Ingolifs in the comment, and as discussed in this thread, MNIST may be not the best example, since this is a very simple dataset. For most of the realistic image datasets, logistic regression would not work and looking at the weights would not give any straightforward answers. If you look closer at the weights in the linked thread, then their interpretation is also not straightforward, for example weights for predicting "5" or "9" do not show any obvious pattern (see image below, copied from the other thread).









    answered Sep 12 at 5:40 (edited Sep 14 at 12:08) by Tim










    • This answer, I think, does a good job of showing how clearly logistic regression on MNIST can be explained. – Ingolifs, Sep 12 at 23:08
    • @Ingolifs Agreed, but this is an activation map; you could do the same for a neural network. – Tim, Sep 13 at 5:37
    • Regardless of what it's called, it gives a clear explanation of what the logistic regression is using to make its decisions, in a way you don't really get from activation maps of neural networks. – Ingolifs, Sep 13 at 9:40
    • @Ingolifs MNIST is maybe not the best example because it's very simple, but the point is that you'd use the same method for a neural network. – Tim, Sep 13 at 11:59




























    A decision tree would be another choice, or lasso regression to produce a sparse model.

    See this figure from the book An Introduction to Statistical Learning:
    http://www.sr-sv.com/wp-content/uploads/2015/09/STAT01.png
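To show what "sparse" means for the lasso, here is a rough from-scratch sketch using coordinate descent with soft-thresholding. It is only an illustration under my own assumptions: simulated data in which only the first of three features matters, no intercept, and an arbitrary penalty `alpha = 0.5`. In practice you would use a library implementation.

```python
import random


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=50):
    """Minimise (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Average product of feature j with the residual that leaves feature j out.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, alpha) / z
    return b


# Simulated data (an assumption for the example): only feature 0 drives y.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * row[0] + rng.gauss(0, 0.1) for row in X]

coefs = lasso_coordinate_descent(X, y, alpha=0.5)
print(coefs)
```

The irrelevant features get coefficients of exactly zero rather than merely small ones, so the fitted model only mentions the features it actually uses. The surviving coefficient is shrunk below its true value of 3, which is the price of the penalty.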







    answered Sep 12 at 5:20 (edited Sep 13 at 17:14) by Haitao Du














    • What is the "ISL" book? – donlan, Sep 13 at 16:00
    • @donlan amazon.com/… Thanks for the suggestion; answer revised. – Haitao Du, Sep 13 at 17:13
    • No problem! I was going to look it up after reading this thread. – donlan, Sep 13 at 18:18






























        No, that is needlessly restrictive. There is a large range of interpretable models, including not just (as Frans Rodenburg says) linear models, generalized linear models and generalized additive models, but also machine learning methods used for regression. I include random forests, gradient boosted machines, neural networks, and more. Just because you don't get coefficients out of machine learning models that are similar to those from linear regressions does not mean that their workings cannot be understood. It just takes a bit more work.

        To understand why, I'd recommend reading this question: Obtaining knowledge from a random forest. What it shows is how you can approach making almost any machine learning model interpretable.
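As a rough idea of what "a bit more work" can look like, one common model-agnostic technique is permutation importance (this is not necessarily the approach taken in the linked question). You shuffle one feature at a time and see how much the model's error grows. In the sketch below, `black_box` is a hypothetical stand-in for any fitted model's prediction function, and the data are simulated.

```python
import random


def black_box(row):
    """Hypothetical stand-in for a fitted model's predict function; it ignores x2."""
    x0, x1, x2 = row
    return 4.0 * x0 + x1 * x1


def mse(model, X, y):
    """Mean squared error of the model's predictions."""
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, rng):
    """Increase in MSE when each feature column is shuffled, one at a time."""
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        importances.append(mse(model, X_perm, y) - baseline)
    return importances


# Simulated data (an assumption for the example); targets come from the model itself.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
y = [black_box(row) for row in X]

importances = permutation_importance(black_box, X, y, rng)
print(importances)
```

Nothing here looks inside the model, so the same function works for a random forest, a boosted ensemble, or a neural network, as long as you can call it to get predictions.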







        answered Sep 12 at 5:11 by mkt - Reinstate Monica



























            I would agree with Tim's and mkt's answers: ML models are not necessarily uninterpretable. I would direct you to the DALEX (Descriptive mAchine Learning EXplanations) R package, which is devoted to making ML models interpretable.
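To give a flavour of the kind of model-agnostic explanation such tools produce, here is a from-scratch sketch of a partial dependence profile. To be clear, this is not DALEX's API, just plain Python written for illustration. The `model` function and the tiny dataset are hypothetical. The idea is to fix one feature at each grid value for every row, predict, and average.

```python
def model(row):
    """Hypothetical black-box prediction function (stand-in for any fitted model)."""
    return row[0] ** 2 + 3.0 * row[1]


def partial_dependence(model, X, feature, grid):
    """Average prediction when `feature` is set to each grid value in every row."""
    profile = []
    for value in grid:
        preds = [model(row[:feature] + [value] + row[feature + 1:]) for row in X]
        profile.append(sum(preds) / len(preds))
    return profile


# Tiny made-up dataset with two features.
X = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
grid = [0.0, 1.0, 2.0]

profile = partial_dependence(model, X, feature=0, grid=grid)
print(list(zip(grid, profile)))
```

Plotting the profile against the grid shows how the prediction responds to that feature on average, which is roughly the role a coefficient plays in a linear model.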







            answered Sep 12 at 8:09 by babelproofreader














            • The DALEX package is indeed very interesting. Do you know if something similar exists for Python? – Victor, Sep 15 at 4:36
            • @Victor I don't know of a Python version of DALEX, but you could try calling R from Python using rpy2, for example (rpy2.readthedocs.io/en/version_2.8.x/introduction.html). – babelproofreader, Sep 16 at 9:53

































