Why would they pick a gamma distribution here?
In one of the exercises for my course, we're using a Kaggle medical dataset.
The exercise says:
we want to model the distribution of individual charges and we also really want to be able to capture our uncertainty about that distribution so we can better capture the range of values we might see. Loading the data and performing an initial view:

We may suspect from the above that there is some sort of exponential-like distribution at play here. ... The insurance claim charges may possibly be multimodal. The gamma distribution may be applicable and we could test this for the distribution of charges that weren't insurance claims first.
I looked up "Gamma distribution" and found "a continuous, positive-only, unimodal distribution that encodes the time required for «alpha» events to occur in a Poisson process with mean arrival time of «beta»".
There's no time involved here, just unrelated charges, either insured or not.
Why would they choose a gamma distribution?
Tags: gamma-distribution
asked Sep 29 at 21:53 by Vicki B
1 Answer
When you're considering simple parametric models for the conditional distribution of data (i.e., the distribution of each group, or the expected distribution for each combination of predictor variables), and you are dealing with a positive continuous response, the two common choices are Gamma and log-Normal. Besides matching the domain of the data (real numbers greater than zero), these distributions are computationally convenient and often make mechanistic sense.
- The log-Normal distribution is easily derived by exponentiating a Normal distribution (conversely, log-transforming log-Normal deviates gives Normal deviates). From a mechanistic point of view, the log-Normal arises via the Central Limit Theorem when each observation reflects the product of a large number of iid positive random variables. Once you've log-transformed the data, you have access to a huge variety of computational and analytical tools (e.g., anything assuming Normality or using least-squares methods).
- As your question points out, one way that a Gamma distribution arises is as the distribution of the waiting time until $n$ events have occurred, when the events occur independently with a constant mean waiting time $\lambda$ between them. I can't easily find a reference for a mechanistic model of Gamma distributions of insurance claims, but it also makes sense to use a Gamma distribution from a phenomenological (i.e., data description/computational convenience) point of view. The Gamma distribution belongs to the exponential family in the form required by generalized linear models (as does the Normal, but not the log-Normal), which means that all of the machinery of generalized linear models is available; it also has a particularly convenient form for analysis.
There are other reasons one might pick one or the other - for example, the "heaviness" of the tail of the distribution (the log-Normal has the heavier tail), which might be important in predicting the frequency of extreme events. There are plenty of other positive, continuous distributions (e.g., see this list), but they tend to be used in more specialized applications.
Very few of these distributions will capture the multimodality you see in the marginal distributions above, but multimodality may be explained by the data being grouped into categories described by observed categorical predictors. If there are no observable predictors that explain the multimodality, one might choose to fit a finite mixture model based on a mixture of a (small, discrete) number of positive continuous distributions.
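The two mechanistic stories in the answer can be checked with a small simulation. This is only a sketch using Python's standard library: the function names, the uniform factors on [0.9, 1.1], and the parameter values (5 events, mean gap 2.0, 100 factors) are illustrative assumptions, not anything taken from the exercise or the Kaggle data. It sums exponential gaps to get Gamma-like waiting times, and multiplies many positive factors to get log-Normal-like values.

```python
import math
import random
import statistics


def waiting_time(n_events, mean_wait, rng):
    """Time until n_events have occurred, with iid exponential gaps of mean mean_wait."""
    return sum(rng.expovariate(1.0 / mean_wait) for _ in range(n_events))


def product_of_factors(n_factors, rng):
    """Product of n_factors iid positive factors (assumed uniform on [0.9, 1.1])."""
    return math.prod(rng.uniform(0.9, 1.1) for _ in range(n_factors))


def skewness(xs):
    """Sample skewness; close to 0 for roughly symmetric (e.g. Normal) data."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


rng = random.Random(0)

# Gamma mechanism: sums of exponential gaps.
n_events, mean_wait = 5, 2.0
waits = [waiting_time(n_events, mean_wait, rng) for _ in range(20000)]
# A Gamma with shape n and scale mean_wait has mean n*scale and variance n*scale**2.
print("mean:    ", statistics.fmean(waits), "theory:", n_events * mean_wait)
print("variance:", statistics.pvariance(waits), "theory:", n_events * mean_wait**2)

# log-Normal mechanism: products of many positive factors.
products = [product_of_factors(100, rng) for _ in range(4000)]
log_products = [math.log(p) for p in products]
# The products are right-skewed; their logs are sums, so they look roughly Normal.
print("skewness of products:    ", skewness(products))
print("skewness of log-products:", skewness(log_products))
```

The simulated waiting times should have mean and variance close to the Gamma values, and the log of the products should be close to symmetric even though the products themselves are right-skewed.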
answered Sep 29 at 23:46 by Ben Bolker (edited Sep 30 at 0:51)
Comments:
carlo (Sep 30 at 15:18): Also worth noting that gamma and lognormal models almost always give very similar results.
Weiwen Ng (Sep 30 at 19:35): I work in health services research. I can confirm that in general, a gamma or lognormal distribution would be an appropriate choice for a model of healthcare spending or claim amounts. The gamma distribution can be used in time-to-event models, but those aren't applicable here.
Vicki B (Oct 1 at 0:07): Thanks!! This was very helpful.
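The remark in the comments that gamma and lognormal models often give similar results can be illustrated with a rough sketch. Everything here is an assumption made for illustration: `charges` is simulated from a Gamma (shape 2, scale 5000) as a stand-in for real claim amounts, the helper names are made up, and the Gamma is fitted by the method of moments rather than by maximum likelihood, because the moment estimates have a closed form. The log-Normal fit is the usual closed-form MLE on the log scale.

```python
import math
import random
import statistics


def fit_lognormal(xs):
    """MLE for the log-Normal: mean and population std of the log-values."""
    logs = [math.log(x) for x in xs]
    return statistics.fmean(logs), statistics.pstdev(logs)


def fit_gamma_moments(xs):
    """Method-of-moments Gamma fit (not the MLE): shape = m^2/v, scale = v/m."""
    m = statistics.fmean(xs)
    v = statistics.pvariance(xs)
    return m * m / v, v / m


def lognormal_loglik(xs, mu, sigma):
    """Log-likelihood of xs under a log-Normal(mu, sigma)."""
    return sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma**2)
        for x in xs
    )


def gamma_loglik(xs, shape, scale):
    """Log-likelihood of xs under a Gamma(shape, scale)."""
    return sum(
        (shape - 1) * math.log(x) - x / scale
        - math.lgamma(shape) - shape * math.log(scale)
        for x in xs
    )


rng = random.Random(1)
# Simulated stand-in for positive, right-skewed charges (assumed values).
charges = [rng.gammavariate(2.0, 5000.0) for _ in range(5000)]

mu, sigma = fit_lognormal(charges)
shape, scale = fit_gamma_moments(charges)

lognormal_mean = math.exp(mu + sigma**2 / 2)  # mean implied by the log-Normal fit
gamma_mean = shape * scale                    # mean implied by the Gamma fit

print("fitted mean (log-Normal):", lognormal_mean)
print("fitted mean (Gamma):     ", gamma_mean)
print("log-lik (log-Normal):", lognormal_loglik(charges, mu, sigma))
print("log-lik (Gamma):     ", gamma_loglik(charges, shape, scale))
```

On data like this the two fitted means should be fairly close, while the log-likelihoods differ because the data were generated from a Gamma. With real claims data either model could come out ahead, and the fits diverge most in the upper tail.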