Why do we not always use the closed testing principle for multiple comparisons?Multiple hypothesis testing and F tests?What does independence between comparisons in multiple comparisons mean?Why aren't multiple hypothesis corrections applied to all experiments since the dawn of time?FWER, FDR and multiple comparisons for BeginnersPlease convince me why I should bother correcting p values with multiple comparisons, because I don't see the pointAfter discovering alpha error inflation, how to estimate severity of the effect?
Propagator of a real scalar field does not give an unambiguous result
Why do Russian names transliterated into English have unpronounceable 'k's before 'h's (e.g. 'Mikhail' instead of just 'Mihail')?
How to pay less tax on a high salary?
Why do amateur radio operators call an RF choke a balun?
How can resurrecting a person make them evil?
Found that newly hired employee wasn't completely frank. What would be good to do in that situation?
Are soldered electrical connections code-compliant?
Is a triangle waveform a type of pulse width modulation?
"Don't invest now because the market is high"
Feeling of forcing oneself to do something
Did the computer mouse always output relative x/y and not absolute?
Mostly One Way Travel : Says Grandpa
Intersection of sorted lists
If I buy a super off-peak ticket, can I travel on a off-peak train and pay the difference?
How can I convince my department that I have the academic freedom to select textbooks and use multiple-choice tests in my courses?
Thoughts on using user stories to define business/platform needs?
How can I get a ride in the jump seat as a non-professional pilot?
Keep password in macro?
Whats wrong with this model theoretic proof of the twin primes conjecture?
Clonezilla made a smaller image than actual drive size
Brake disc and pads corrosion, do they need replacement?
Why should I use an ~/extensions directory rather than the default ~/ext for new extensions?
How do electric hot water heaters explode and what can be done to prevent that from happening?
Why do cargo airlines frequently choose passenger aircraft rather than aircraft designed specifically for cargo?
Why do we not always use the closed testing principle for multiple comparisons?
Multiple hypothesis testing and F tests?What does independence between comparisons in multiple comparisons mean?Why aren't multiple hypothesis corrections applied to all experiments since the dawn of time?FWER, FDR and multiple comparisons for BeginnersPlease convince me why I should bother correcting p values with multiple comparisons, because I don't see the pointAfter discovering alpha error inflation, how to estimate severity of the effect?
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;
$begingroup$
From the Wikipedia entry:
Suppose there are k hypotheses $H_1,..., H_k$ to be tested and the overall
type I error rate is $alpha$. The closed testing principle allows the
rejection of any one of these elementary hypotheses, say $H_i$, if all
possible intersection hypotheses involving $H_i$ can be rejected by using
valid local level $alpha$ tests. It controls the familywise error rate for
all the k hypotheses at level α in the strong sense.
By using the closed testing principle, we ensure all the false positive rate of all hypothesis tests in a family will be controlled at $alpha$. This seems too good to be true. Why do we bother with Bonferonni or other corrections for multiple comparisons, when we can just test all the intersection hypotheses? The only downside seems to be that you need to conduct $2^n$ tests.
EDIT: I am thinking about a relatively small number of individual hypothesis tests, say $n = 3 textor 5$. Then you would have a reasonable number of tests.
hypothesis-testing statistical-significance p-value multiple-comparisons
$endgroup$
|
show 1 more comment
$begingroup$
From the Wikipedia entry:
Suppose there are k hypotheses $H_1,..., H_k$ to be tested and the overall
type I error rate is $alpha$. The closed testing principle allows the
rejection of any one of these elementary hypotheses, say $H_i$, if all
possible intersection hypotheses involving $H_i$ can be rejected by using
valid local level $alpha$ tests. It controls the familywise error rate for
all the k hypotheses at level α in the strong sense.
By using the closed testing principle, we ensure all the false positive rate of all hypothesis tests in a family will be controlled at $alpha$. This seems too good to be true. Why do we bother with Bonferonni or other corrections for multiple comparisons, when we can just test all the intersection hypotheses? The only downside seems to be that you need to conduct $2^n$ tests.
EDIT: I am thinking about a relatively small number of individual hypothesis tests, say $n = 3 textor 5$. Then you would have a reasonable number of tests.
hypothesis-testing statistical-significance p-value multiple-comparisons
$endgroup$
2
$begingroup$
Your "only" downside is a huge one! It limits practical procedures to $nll 30$ or so, and in cases where the tests are onerous to compute (think bootstrapping, for instance) to $nll 10.$
$endgroup$
– whuber♦
Sep 27 at 13:33
$begingroup$
@whuber Not really true, is it? The closed testing principle is used all the time.
$endgroup$
– Björn
Sep 27 at 13:39
$begingroup$
@whuber, I had in mind a relatively small number of hypothesis tests. For example, I'm curious why didn't I learn this for ANOVA with two treatments and an interaction.
$endgroup$
– EliK
Sep 27 at 13:41
$begingroup$
I wonder if this sucks power from the test. Sure, you control $alpha$, but at what cost? If even the Bonferroni correction is more powerful, then I think that this would be an inadmissable testing procedure. (Confession: I don't know this to be the case, and it might be totaly wrong.)
$endgroup$
– Dave
Sep 27 at 13:53
$begingroup$
@Dave I don't think Bonferroni is more powerful, but I'm not sure.
$endgroup$
– EliK
Sep 27 at 14:12
|
show 1 more comment
$begingroup$
From the Wikipedia entry:
Suppose there are k hypotheses $H_1,..., H_k$ to be tested and the overall
type I error rate is $alpha$. The closed testing principle allows the
rejection of any one of these elementary hypotheses, say $H_i$, if all
possible intersection hypotheses involving $H_i$ can be rejected by using
valid local level $alpha$ tests. It controls the familywise error rate for
all the k hypotheses at level α in the strong sense.
By using the closed testing principle, we ensure all the false positive rate of all hypothesis tests in a family will be controlled at $alpha$. This seems too good to be true. Why do we bother with Bonferonni or other corrections for multiple comparisons, when we can just test all the intersection hypotheses? The only downside seems to be that you need to conduct $2^n$ tests.
EDIT: I am thinking about a relatively small number of individual hypothesis tests, say $n = 3 textor 5$. Then you would have a reasonable number of tests.
hypothesis-testing statistical-significance p-value multiple-comparisons
$endgroup$
From the Wikipedia entry:
Suppose there are k hypotheses $H_1,..., H_k$ to be tested and the overall
type I error rate is $alpha$. The closed testing principle allows the
rejection of any one of these elementary hypotheses, say $H_i$, if all
possible intersection hypotheses involving $H_i$ can be rejected by using
valid local level $alpha$ tests. It controls the familywise error rate for
all the k hypotheses at level α in the strong sense.
By using the closed testing principle, we ensure all the false positive rate of all hypothesis tests in a family will be controlled at $alpha$. This seems too good to be true. Why do we bother with Bonferonni or other corrections for multiple comparisons, when we can just test all the intersection hypotheses? The only downside seems to be that you need to conduct $2^n$ tests.
EDIT: I am thinking about a relatively small number of individual hypothesis tests, say $n = 3 textor 5$. Then you would have a reasonable number of tests.
hypothesis-testing statistical-significance p-value multiple-comparisons
hypothesis-testing statistical-significance p-value multiple-comparisons
edited Sep 27 at 13:40
EliK
asked Sep 27 at 13:29
EliKEliK
6573 silver badges16 bronze badges
6573 silver badges16 bronze badges
2
$begingroup$
Your "only" downside is a huge one! It limits practical procedures to $nll 30$ or so, and in cases where the tests are onerous to compute (think bootstrapping, for instance) to $nll 10.$
$endgroup$
– whuber♦
Sep 27 at 13:33
$begingroup$
@whuber Not really true, is it? The closed testing principle is used all the time.
$endgroup$
– Björn
Sep 27 at 13:39
$begingroup$
@whuber, I had in mind a relatively small number of hypothesis tests. For example, I'm curious why didn't I learn this for ANOVA with two treatments and an interaction.
$endgroup$
– EliK
Sep 27 at 13:41
$begingroup$
I wonder if this sucks power from the test. Sure, you control $alpha$, but at what cost? If even the Bonferroni correction is more powerful, then I think that this would be an inadmissable testing procedure. (Confession: I don't know this to be the case, and it might be totaly wrong.)
$endgroup$
– Dave
Sep 27 at 13:53
$begingroup$
@Dave I don't think Bonferroni is more powerful, but I'm not sure.
$endgroup$
– EliK
Sep 27 at 14:12
|
show 1 more comment
2
$begingroup$
Your "only" downside is a huge one! It limits practical procedures to $nll 30$ or so, and in cases where the tests are onerous to compute (think bootstrapping, for instance) to $nll 10.$
$endgroup$
– whuber♦
Sep 27 at 13:33
$begingroup$
@whuber Not really true, is it? The closed testing principle is used all the time.
$endgroup$
– Björn
Sep 27 at 13:39
$begingroup$
@whuber, I had in mind a relatively small number of hypothesis tests. For example, I'm curious why didn't I learn this for ANOVA with two treatments and an interaction.
$endgroup$
– EliK
Sep 27 at 13:41
$begingroup$
I wonder if this sucks power from the test. Sure, you control $alpha$, but at what cost? If even the Bonferroni correction is more powerful, then I think that this would be an inadmissable testing procedure. (Confession: I don't know this to be the case, and it might be totaly wrong.)
$endgroup$
– Dave
Sep 27 at 13:53
$begingroup$
@Dave I don't think Bonferroni is more powerful, but I'm not sure.
$endgroup$
– EliK
Sep 27 at 14:12
2
2
$begingroup$
Your "only" downside is a huge one! It limits practical procedures to $nll 30$ or so, and in cases where the tests are onerous to compute (think bootstrapping, for instance) to $nll 10.$
$endgroup$
– whuber♦
Sep 27 at 13:33
$begingroup$
Your "only" downside is a huge one! It limits practical procedures to $nll 30$ or so, and in cases where the tests are onerous to compute (think bootstrapping, for instance) to $nll 10.$
$endgroup$
– whuber♦
Sep 27 at 13:33
$begingroup$
@whuber Not really true, is it? The closed testing principle is used all the time.
$endgroup$
– Björn
Sep 27 at 13:39
$begingroup$
@whuber Not really true, is it? The closed testing principle is used all the time.
$endgroup$
– Björn
Sep 27 at 13:39
$begingroup$
@whuber, I had in mind a relatively small number of hypothesis tests. For example, I'm curious why didn't I learn this for ANOVA with two treatments and an interaction.
$endgroup$
– EliK
Sep 27 at 13:41
$begingroup$
@whuber, I had in mind a relatively small number of hypothesis tests. For example, I'm curious why didn't I learn this for ANOVA with two treatments and an interaction.
$endgroup$
– EliK
Sep 27 at 13:41
$begingroup$
I wonder if this sucks power from the test. Sure, you control $alpha$, but at what cost? If even the Bonferroni correction is more powerful, then I think that this would be an inadmissable testing procedure. (Confession: I don't know this to be the case, and it might be totaly wrong.)
$endgroup$
– Dave
Sep 27 at 13:53
$begingroup$
I wonder if this sucks power from the test. Sure, you control $alpha$, but at what cost? If even the Bonferroni correction is more powerful, then I think that this would be an inadmissable testing procedure. (Confession: I don't know this to be the case, and it might be totaly wrong.)
$endgroup$
– Dave
Sep 27 at 13:53
$begingroup$
@Dave I don't think Bonferroni is more powerful, but I'm not sure.
$endgroup$
– EliK
Sep 27 at 14:12
$begingroup$
@Dave I don't think Bonferroni is more powerful, but I'm not sure.
$endgroup$
– EliK
Sep 27 at 14:12
|
show 1 more comment
1 Answer
1
active
oldest
votes
$begingroup$
It is actually used quite often (in fact, Bonferroni and Bonferroni-Holm can be shown to be valid tests using the closed testing principle - see below). Part of the reason why some simple procedures like Bonferroni are still so popular is of course, that they are very easy to implement and it is easy to communicate what you did.
E.g. even for the simple Bonferroni or Bonferroni-Holm writing down the tests for the intersection null hypotheses is a bit tedious. That's why the graphical approach to constructing closed testing procedures is very popular - at least in the clinical trials setting (where there is usually a trained statistician at hand that can help pre-specify a tailored testing procedure for a specific trial). This graphical approach provides a really easy way to construct valid testing procedures and makes communication easier. Various variants, extensions and improvements such as exploiting correlations, prioritization of certain hypotheses depending on which hypotheses are already rejected etc. have been proposed over the years.
Bonferroni vs. Bonferroni-Holm as closed testing procedures example
The Bonferroni test can be easily written down in terms of the closed testing principle, e.g. in the case of two null hypotheses as
- Test for $H_1 cap H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for elementary null hypothesis $H_1$: $p_1 leq alpha/2$
- Test for elementary null hypothesis $H_2$: $p_2 leq alpha/2$
with $H_1$ rejected overall, if both $H_1$ and $H_1 cap H_2$ are rejected. Thus, we know it is a valid test. You also immediately see that for the elementary null hypotheses $H_1$ and $H_2$ you are using a test that is not exhausting the significance level and that the Bonferroni-Holm test
- Test for $H_1$ and $H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for $H_1$: $p_1 leq alpha$
- Test for $H_2$: $p_2 leq alpha$
is uniformly more powerful.
Figure 1: Illustration of the graphical approach for displaying these two procedures
Power considerations and other considerations
Usually, there's no downside in terms of power to using a closed testing procedure - if anything it tends to be easier to make sure the test fully exhausts the significance level. It can sometimes be a bit tricky to build in some desired features (e.g. exploiting a known correlation) while keep some other desired contraints (e.g. that the closed testing procedure can be easily displayed graphically due to the consistency requirement on the testing procedure).
$endgroup$
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
add a comment
|
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "65"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f428996%2fwhy-do-we-not-always-use-the-closed-testing-principle-for-multiple-comparisons%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
It is actually used quite often (in fact, Bonferroni and Bonferroni-Holm can be shown to be valid tests using the closed testing principle - see below). Part of the reason why some simple procedures like Bonferroni are still so popular is of course, that they are very easy to implement and it is easy to communicate what you did.
E.g. even for the simple Bonferroni or Bonferroni-Holm writing down the tests for the intersection null hypotheses is a bit tedious. That's why the graphical approach to constructing closed testing procedures is very popular - at least in the clinical trials setting (where there is usually a trained statistician at hand that can help pre-specify a tailored testing procedure for a specific trial). This graphical approach provides a really easy way to construct valid testing procedures and makes communication easier. Various variants, extensions and improvements such as exploiting correlations, prioritization of certain hypotheses depending on which hypotheses are already rejected etc. have been proposed over the years.
Bonferroni vs. Bonferroni-Holm as closed testing procedures example
The Bonferroni test can be easily written down in terms of the closed testing principle, e.g. in the case of two null hypotheses as
- Test for $H_1 cap H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for elementary null hypothesis $H_1$: $p_1 leq alpha/2$
- Test for elementary null hypothesis $H_2$: $p_2 leq alpha/2$
with $H_1$ rejected overall, if both $H_1$ and $H_1 cap H_2$ are rejected. Thus, we know it is a valid test. You also immediately see that for the elementary null hypotheses $H_1$ and $H_2$ you are using a test that is not exhausting the significance level and that the Bonferroni-Holm test
- Test for $H_1$ and $H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for $H_1$: $p_1 leq alpha$
- Test for $H_2$: $p_2 leq alpha$
is uniformly more powerful.
Figure 1: Illustration of the graphical approach for displaying these two procedures
Power considerations and other considerations
Usually, there's no downside in terms of power to using a closed testing procedure - if anything it tends to be easier to make sure the test fully exhausts the significance level. It can sometimes be a bit tricky to build in some desired features (e.g. exploiting a known correlation) while keep some other desired contraints (e.g. that the closed testing procedure can be easily displayed graphically due to the consistency requirement on the testing procedure).
$endgroup$
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
add a comment
|
$begingroup$
It is actually used quite often (in fact, Bonferroni and Bonferroni-Holm can be shown to be valid tests using the closed testing principle - see below). Part of the reason why some simple procedures like Bonferroni are still so popular is of course, that they are very easy to implement and it is easy to communicate what you did.
E.g. even for the simple Bonferroni or Bonferroni-Holm writing down the tests for the intersection null hypotheses is a bit tedious. That's why the graphical approach to constructing closed testing procedures is very popular - at least in the clinical trials setting (where there is usually a trained statistician at hand that can help pre-specify a tailored testing procedure for a specific trial). This graphical approach provides a really easy way to construct valid testing procedures and makes communication easier. Various variants, extensions and improvements such as exploiting correlations, prioritization of certain hypotheses depending on which hypotheses are already rejected etc. have been proposed over the years.
Bonferroni vs. Bonferroni-Holm as closed testing procedures example
The Bonferroni test can be easily written down in terms of the closed testing principle, e.g. in the case of two null hypotheses as
- Test for $H_1 cap H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for elementary null hypothesis $H_1$: $p_1 leq alpha/2$
- Test for elementary null hypothesis $H_2$: $p_2 leq alpha/2$
with $H_1$ rejected overall, if both $H_1$ and $H_1 cap H_2$ are rejected. Thus, we know it is a valid test. You also immediately see that for the elementary null hypotheses $H_1$ and $H_2$ you are using a test that is not exhausting the significance level and that the Bonferroni-Holm test
- Test for $H_1$ and $H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for $H_1$: $p_1 leq alpha$
- Test for $H_2$: $p_2 leq alpha$
is uniformly more powerful.
Figure 1: Illustration of the graphical approach for displaying these two procedures
Power considerations and other considerations
Usually, there's no downside in terms of power to using a closed testing procedure - if anything it tends to be easier to make sure the test fully exhausts the significance level. It can sometimes be a bit tricky to build in some desired features (e.g. exploiting a known correlation) while keep some other desired contraints (e.g. that the closed testing procedure can be easily displayed graphically due to the consistency requirement on the testing procedure).
$endgroup$
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
add a comment
|
$begingroup$
It is actually used quite often (in fact, Bonferroni and Bonferroni-Holm can be shown to be valid tests using the closed testing principle - see below). Part of the reason why some simple procedures like Bonferroni are still so popular is of course, that they are very easy to implement and it is easy to communicate what you did.
E.g. even for the simple Bonferroni or Bonferroni-Holm writing down the tests for the intersection null hypotheses is a bit tedious. That's why the graphical approach to constructing closed testing procedures is very popular - at least in the clinical trials setting (where there is usually a trained statistician at hand that can help pre-specify a tailored testing procedure for a specific trial). This graphical approach provides a really easy way to construct valid testing procedures and makes communication easier. Various variants, extensions and improvements such as exploiting correlations, prioritization of certain hypotheses depending on which hypotheses are already rejected etc. have been proposed over the years.
Bonferroni vs. Bonferroni-Holm as closed testing procedures example
The Bonferroni test can be easily written down in terms of the closed testing principle, e.g. in the case of two null hypotheses as
- Test for $H_1 cap H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for elementary null hypothesis $H_1$: $p_1 leq alpha/2$
- Test for elementary null hypothesis $H_2$: $p_2 leq alpha/2$
with $H_1$ rejected overall, if both $H_1$ and $H_1 cap H_2$ are rejected. Thus, we know it is a valid test. You also immediately see that for the elementary null hypotheses $H_1$ and $H_2$ you are using a test that is not exhausting the significance level and that the Bonferroni-Holm test
- Test for $H_1$ and $H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for $H_1$: $p_1 leq alpha$
- Test for $H_2$: $p_2 leq alpha$
is uniformly more powerful.
Figure 1: Illustration of the graphical approach for displaying these two procedures
Power considerations and other considerations
Usually, there's no downside in terms of power to using a closed testing procedure - if anything it tends to be easier to make sure the test fully exhausts the significance level. It can sometimes be a bit tricky to build in some desired features (e.g. exploiting a known correlation) while keep some other desired contraints (e.g. that the closed testing procedure can be easily displayed graphically due to the consistency requirement on the testing procedure).
$endgroup$
It is actually used quite often (in fact, Bonferroni and Bonferroni-Holm can be shown to be valid tests using the closed testing principle - see below). Part of the reason why some simple procedures like Bonferroni are still so popular is of course, that they are very easy to implement and it is easy to communicate what you did.
E.g. even for the simple Bonferroni or Bonferroni-Holm writing down the tests for the intersection null hypotheses is a bit tedious. That's why the graphical approach to constructing closed testing procedures is very popular - at least in the clinical trials setting (where there is usually a trained statistician at hand that can help pre-specify a tailored testing procedure for a specific trial). This graphical approach provides a really easy way to construct valid testing procedures and makes communication easier. Various variants, extensions and improvements such as exploiting correlations, prioritization of certain hypotheses depending on which hypotheses are already rejected etc. have been proposed over the years.
Bonferroni vs. Bonferroni-Holm as closed testing procedures example
The Bonferroni test can be easily written down in terms of the closed testing principle, e.g. in the case of two null hypotheses as
- Test for $H_1 cap H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for elementary null hypothesis $H_1$: $p_1 leq alpha/2$
- Test for elementary null hypothesis $H_2$: $p_2 leq alpha/2$
with $H_1$ rejected overall, if both $H_1$ and $H_1 cap H_2$ are rejected. Thus, we know it is a valid test. You also immediately see that for the elementary null hypotheses $H_1$ and $H_2$ you are using a test that is not exhausting the significance level and that the Bonferroni-Holm test
- Test for $H_1$ and $H_2$: $p_1 leq alpha/2$ and $p_2 leq alpha/2$
- Test for $H_1$: $p_1 leq alpha$
- Test for $H_2$: $p_2 leq alpha$
is uniformly more powerful.
Figure 1: Illustration of the graphical approach for displaying these two procedures
Power considerations and other considerations
Usually, there's no downside in terms of power to using a closed testing procedure - if anything it tends to be easier to make sure the test fully exhausts the significance level. It can sometimes be a bit tricky to build in some desired features (e.g. exploiting a known correlation) while keep some other desired contraints (e.g. that the closed testing procedure can be easily displayed graphically due to the consistency requirement on the testing procedure).
edited Sep 27 at 14:13
answered Sep 27 at 14:03
BjörnBjörn
13.1k2 gold badges13 silver badges46 bronze badges
13.1k2 gold badges13 silver badges46 bronze badges
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
add a comment
|
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
Thank you. I have a followup question. Let's ignore changing the importance of particular test or accounting for correlation. If I can construct a uniformally most powerful test (UMP) for every hypothesis test in the family, is that the most powerful procedure I can construct that still controls the Family wise Error Rate.
$endgroup$
– EliK
Sep 27 at 14:18
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
$begingroup$
I didn't see your addition when I asked my question. I think your addition covers my question. Thank you.
$endgroup$
– EliK
Sep 27 at 14:19
add a comment
|
Thanks for contributing an answer to Cross Validated!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f428996%2fwhy-do-we-not-always-use-the-closed-testing-principle-for-multiple-comparisons%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
$begingroup$
Your "only" downside is a huge one! It limits practical procedures to $nll 30$ or so, and in cases where the tests are onerous to compute (think bootstrapping, for instance) to $nll 10.$
$endgroup$
– whuber♦
Sep 27 at 13:33
$begingroup$
@whuber Not really true, is it? The closed testing principle is used all the time.
$endgroup$
– Björn
Sep 27 at 13:39
$begingroup$
@whuber, I had in mind a relatively small number of hypothesis tests. For example, I'm curious why didn't I learn this for ANOVA with two treatments and an interaction.
$endgroup$
– EliK
Sep 27 at 13:41
$begingroup$
I wonder if this sucks power from the test. Sure, you control $alpha$, but at what cost? If even the Bonferroni correction is more powerful, then I think that this would be an inadmissable testing procedure. (Confession: I don't know this to be the case, and it might be totaly wrong.)
$endgroup$
– Dave
Sep 27 at 13:53
$begingroup$
@Dave I don't think Bonferroni is more powerful, but I'm not sure.
$endgroup$
– EliK
Sep 27 at 14:12