Extrapolation v. InterpolationWhat is wrong with extrapolation?Fourier/trigonometric interpolationUncertainty propagation in linear interpolationInterpolating binned data such that bin average is preservedWhat are the theoretical reasons for why extrapolation “less reliable” than interpolation?Difference between extrapolation and interpolation in higher dimensions

Parsing C-like code to extract info

What does the 1.-5. notation mean at the top right corner?

Iodine tablet correct use and efficiency

What happened to Sophie in her last encounter with Arthur?

What's an "add" chord?

How do the Martian rebels defeat Earth when they're grossly outnumbered and outgunned?

Using footnotes in fiction: children's book which can be enjoyed by adults

Extract all keys from an object (json)

How much money would I need to feel secure in my job?

Photographic Companions

Unexpected Code Coverage Reduction

Identify the Eeveelutions

How to present boolean options along with selecting exactly 1 of them as "primary"?

CO₂ level is high enough that it reduces cognitive ability. Isn't that a reason to worry?

Is current (November 2019) polling about Democrats lead over Trump trustworthy?

What to do with a bent but not broken aluminum seat stay

How to manage publications on local computer

What DC should I use for someone trying to survive indefinitely solely with an alchemy jug as their only source of food and water? (survival campaign)

Theme or Topic - what's the difference?

I have stack-exchanged through my undergrad math program. Am I likely to succeed in mathematics PhD programs?

Is there any point in adding more than 6 months' runway in savings instead of investing everything after that?

How to identify the default sender IP address?

Can a 'Second Referendum' on Brexit contain three options?

Does any country have free college & open admissions?

Extrapolation v. Interpolation

What is wrong with extrapolation?Fourier/trigonometric interpolationUncertainty propagation in linear interpolationInterpolating binned data such that bin average is preservedWhat are the theoretical reasons for why extrapolation “less reliable” than interpolation?Difference between extrapolation and interpolation in higher dimensions

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;

What is the difference between extrapolation and interpolation, and what is the most precise way of using these terms?

For example, I have seen a statement in a paper using interpolation as:

"The procedure interpolates the shape of the estimated function between the bin points"

A sentence that uses both extrapolation and interpolation is, for example:

The previous step where we extrapolated the interpolated function using the Kernel method to the left and right temperature tails.

Can someone provide a clear and easy way to distinguish them and guide how to use these terms correctly with an example?

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

1

$begingroup$
A related question.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

1

$begingroup$
Possible duplicate of What is wrong with extrapolation?
$endgroup$
– usεr11852 says Reinstate Monic
Jul 25 at 8:21

$begingroup$
@usεr11852 I think the two questions cover similar ground but are different because this one asks for the contrast with interpolation.
$endgroup$
– mkt - Reinstate Monica
Jul 25 at 11:55

$begingroup$
Has this distinction between interpolation and extrapolation been formalized rigorously in a generally agreed upon way, (e.g., via convex hulls) or are these terms still subject to human judgement and interpretation?
$endgroup$
– Nick Alger
Jul 25 at 16:19

add a comment
|

What is the difference between extrapolation and interpolation, and what is the most precise way of using these terms?

For example, I have seen a statement in a paper using interpolation as:

"The procedure interpolates the shape of the estimated function between the bin points"

A sentence that uses both extrapolation and interpolation is, for example:

The previous step where we extrapolated the interpolated function using the Kernel method to the left and right temperature tails.

Can someone provide a clear and easy way to distinguish them and guide how to use these terms correctly with an example?

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

1

$begingroup$
A related question.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

1

$begingroup$
Possible duplicate of What is wrong with extrapolation?
$endgroup$
– usεr11852 says Reinstate Monic
Jul 25 at 8:21

$begingroup$
@usεr11852 I think the two questions cover similar ground but are different because this one asks for the contrast with interpolation.
$endgroup$
– mkt - Reinstate Monica
Jul 25 at 11:55

$begingroup$
Has this distinction between interpolation and extrapolation been formalized rigorously in a generally agreed upon way, (e.g., via convex hulls) or are these terms still subject to human judgement and interpretation?
$endgroup$
– Nick Alger
Jul 25 at 16:19

add a comment
|

What is the difference between extrapolation and interpolation, and what is the most precise way of using these terms?

For example, I have seen a statement in a paper using interpolation as:

"The procedure interpolates the shape of the estimated function between the bin points"

A sentence that uses both extrapolation and interpolation is, for example:

The previous step where we extrapolated the interpolated function using the Kernel method to the left and right temperature tails.

Can someone provide a clear and easy way to distinguish them and guide how to use these terms correctly with an example?

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

What is the difference between extrapolation and interpolation, and what is the most precise way of using these terms?

For example, I have seen a statement in a paper using interpolation as:

"The procedure interpolates the shape of the estimated function between the bin points"

A sentence that uses both extrapolation and interpolation is, for example:

The previous step where we extrapolated the interpolated function using the Kernel method to the left and right temperature tails.

Can someone provide a clear and easy way to distinguish them and guide how to use these terms correctly with an example?

terminology interpolation extrapolation

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

edited Jul 23 at 15:20

S. Kolassa - Reinstate Monica

58.2k10 gold badges113 silver badges215 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

asked Jul 23 at 15:13

Frank Swanton

2703 silver badges7 bronze badges

1

$begingroup$
A related question.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

1

$begingroup$
Possible duplicate of What is wrong with extrapolation?
$endgroup$
– usεr11852 says Reinstate Monic
Jul 25 at 8:21

$begingroup$
@usεr11852 I think the two questions cover similar ground but are different because this one asks for the contrast with interpolation.
$endgroup$
– mkt - Reinstate Monica
Jul 25 at 11:55

$begingroup$
Has this distinction between interpolation and extrapolation been formalized rigorously in a generally agreed upon way, (e.g., via convex hulls) or are these terms still subject to human judgement and interpretation?
$endgroup$
– Nick Alger
Jul 25 at 16:19

add a comment
|

1

$begingroup$
A related question.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

1

$begingroup$
Possible duplicate of What is wrong with extrapolation?
$endgroup$
– usεr11852 says Reinstate Monic
Jul 25 at 8:21

$begingroup$
@usεr11852 I think the two questions cover similar ground but are different because this one asks for the contrast with interpolation.
$endgroup$
– mkt - Reinstate Monica
Jul 25 at 11:55

$begingroup$
Has this distinction between interpolation and extrapolation been formalized rigorously in a generally agreed upon way, (e.g., via convex hulls) or are these terms still subject to human judgement and interpretation?
$endgroup$
– Nick Alger
Jul 25 at 16:19

A related question.

– J. M. is not a statistician
Jul 25 at 7:18

Possible duplicate of What is wrong with extrapolation?

– usεr11852 says Reinstate Monic
Jul 25 at 8:21

@usεr11852 I think the two questions cover similar ground but are different because this one asks for the contrast with interpolation.

– mkt - Reinstate Monica
Jul 25 at 11:55

Has this distinction between interpolation and extrapolation been formalized rigorously in a generally agreed upon way, (e.g., via convex hulls) or are these terms still subject to human judgement and interpretation?

– Nick Alger
Jul 25 at 16:19

add a comment
|

4 Answers
4

active

oldest

votes

To add a visual explanation to this: let's consider a few points that you plan to model.

enter image description here

They look like they could be described well with a straight line, so you fit a linear regression to them:

enter image description here

This regression line lets you both interpolate (generate expected values in between your data points) and extrapolate (generate expected values outside the range of your data points). I've highlighted the extrapolation in red and the biggest region of interpolation in blue. To be clear, even the tiny regions between the points are interpolated, but I'm only highlighting the big one here.

enter image description here

Why is extrapolation generally more of a concern? Because you're usually much less certain about the shape of the relationship outside the range of your data. Consider what might happen when you collect a few more data points (hollow circles):

enter image description here

It turns out that the relationship was not captured well with your hypothesized relationship after all. The predictions in the extrapolated region are way off. Even if you had guessed the precise function that describes this nonlinear relationship correctly, your data did not extend over enough of a range for you to capture the nonlinearity well, so you may still have been pretty far off. Note that this is a problem not just for linear regression, but for any relationship at all - this is why extrapolation is considered dangerous.

Predictions in the interpolated region are also incorrect because of the lack of nonlinearity in the fit, but their prediction error is much lower. There's no guarantee that you won't have an unexpected relationship in between your points (i.e. the region of interpolation), but it's generally less likely.

I will add that extrapolation is not always a terrible idea - if you extrapolate a tiny bit outside the range of your data, you're probably not going to be very wrong (though it is possible!). Ancients who had no good scientific model of the world would not have been far wrong if they forecast that the sun would rise again the next day and the day after that (though one day far into the future, even this will fail).

And sometimes, extrapolation can even be informative - for example, simple short-term extrapolations of the exponential increase in atmospheric CO$_2$ have been reasonably accurate over the past few decades. If you were a student who didn't have scientific expertise but wanted a rough, short-term forecast, this would have given you fairly reasonable results. But the farther away from your data you extrapolate, the more likely your prediction is likely to fail, and fail disastrously, as described very nicely in this great thread: What is wrong with extrapolation? (thanks to @J.M.isnotastatistician for reminding me of that).

Edit based on comments: whether interpolating or extrapolating, it's always best to have some theory to ground expectations. If theory-free modelling must be done, the risk from interpolation is usually less than that from extrapolation. That said, as the gap between data points increases in magnitude, interpolation also becomes more and more fraught with risk.

edited Aug 2 at 9:51

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

5

$begingroup$
I like your answer, and regard it as complementary to mine and in no sense competing. But a small point, important for some readers, is that red and green are hard for quite a few people to distinguish visually.
$endgroup$
– Nick Cox
Jul 24 at 10:27

1

$begingroup$
@NickCox Good point, thank you for raising that - I've now changed the colour scheme.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 11:06

$begingroup$
I don't like this answer very much. The fact that your example data is just two blobs means (or at least it could mean; it's hard to be confident with only six sample points) that interpolating between them could also be considered extrapolating outside of the data support. And it would be quite conceivable for the extra data points to diverge strongly in that region. There might actually be a better quadratic model. If you're going to interpolate in such a situation, you'd better had an a-priori reason for preferring a particular model. And that is also the case for your CO₂ example!
$endgroup$
– leftaroundabout
Jul 24 at 14:28

1

$begingroup$
@leftaroundabout My point was that the Keeling curve pattern is so strong that extrapolations ignoring economics & physics are still reasonably accurate on the scale of years to a few decades. I noted 'past few decades' precisely because that's the time scale on which we have had high-resolution measurements. This is an example where extrapolation would not have led you badly wrong and I think that's worth noting. I think it would take wilful misreading to claim that this answer is advocating theory-free extrapolation.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:23

1

$begingroup$
Relatedly, I gave Taleb's "turkey example" in this answer as a warning for people who use extrapolation.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

|
show 7 more comments

In essence interpolation is an operation within the data support, or between existing known data points; extrapolation is beyond the data support. Otherwise put, the criterion is: where are the missing values?

One reason for the distinction is that extrapolation is usually more difficult to do well, and even dangerous, statistically if not practically. That is not always true: for example, river floods may overwhelm the means of measuring discharge or even stage (vertical level), tearing a hole in the measured record. In those circumstances, interpolation of discharge or stage is difficult too and being within the data support does not help much.

In the long run, qualitative change usually supersedes quantitative change. Around 1900 there was much concern that growth in horse-drawn traffic would swamp cities with mostly unwanted excrement. The exponential in excrement was superseded by the internal combustion engine and its different exponentials.

A trend is a trend is a trend,

But the question is, will it bend?

Will it alter its course

Through some unforeseen force

And come to a premature end?

-- Alexander Cairncross

Cairncross, A. 1969. Economic forecasting. The Economic Journal, 79: 797-812. doi:10.2307/2229792 (quotation on p.797)

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

1

$begingroup$
Good answer. The interpretation is right there in the name - interpolation = to smooth within, extrapolation = to smooth beyond.
$endgroup$
– Nuclear Wang
Jul 23 at 15:33

1

$begingroup$
IMO this is the correct answer. “Data support” is the crucial bit; even if the point you want to go is between two measured ones then it may still lie outside the data support. For example, if you have prosperity data for people in the Roman antiquity and from the modern day, but not in between, then interpolating into the middle ages would be very problematic. I'd call this extrapolation. OTOH, if you have data scattered sparsely but uniformly through the entire time span, then interpolating to a particular year is much more plausible.
$endgroup$
– leftaroundabout
Jul 24 at 14:35

1

$begingroup$
@leftaroundabout Just because interpolation may be done over a huge gap in data does not make it extrapolation. You're mistaking the advisability of the procedure for the procedure itself. Sometimes interpolation is a bad idea too.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:41

1

$begingroup$
@mkt: I'm going to side with leftaroundabout that his first example could be considered extrapolation, as interpolation vs extrapolation isn't really as well defined as we may want to think. A simple transformation of variables can turn interpolation into extrapolation. In his example, using something like distance functions instead of raw time means that while in raw time we are interpolating, in distances we are extrapolating...and using raw times would probably be a bad idea.
$endgroup$
– Cliff AB
Jul 24 at 17:45

1

$begingroup$
This is my answer. I don't feel the need to qualify it. A broad distinction between interpolation and extrapolation doesn't rule it out being a little difficult to decide which is being undertaken. If you have a big hole in the middle of the data space, labelling could go either way. As some wag pointed out, the fact that the end of the day and the beginning of the night blur into one another doesn't make the distinction between day and night pointless or useless.
$endgroup$
– Nick Cox
Jul 25 at 13:21

|
show 3 more comments

TL;DR version:

Interpolation takes place between existing data points.

Extrapolation takes place beyond them.

Mnemonic: interpolation => inside.

FWIW: The prefix inter- means between, and extra- means beyond. Think also of interstate highways which go between states, or extraterrestrials from beyond our planet.

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

add a comment
|

Example:

Study: Want to fit a simple linear regression on the height on the age for girls of age 6-15 years old. Sample size is 100, age is calculated by (date of measuring - date of birth)/365.25.

After data collection, model is fit and get the estimate of intercept b0 and slope b1. it means we have E(height|age) = b0 + b1*age.

When you want the mean height for age 13, you find that there is no 15 year old girl in your sample of 100 girls, one f them is 12.83 years old and one is 13.24.

Now you plug in age = 13 into formula E(height|age) = b0 + b1*age. It is called interpolation because 13 year old is covered by the range of your data used to fit model.

If you want to get mean height for age 30 use that formula, it is called extrapolation, because age 30 is out of the range of the age covered by your data.

If the model has several covariates, need to be careful because it is hard to draw the border that data covered.

In statistics, we do not advocate the extrapolation.

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

add a comment
|

protected by gung - Reinstate Monica♦ Jul 24 at 1:22

Thank you for your interest in this question.
Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).

Would you like to answer one of these unanswered questions instead?

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

To add a visual explanation to this: let's consider a few points that you plan to model.

enter image description here

They look like they could be described well with a straight line, so you fit a linear regression to them:

enter image description here

edited Aug 2 at 9:51

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

5

$begingroup$
I like your answer, and regard it as complementary to mine and in no sense competing. But a small point, important for some readers, is that red and green are hard for quite a few people to distinguish visually.
$endgroup$
– Nick Cox
Jul 24 at 10:27

1

$begingroup$
@NickCox Good point, thank you for raising that - I've now changed the colour scheme.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 11:06

$begingroup$
I don't like this answer very much. The fact that your example data is just two blobs means (or at least it could mean; it's hard to be confident with only six sample points) that interpolating between them could also be considered extrapolating outside of the data support. And it would be quite conceivable for the extra data points to diverge strongly in that region. There might actually be a better quadratic model. If you're going to interpolate in such a situation, you'd better had an a-priori reason for preferring a particular model. And that is also the case for your CO₂ example!
$endgroup$
– leftaroundabout
Jul 24 at 14:28

1

$begingroup$
@leftaroundabout My point was that the Keeling curve pattern is so strong that extrapolations ignoring economics & physics are still reasonably accurate on the scale of years to a few decades. I noted 'past few decades' precisely because that's the time scale on which we have had high-resolution measurements. This is an example where extrapolation would not have led you badly wrong and I think that's worth noting. I think it would take wilful misreading to claim that this answer is advocating theory-free extrapolation.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:23

1

$begingroup$
Relatedly, I gave Taleb's "turkey example" in this answer as a warning for people who use extrapolation.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

|
show 7 more comments

To add a visual explanation to this: let's consider a few points that you plan to model.

enter image description here

They look like they could be described well with a straight line, so you fit a linear regression to them:

enter image description here

edited Aug 2 at 9:51

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

5

$begingroup$
I like your answer, and regard it as complementary to mine and in no sense competing. But a small point, important for some readers, is that red and green are hard for quite a few people to distinguish visually.
$endgroup$
– Nick Cox
Jul 24 at 10:27

1

$begingroup$
@NickCox Good point, thank you for raising that - I've now changed the colour scheme.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 11:06

$begingroup$
I don't like this answer very much. The fact that your example data is just two blobs means (or at least it could mean; it's hard to be confident with only six sample points) that interpolating between them could also be considered extrapolating outside of the data support. And it would be quite conceivable for the extra data points to diverge strongly in that region. There might actually be a better quadratic model. If you're going to interpolate in such a situation, you'd better had an a-priori reason for preferring a particular model. And that is also the case for your CO₂ example!
$endgroup$
– leftaroundabout
Jul 24 at 14:28

1

$begingroup$
@leftaroundabout My point was that the Keeling curve pattern is so strong that extrapolations ignoring economics & physics are still reasonably accurate on the scale of years to a few decades. I noted 'past few decades' precisely because that's the time scale on which we have had high-resolution measurements. This is an example where extrapolation would not have led you badly wrong and I think that's worth noting. I think it would take wilful misreading to claim that this answer is advocating theory-free extrapolation.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:23

1

$begingroup$
Relatedly, I gave Taleb's "turkey example" in this answer as a warning for people who use extrapolation.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

|
show 7 more comments

To add a visual explanation to this: let's consider a few points that you plan to model.

enter image description here

They look like they could be described well with a straight line, so you fit a linear regression to them:

enter image description here

edited Aug 2 at 9:51

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

To add a visual explanation to this: let's consider a few points that you plan to model.

enter image description here

They look like they could be described well with a straight line, so you fit a linear regression to them:

enter image description here

edited Aug 2 at 9:51

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

edited Aug 2 at 9:51

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

answered Jul 23 at 15:42

mkt - Reinstate Monica

9,4516 gold badges33 silver badges95 bronze badges

5

$begingroup$
I like your answer, and regard it as complementary to mine and in no sense competing. But a small point, important for some readers, is that red and green are hard for quite a few people to distinguish visually.
$endgroup$
– Nick Cox
Jul 24 at 10:27

1

$begingroup$
@NickCox Good point, thank you for raising that - I've now changed the colour scheme.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 11:06

$begingroup$
I don't like this answer very much. The fact that your example data is just two blobs means (or at least it could mean; it's hard to be confident with only six sample points) that interpolating between them could also be considered extrapolating outside of the data support. And it would be quite conceivable for the extra data points to diverge strongly in that region. There might actually be a better quadratic model. If you're going to interpolate in such a situation, you'd better had an a-priori reason for preferring a particular model. And that is also the case for your CO₂ example!
$endgroup$
– leftaroundabout
Jul 24 at 14:28

1

$begingroup$
@leftaroundabout My point was that the Keeling curve pattern is so strong that extrapolations ignoring economics & physics are still reasonably accurate on the scale of years to a few decades. I noted 'past few decades' precisely because that's the time scale on which we have had high-resolution measurements. This is an example where extrapolation would not have led you badly wrong and I think that's worth noting. I think it would take wilful misreading to claim that this answer is advocating theory-free extrapolation.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:23

1

$begingroup$
Relatedly, I gave Taleb's "turkey example" in this answer as a warning for people who use extrapolation.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

|
show 7 more comments

5

$begingroup$
I like your answer, and regard it as complementary to mine and in no sense competing. But a small point, important for some readers, is that red and green are hard for quite a few people to distinguish visually.
$endgroup$
– Nick Cox
Jul 24 at 10:27

1

$begingroup$
@NickCox Good point, thank you for raising that - I've now changed the colour scheme.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 11:06

$begingroup$
I don't like this answer very much. The fact that your example data is just two blobs means (or at least it could mean; it's hard to be confident with only six sample points) that interpolating between them could also be considered extrapolating outside of the data support. And it would be quite conceivable for the extra data points to diverge strongly in that region. There might actually be a better quadratic model. If you're going to interpolate in such a situation, you'd better had an a-priori reason for preferring a particular model. And that is also the case for your CO₂ example!
$endgroup$
– leftaroundabout
Jul 24 at 14:28

1

$begingroup$
@leftaroundabout My point was that the Keeling curve pattern is so strong that extrapolations ignoring economics & physics are still reasonably accurate on the scale of years to a few decades. I noted 'past few decades' precisely because that's the time scale on which we have had high-resolution measurements. This is an example where extrapolation would not have led you badly wrong and I think that's worth noting. I think it would take wilful misreading to claim that this answer is advocating theory-free extrapolation.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:23

1

$begingroup$
Relatedly, I gave Taleb's "turkey example" in this answer as a warning for people who use extrapolation.
$endgroup$
– J. M. is not a statistician
Jul 25 at 7:18

I like your answer, and regard it as complementary to mine and in no sense competing. But a small point, important for some readers, is that red and green are hard for quite a few people to distinguish visually.

– Nick Cox
Jul 24 at 10:27

@NickCox Good point, thank you for raising that - I've now changed the colour scheme.

– mkt - Reinstate Monica
Jul 24 at 11:06

I don't like this answer very much. The fact that your example data is just two blobs means (or at least it could mean; it's hard to be confident with only six sample points) that interpolating between them could also be considered extrapolating outside of the data support. And it would be quite conceivable for the extra data points to diverge strongly in that region. There might actually be a better quadratic model. If you're going to interpolate in such a situation, you'd better had an a-priori reason for preferring a particular model. And that is also the case for your CO₂ example!

– leftaroundabout
Jul 24 at 14:28

@leftaroundabout My point was that the Keeling curve pattern is so strong that extrapolations ignoring economics & physics are still reasonably accurate on the scale of years to a few decades. I noted 'past few decades' precisely because that's the time scale on which we have had high-resolution measurements. This is an example where extrapolation would not have led you badly wrong and I think that's worth noting. I think it would take wilful misreading to claim that this answer is advocating theory-free extrapolation.

– mkt - Reinstate Monica
Jul 24 at 15:23

Relatedly, I gave Taleb's "turkey example" in this answer as a warning for people who use extrapolation.

– J. M. is not a statistician
Jul 25 at 7:18

|
show 7 more comments

A trend is a trend is a trend,

But the question is, will it bend?

Will it alter its course

Through some unforeseen force

And come to a premature end?

-- Alexander Cairncross

Cairncross, A. 1969. Economic forecasting. The Economic Journal, 79: 797-812. doi:10.2307/2229792 (quotation on p.797)

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

1

$begingroup$
Good answer. The interpretation is right there in the name - interpolation = to smooth within, extrapolation = to smooth beyond.
$endgroup$
– Nuclear Wang
Jul 23 at 15:33

1

$begingroup$
IMO this is the correct answer. “Data support” is the crucial bit; even if the point you want to go is between two measured ones then it may still lie outside the data support. For example, if you have prosperity data for people in the Roman antiquity and from the modern day, but not in between, then interpolating into the middle ages would be very problematic. I'd call this extrapolation. OTOH, if you have data scattered sparsely but uniformly through the entire time span, then interpolating to a particular year is much more plausible.
$endgroup$
– leftaroundabout
Jul 24 at 14:35

1

$begingroup$
@leftaroundabout Just because interpolation may be done over a huge gap in data does not make it extrapolation. You're mistaking the advisability of the procedure for the procedure itself. Sometimes interpolation is a bad idea too.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:41

1

$begingroup$
@mkt: I'm going to side with leftaroundabout that his first example could be considered extrapolation, as interpolation vs extrapolation isn't really as well defined as we may want to think. A simple transformation of variables can turn interpolation into extrapolation. In his example, using something like distance functions instead of raw time means that while in raw time we are interpolating, in distances we are extrapolating...and using raw times would probably be a bad idea.
$endgroup$
– Cliff AB
Jul 24 at 17:45

1

$begingroup$
This is my answer. I don't feel the need to qualify it. A broad distinction between interpolation and extrapolation doesn't rule it out being a little difficult to decide which is being undertaken. If you have a big hole in the middle of the data space, labelling could go either way. As some wag pointed out, the fact that the end of the day and the beginning of the night blur into one another doesn't make the distinction between day and night pointless or useless.
$endgroup$
– Nick Cox
Jul 25 at 13:21

|
show 3 more comments

A trend is a trend is a trend,

But the question is, will it bend?

Will it alter its course

Through some unforeseen force

And come to a premature end?

-- Alexander Cairncross

Cairncross, A. 1969. Economic forecasting. The Economic Journal, 79: 797-812. doi:10.2307/2229792 (quotation on p.797)

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

1

$begingroup$
Good answer. The interpretation is right there in the name - interpolation = to smooth within, extrapolation = to smooth beyond.
$endgroup$
– Nuclear Wang
Jul 23 at 15:33

1

$begingroup$
IMO this is the correct answer. “Data support” is the crucial bit; even if the point you want to go is between two measured ones then it may still lie outside the data support. For example, if you have prosperity data for people in the Roman antiquity and from the modern day, but not in between, then interpolating into the middle ages would be very problematic. I'd call this extrapolation. OTOH, if you have data scattered sparsely but uniformly through the entire time span, then interpolating to a particular year is much more plausible.
$endgroup$
– leftaroundabout
Jul 24 at 14:35

1

$begingroup$
@leftaroundabout Just because interpolation may be done over a huge gap in data does not make it extrapolation. You're mistaking the advisability of the procedure for the procedure itself. Sometimes interpolation is a bad idea too.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:41

1

$begingroup$
@mkt: I'm going to side with leftaroundabout that his first example could be considered extrapolation, as interpolation vs extrapolation isn't really as well defined as we may want to think. A simple transformation of variables can turn interpolation into extrapolation. In his example, using something like distance functions instead of raw time means that while in raw time we are interpolating, in distances we are extrapolating...and using raw times would probably be a bad idea.
$endgroup$
– Cliff AB
Jul 24 at 17:45

1

$begingroup$
This is my answer. I don't feel the need to qualify it. A broad distinction between interpolation and extrapolation doesn't rule it out being a little difficult to decide which is being undertaken. If you have a big hole in the middle of the data space, labelling could go either way. As some wag pointed out, the fact that the end of the day and the beginning of the night blur into one another doesn't make the distinction between day and night pointless or useless.
$endgroup$
– Nick Cox
Jul 25 at 13:21

|
show 3 more comments

A trend is a trend is a trend,

But the question is, will it bend?

Will it alter its course

Through some unforeseen force

And come to a premature end?

-- Alexander Cairncross

Cairncross, A. 1969. Economic forecasting. The Economic Journal, 79: 797-812. doi:10.2307/2229792 (quotation on p.797)

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

A trend is a trend is a trend,

But the question is, will it bend?

Will it alter its course

Through some unforeseen force

And come to a premature end?

-- Alexander Cairncross

Cairncross, A. 1969. Economic forecasting. The Economic Journal, 79: 797-812. doi:10.2307/2229792 (quotation on p.797)

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

edited Jul 26 at 11:24

amoeba says Reinstate Monica

68.2k19 gold badges231 silver badges280 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

answered Jul 23 at 15:28

Nick Cox

41.3k6 gold badges93 silver badges137 bronze badges

1

$begingroup$
Good answer. The interpretation is right there in the name - interpolation = to smooth within, extrapolation = to smooth beyond.
$endgroup$
– Nuclear Wang
Jul 23 at 15:33

1

$begingroup$
IMO this is the correct answer. “Data support” is the crucial bit; even if the point you want to go is between two measured ones then it may still lie outside the data support. For example, if you have prosperity data for people in the Roman antiquity and from the modern day, but not in between, then interpolating into the middle ages would be very problematic. I'd call this extrapolation. OTOH, if you have data scattered sparsely but uniformly through the entire time span, then interpolating to a particular year is much more plausible.
$endgroup$
– leftaroundabout
Jul 24 at 14:35

1

$begingroup$
@leftaroundabout Just because interpolation may be done over a huge gap in data does not make it extrapolation. You're mistaking the advisability of the procedure for the procedure itself. Sometimes interpolation is a bad idea too.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:41

1

$begingroup$
@mkt: I'm going to side with leftaroundabout that his first example could be considered extrapolation, as interpolation vs extrapolation isn't really as well defined as we may want to think. A simple transformation of variables can turn interpolation into extrapolation. In his example, using something like distance functions instead of raw time means that while in raw time we are interpolating, in distances we are extrapolating...and using raw times would probably be a bad idea.
$endgroup$
– Cliff AB
Jul 24 at 17:45

1

$begingroup$
This is my answer. I don't feel the need to qualify it. A broad distinction between interpolation and extrapolation doesn't rule it out being a little difficult to decide which is being undertaken. If you have a big hole in the middle of the data space, labelling could go either way. As some wag pointed out, the fact that the end of the day and the beginning of the night blur into one another doesn't make the distinction between day and night pointless or useless.
$endgroup$
– Nick Cox
Jul 25 at 13:21

|
show 3 more comments

1

$begingroup$
Good answer. The interpretation is right there in the name - interpolation = to smooth within, extrapolation = to smooth beyond.
$endgroup$
– Nuclear Wang
Jul 23 at 15:33

1

$begingroup$
IMO this is the correct answer. “Data support” is the crucial bit; even if the point you want to go is between two measured ones then it may still lie outside the data support. For example, if you have prosperity data for people in the Roman antiquity and from the modern day, but not in between, then interpolating into the middle ages would be very problematic. I'd call this extrapolation. OTOH, if you have data scattered sparsely but uniformly through the entire time span, then interpolating to a particular year is much more plausible.
$endgroup$
– leftaroundabout
Jul 24 at 14:35

1

$begingroup$
@leftaroundabout Just because interpolation may be done over a huge gap in data does not make it extrapolation. You're mistaking the advisability of the procedure for the procedure itself. Sometimes interpolation is a bad idea too.
$endgroup$
– mkt - Reinstate Monica
Jul 24 at 15:41

1

$begingroup$
@mkt: I'm going to side with leftaroundabout that his first example could be considered extrapolation, as interpolation vs extrapolation isn't really as well defined as we may want to think. A simple transformation of variables can turn interpolation into extrapolation. In his example, using something like distance functions instead of raw time means that while in raw time we are interpolating, in distances we are extrapolating...and using raw times would probably be a bad idea.
$endgroup$
– Cliff AB
Jul 24 at 17:45

1

$begingroup$
This is my answer. I don't feel the need to qualify it. A broad distinction between interpolation and extrapolation doesn't rule it out being a little difficult to decide which is being undertaken. If you have a big hole in the middle of the data space, labelling could go either way. As some wag pointed out, the fact that the end of the day and the beginning of the night blur into one another doesn't make the distinction between day and night pointless or useless.
$endgroup$
– Nick Cox
Jul 25 at 13:21

Good answer. The interpretation is right there in the name - interpolation = to smooth within, extrapolation = to smooth beyond.

– Nuclear Wang
Jul 23 at 15:33

IMO this is the correct answer. “Data support” is the crucial bit; even if the point you want to go is between two measured ones then it may still lie outside the data support. For example, if you have prosperity data for people in the Roman antiquity and from the modern day, but not in between, then interpolating into the middle ages would be very problematic. I'd call this extrapolation. OTOH, if you have data scattered sparsely but uniformly through the entire time span, then interpolating to a particular year is much more plausible.

– leftaroundabout
Jul 24 at 14:35

@leftaroundabout Just because interpolation may be done over a huge gap in data does not make it extrapolation. You're mistaking the advisability of the procedure for the procedure itself. Sometimes interpolation is a bad idea too.

– mkt - Reinstate Monica
Jul 24 at 15:41

@mkt: I'm going to side with leftaroundabout that his first example could be considered extrapolation, as interpolation vs extrapolation isn't really as well defined as we may want to think. A simple transformation of variables can turn interpolation into extrapolation. In his example, using something like distance functions instead of raw time means that while in raw time we are interpolating, in distances we are extrapolating...and using raw times would probably be a bad idea.

– Cliff AB
Jul 24 at 17:45

This is my answer. I don't feel the need to qualify it. A broad distinction between interpolation and extrapolation doesn't rule it out being a little difficult to decide which is being undertaken. If you have a big hole in the middle of the data space, labelling could go either way. As some wag pointed out, the fact that the end of the day and the beginning of the night blur into one another doesn't make the distinction between day and night pointless or useless.

– Nick Cox
Jul 25 at 13:21

|
show 3 more comments

TL;DR version:

Interpolation takes place between existing data points.

Extrapolation takes place beyond them.

Mnemonic: interpolation => inside.

FWIW: The prefix inter- means between, and extra- means beyond. Think also of interstate highways which go between states, or extraterrestrials from beyond our planet.

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

add a comment
|

TL;DR version:

Interpolation takes place between existing data points.

Extrapolation takes place beyond them.

Mnemonic: interpolation => inside.

FWIW: The prefix inter- means between, and extra- means beyond. Think also of interstate highways which go between states, or extraterrestrials from beyond our planet.

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

add a comment
|

TL;DR version:

Interpolation takes place between existing data points.

Extrapolation takes place beyond them.

Mnemonic: interpolation => inside.

FWIW: The prefix inter- means between, and extra- means beyond. Think also of interstate highways which go between states, or extraterrestrials from beyond our planet.

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

TL;DR version:

Interpolation takes place between existing data points.

Extrapolation takes place beyond them.

Mnemonic: interpolation => inside.

FWIW: The prefix inter- means between, and extra- means beyond. Think also of interstate highways which go between states, or extraterrestrials from beyond our planet.

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

answered Jul 24 at 1:02

A C

2211 silver badge4 bronze badges

add a comment
|

Example:

Study: Want to fit a simple linear regression on the height on the age for girls of age 6-15 years old. Sample size is 100, age is calculated by (date of measuring - date of birth)/365.25.

After data collection, model is fit and get the estimate of intercept b0 and slope b1. it means we have E(height|age) = b0 + b1*age.

When you want the mean height for age 13, you find that there is no 15 year old girl in your sample of 100 girls, one f them is 12.83 years old and one is 13.24.

Now you plug in age = 13 into formula E(height|age) = b0 + b1*age. It is called interpolation because 13 year old is covered by the range of your data used to fit model.

If you want to get mean height for age 30 use that formula, it is called extrapolation, because age 30 is out of the range of the age covered by your data.

If the model has several covariates, need to be careful because it is hard to draw the border that data covered.

In statistics, we do not advocate the extrapolation.

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

add a comment
|

Example:

Study: Want to fit a simple linear regression on the height on the age for girls of age 6-15 years old. Sample size is 100, age is calculated by (date of measuring - date of birth)/365.25.

After data collection, model is fit and get the estimate of intercept b0 and slope b1. it means we have E(height|age) = b0 + b1*age.

When you want the mean height for age 13, you find that there is no 15 year old girl in your sample of 100 girls, one f them is 12.83 years old and one is 13.24.

Now you plug in age = 13 into formula E(height|age) = b0 + b1*age. It is called interpolation because 13 year old is covered by the range of your data used to fit model.

If you want to get mean height for age 30 use that formula, it is called extrapolation, because age 30 is out of the range of the age covered by your data.

If the model has several covariates, need to be careful because it is hard to draw the border that data covered.

In statistics, we do not advocate the extrapolation.

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

add a comment
|

Example:

Study: Want to fit a simple linear regression on the height on the age for girls of age 6-15 years old. Sample size is 100, age is calculated by (date of measuring - date of birth)/365.25.

After data collection, model is fit and get the estimate of intercept b0 and slope b1. it means we have E(height|age) = b0 + b1*age.

When you want the mean height for age 13, you find that there is no 15 year old girl in your sample of 100 girls, one f them is 12.83 years old and one is 13.24.

Now you plug in age = 13 into formula E(height|age) = b0 + b1*age. It is called interpolation because 13 year old is covered by the range of your data used to fit model.

If you want to get mean height for age 30 use that formula, it is called extrapolation, because age 30 is out of the range of the age covered by your data.

If the model has several covariates, need to be careful because it is hard to draw the border that data covered.

In statistics, we do not advocate the extrapolation.

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

Example:

Study: Want to fit a simple linear regression on the height on the age for girls of age 6-15 years old. Sample size is 100, age is calculated by (date of measuring - date of birth)/365.25.

After data collection, model is fit and get the estimate of intercept b0 and slope b1. it means we have E(height|age) = b0 + b1*age.

When you want the mean height for age 13, you find that there is no 15 year old girl in your sample of 100 girls, one f them is 12.83 years old and one is 13.24.

Now you plug in age = 13 into formula E(height|age) = b0 + b1*age. It is called interpolation because 13 year old is covered by the range of your data used to fit model.

If you want to get mean height for age 30 use that formula, it is called extrapolation, because age 30 is out of the range of the age covered by your data.

If the model has several covariates, need to be careful because it is hard to draw the border that data covered.

In statistics, we do not advocate the extrapolation.

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

answered Jul 23 at 15:34

user158565

6,4622 gold badges6 silver badges18 bronze badges

add a comment
|

protected by gung - Reinstate Monica♦ Jul 24 at 1:22

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Bsrgvty

4 Answers
4

protected by gung - Reinstate Monica♦ Jul 24 at 1:22

4 Answers
4

4 Answers
4

protected by gung - Reinstate Monica♦ Jul 24 at 1:22

Popular posts from this blog

4 Answers 4

protected by gung - Reinstate Monica♦ Jul 24 at 1:22

4 Answers 4

4 Answers 4

protected by gung - Reinstate Monica♦ Jul 24 at 1:22

Popular posts from this blog

4 Answers
4

4 Answers
4

4 Answers
4