RANK used in 'where' returns invalid column, but exists in results setUsing column alias in a WHERE clause doesn't workDelete duplicate records with no change in betweenPostgreSQL DELETE FROM fails with `Error: attempted to delete invisible tuple`search_path set and seen but not used!Index not used with many WHERE LOWER(column) IN (…) conditions, works fine with just a fewMulti-Column index not used for index-only scan, but partial index isPostgreSQL 9.5 query performance depends on JOINed column in SELECT clauseWhich data model/schema to apply to a timeseries datastore for data sources with varying fieldspostgresql random function returns duplicate results when order by random used in queryQuery returns results on pgAdmin running on remote client but not directly through psql on db server
Are conjugate vectors unique?
Grammar explanation for ~よし
Isn't any conversation with the US president quid-pro-quo?
Unable to Style the checkboxes in Sitecore Forms
How should I handle a player attacking from the top of a tree?
How to handle a colleague who appears helpful in front of manager but doesn't help in private?
Days in indexed month
Steampunk book about a bounty hunter teen girl in London
Is "montäglich" commonly used?
N-Dimensional Cartesian Product
Can a UK passport valid for two months travel to Germany post-brexit?
Prove that the equation has only one real root.
Twelve Labours #08 - Dark Horse Bookmakers
How to get rid of vertical white lines in a table?
What does Yoda's species eat?
Can the Protection fighting style be used in this way?
Why would prey creatures not hate predator creatures?
How can I improve combat so my players don't always use the strategy of collectively focusing fire on one enemy at a time until it's dead?
If a photon truly goes through both slits (at the same time), then why can't we detect it at both slits (at the same time)?
Is it possible to change paper title after send back from reviewers for revising?
Why are there notes in my guitar chord book that are not part of the chord's formula?
Ran out of space of newly installed HDD?
Is it a mistake to use a password that has previously been used (by anyone ever)?
Stalemate situation with all pieces on the board
RANK used in 'where' returns invalid column, but exists in results set
Using column alias in a WHERE clause doesn't workDelete duplicate records with no change in betweenPostgreSQL DELETE FROM fails with `Error: attempted to delete invisible tuple`search_path set and seen but not used!Index not used with many WHERE LOWER(column) IN (…) conditions, works fine with just a fewMulti-Column index not used for index-only scan, but partial index isPostgreSQL 9.5 query performance depends on JOINed column in SELECT clauseWhich data model/schema to apply to a timeseries datastore for data sources with varying fieldspostgresql random function returns duplicate results when order by random used in queryQuery returns results on pgAdmin running on remote client but not directly through psql on db server
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;
I basically have a table with date
, timestamp
, DID
, coordinates
.
I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.
This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where
conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":
SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc
How to solve this?
postgresql
add a comment
|
I basically have a table with date
, timestamp
, DID
, coordinates
.
I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.
This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where
conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":
SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc
How to solve this?
postgresql
Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.
– Colin 't Hart
Sep 8 at 12:10
Please always provide a table definition asCREATE TABLE
statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date?timestamp_intersecting_date
? A lot can go wrong right here already ...
– Erwin Brandstetter
Sep 8 at 22:49
Data distribution matters a lot here. If eachDID
has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.
– Erwin Brandstetter
Sep 8 at 22:56
add a comment
|
I basically have a table with date
, timestamp
, DID
, coordinates
.
I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.
This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where
conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":
SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc
How to solve this?
postgresql
I basically have a table with date
, timestamp
, DID
, coordinates
.
I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.
This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where
conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":
SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc
How to solve this?
postgresql
postgresql
edited Sep 8 at 22:47
Erwin Brandstetter
107k11 gold badges229 silver badges357 bronze badges
107k11 gold badges229 silver badges357 bronze badges
asked Sep 8 at 1:27
roy naufalroy naufal
1674 bronze badges
1674 bronze badges
Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.
– Colin 't Hart
Sep 8 at 12:10
Please always provide a table definition asCREATE TABLE
statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date?timestamp_intersecting_date
? A lot can go wrong right here already ...
– Erwin Brandstetter
Sep 8 at 22:49
Data distribution matters a lot here. If eachDID
has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.
– Erwin Brandstetter
Sep 8 at 22:56
add a comment
|
Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.
– Colin 't Hart
Sep 8 at 12:10
Please always provide a table definition asCREATE TABLE
statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date?timestamp_intersecting_date
? A lot can go wrong right here already ...
– Erwin Brandstetter
Sep 8 at 22:49
Data distribution matters a lot here. If eachDID
has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.
– Erwin Brandstetter
Sep 8 at 22:56
Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.
– Colin 't Hart
Sep 8 at 12:10
Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.
– Colin 't Hart
Sep 8 at 12:10
Please always provide a table definition as
CREATE TABLE
statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date
? A lot can go wrong right here already ...– Erwin Brandstetter
Sep 8 at 22:49
Please always provide a table definition as
CREATE TABLE
statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date
? A lot can go wrong right here already ...– Erwin Brandstetter
Sep 8 at 22:49
Data distribution matters a lot here. If each
DID
has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.– Erwin Brandstetter
Sep 8 at 22:56
Data distribution matters a lot here. If each
DID
has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.– Erwin Brandstetter
Sep 8 at 22:56
add a comment
|
3 Answers
3
active
oldest
votes
You cannot use window/ranking functions or their aliases in the WHERE
clause because they are evaluated in SELECT
, after the WHERE
has been evaluated.
You can use a subquery (derived table or CTE) to process a second WHERE
clause, after the window functions have been evaluated:
SELECT d1, d2,
"DID",
t1, t2,
coords_centroid_a,
coords_centroid_b
FROM
(
SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid AS coords_centroid_a,
b.coords_centroid AS coords_centroid_b
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
) AS t
WHERE t.timestamp_d1_rank = 1
AND t.timestamp_d2_rank = 1
ORDER BY "DID", t1 DESC, t2 ASC ;
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
Ah, you have bothGROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.
– ypercubeᵀᴹ
Sep 8 at 15:36
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
add a comment
|
You cannot use aliases in WHERE
clauses, nor can you use Window functions.
To demonstrate, consider this example (fiddle).
CREATE TABLE payment (amount INTEGER, pay_date DATE);
INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');
Then run the two following queries:
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist
and
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR: window functions are not allowed in WHERE
You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank
and timestamp_d2_rank
are aliases.
You have two options:
- First Option: use the aliases/Window functions within a subquery (also in fiddle)
SELECT * FROM
(
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;
Result:
amount pay_date the_sums
54 2019-09-01 110
56 2019-09-01 110
- Second Option: you can use a
CTE
(aka theWITH
clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.
For a good explanation of this whole area, see here.
add a comment
|
The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.
There is an transient CROSS JOIN
in your query that scales terribly with growing number of rows per ("DID", date_col)
. Example: If a "DID"
has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.
Based on this assumed table definition:
CREATE TABLE signals (
"DID" serial PRIMARY KEY
, date_col date NOT NULL -- ?
, time_col time NOT NULL -- ?
, coords_centroid geography -- ?
);
And assuming there are only few rows per ("DID", date_col)
, this query should be much faster already:
SELECT a."DID", a.date_col, t1, coords1, t2, coords2
FROM ( -- first row per ("DID", date_col)
SELECT DISTINCT ON ("DID", date_col)
"DID", date_col, time_col AS t2, coords_centroid AS coords2
FROM signals
ORDER BY "DID", date_col, time_col
) b
CROSS JOIN LATERAL ( -- corresponding last row from previous day
SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
FROM signals a
WHERE a."DID" = b."DID"
AND a.date_col = b.date_col - 1
ORDER BY time_col DESC
) a;
Needs an index on ("DID", date_col, time_col)
.
More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...
add a comment
|
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "182"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f247275%2frank-used-in-where-returns-invalid-column-but-exists-in-results-set%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
You cannot use window/ranking functions or their aliases in the WHERE
clause because they are evaluated in SELECT
, after the WHERE
has been evaluated.
You can use a subquery (derived table or CTE) to process a second WHERE
clause, after the window functions have been evaluated:
SELECT d1, d2,
"DID",
t1, t2,
coords_centroid_a,
coords_centroid_b
FROM
(
SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid AS coords_centroid_a,
b.coords_centroid AS coords_centroid_b
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
) AS t
WHERE t.timestamp_d1_rank = 1
AND t.timestamp_d2_rank = 1
ORDER BY "DID", t1 DESC, t2 ASC ;
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
Ah, you have bothGROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.
– ypercubeᵀᴹ
Sep 8 at 15:36
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
add a comment
|
You cannot use window/ranking functions or their aliases in the WHERE
clause because they are evaluated in SELECT
, after the WHERE
has been evaluated.
You can use a subquery (derived table or CTE) to process a second WHERE
clause, after the window functions have been evaluated:
SELECT d1, d2,
"DID",
t1, t2,
coords_centroid_a,
coords_centroid_b
FROM
(
SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid AS coords_centroid_a,
b.coords_centroid AS coords_centroid_b
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
) AS t
WHERE t.timestamp_d1_rank = 1
AND t.timestamp_d2_rank = 1
ORDER BY "DID", t1 DESC, t2 ASC ;
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
Ah, you have bothGROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.
– ypercubeᵀᴹ
Sep 8 at 15:36
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
add a comment
|
You cannot use window/ranking functions or their aliases in the WHERE
clause because they are evaluated in SELECT
, after the WHERE
has been evaluated.
You can use a subquery (derived table or CTE) to process a second WHERE
clause, after the window functions have been evaluated:
SELECT d1, d2,
"DID",
t1, t2,
coords_centroid_a,
coords_centroid_b
FROM
(
SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid AS coords_centroid_a,
b.coords_centroid AS coords_centroid_b
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
) AS t
WHERE t.timestamp_d1_rank = 1
AND t.timestamp_d2_rank = 1
ORDER BY "DID", t1 DESC, t2 ASC ;
You cannot use window/ranking functions or their aliases in the WHERE
clause because they are evaluated in SELECT
, after the WHERE
has been evaluated.
You can use a subquery (derived table or CTE) to process a second WHERE
clause, after the window functions have been evaluated:
SELECT d1, d2,
"DID",
t1, t2,
coords_centroid_a,
coords_centroid_b
FROM
(
SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid AS coords_centroid_a,
b.coords_centroid AS coords_centroid_b
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
) AS t
WHERE t.timestamp_d1_rank = 1
AND t.timestamp_d2_rank = 1
ORDER BY "DID", t1 DESC, t2 ASC ;
answered Sep 8 at 11:28
ypercubeᵀᴹypercubeᵀᴹ
82.8k11 gold badges140 silver badges233 bronze badges
82.8k11 gold badges140 silver badges233 bronze badges
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
Ah, you have bothGROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.
– ypercubeᵀᴹ
Sep 8 at 15:36
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
add a comment
|
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
Ah, you have bothGROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.
– ypercubeᵀᴹ
Sep 8 at 15:36
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k
– roy naufal
Sep 8 at 14:23
Ah, you have both
GROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.– ypercubeᵀᴹ
Sep 8 at 15:36
Ah, you have both
GROUP BY a."DID"
to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.– ypercubeᵀᴹ
Sep 8 at 15:36
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).
– ypercubeᵀᴹ
Sep 8 at 15:37
add a comment
|
You cannot use aliases in WHERE
clauses, nor can you use Window functions.
To demonstrate, consider this example (fiddle).
CREATE TABLE payment (amount INTEGER, pay_date DATE);
INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');
Then run the two following queries:
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist
and
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR: window functions are not allowed in WHERE
You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank
and timestamp_d2_rank
are aliases.
You have two options:
- First Option: use the aliases/Window functions within a subquery (also in fiddle)
SELECT * FROM
(
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;
Result:
amount pay_date the_sums
54 2019-09-01 110
56 2019-09-01 110
- Second Option: you can use a
CTE
(aka theWITH
clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.
For a good explanation of this whole area, see here.
add a comment
|
You cannot use aliases in WHERE
clauses, nor can you use Window functions.
To demonstrate, consider this example (fiddle).
CREATE TABLE payment (amount INTEGER, pay_date DATE);
INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');
Then run the two following queries:
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist
and
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR: window functions are not allowed in WHERE
You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank
and timestamp_d2_rank
are aliases.
You have two options:
- First Option: use the aliases/Window functions within a subquery (also in fiddle)
SELECT * FROM
(
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;
Result:
amount pay_date the_sums
54 2019-09-01 110
56 2019-09-01 110
- Second Option: you can use a
CTE
(aka theWITH
clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.
For a good explanation of this whole area, see here.
add a comment
|
You cannot use aliases in WHERE
clauses, nor can you use Window functions.
To demonstrate, consider this example (fiddle).
CREATE TABLE payment (amount INTEGER, pay_date DATE);
INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');
Then run the two following queries:
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist
and
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR: window functions are not allowed in WHERE
You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank
and timestamp_d2_rank
are aliases.
You have two options:
- First Option: use the aliases/Window functions within a subquery (also in fiddle)
SELECT * FROM
(
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;
Result:
amount pay_date the_sums
54 2019-09-01 110
56 2019-09-01 110
- Second Option: you can use a
CTE
(aka theWITH
clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.
For a good explanation of this whole area, see here.
You cannot use aliases in WHERE
clauses, nor can you use Window functions.
To demonstrate, consider this example (fiddle).
CREATE TABLE payment (amount INTEGER, pay_date DATE);
INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');
Then run the two following queries:
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist
and
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR: window functions are not allowed in WHERE
You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank
and timestamp_d2_rank
are aliases.
You have two options:
- First Option: use the aliases/Window functions within a subquery (also in fiddle)
SELECT * FROM
(
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;
Result:
amount pay_date the_sums
54 2019-09-01 110
56 2019-09-01 110
- Second Option: you can use a
CTE
(aka theWITH
clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.
For a good explanation of this whole area, see here.
edited Sep 9 at 13:41
answered Sep 8 at 6:30
VéraceVérace
18k3 gold badges39 silver badges58 bronze badges
18k3 gold badges39 silver badges58 bronze badges
add a comment
|
add a comment
|
The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.
There is an transient CROSS JOIN
in your query that scales terribly with growing number of rows per ("DID", date_col)
. Example: If a "DID"
has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.
Based on this assumed table definition:
CREATE TABLE signals (
"DID" serial PRIMARY KEY
, date_col date NOT NULL -- ?
, time_col time NOT NULL -- ?
, coords_centroid geography -- ?
);
And assuming there are only few rows per ("DID", date_col)
, this query should be much faster already:
SELECT a."DID", a.date_col, t1, coords1, t2, coords2
FROM ( -- first row per ("DID", date_col)
SELECT DISTINCT ON ("DID", date_col)
"DID", date_col, time_col AS t2, coords_centroid AS coords2
FROM signals
ORDER BY "DID", date_col, time_col
) b
CROSS JOIN LATERAL ( -- corresponding last row from previous day
SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
FROM signals a
WHERE a."DID" = b."DID"
AND a.date_col = b.date_col - 1
ORDER BY time_col DESC
) a;
Needs an index on ("DID", date_col, time_col)
.
More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...
add a comment
|
The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.
There is an transient CROSS JOIN
in your query that scales terribly with growing number of rows per ("DID", date_col)
. Example: If a "DID"
has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.
Based on this assumed table definition:
CREATE TABLE signals (
"DID" serial PRIMARY KEY
, date_col date NOT NULL -- ?
, time_col time NOT NULL -- ?
, coords_centroid geography -- ?
);
And assuming there are only few rows per ("DID", date_col)
, this query should be much faster already:
SELECT a."DID", a.date_col, t1, coords1, t2, coords2
FROM ( -- first row per ("DID", date_col)
SELECT DISTINCT ON ("DID", date_col)
"DID", date_col, time_col AS t2, coords_centroid AS coords2
FROM signals
ORDER BY "DID", date_col, time_col
) b
CROSS JOIN LATERAL ( -- corresponding last row from previous day
SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
FROM signals a
WHERE a."DID" = b."DID"
AND a.date_col = b.date_col - 1
ORDER BY time_col DESC
) a;
Needs an index on ("DID", date_col, time_col)
.
More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...
add a comment
|
The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.
There is an transient CROSS JOIN
in your query that scales terribly with growing number of rows per ("DID", date_col)
. Example: If a "DID"
has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.
Based on this assumed table definition:
CREATE TABLE signals (
"DID" serial PRIMARY KEY
, date_col date NOT NULL -- ?
, time_col time NOT NULL -- ?
, coords_centroid geography -- ?
);
And assuming there are only few rows per ("DID", date_col)
, this query should be much faster already:
SELECT a."DID", a.date_col, t1, coords1, t2, coords2
FROM ( -- first row per ("DID", date_col)
SELECT DISTINCT ON ("DID", date_col)
"DID", date_col, time_col AS t2, coords_centroid AS coords2
FROM signals
ORDER BY "DID", date_col, time_col
) b
CROSS JOIN LATERAL ( -- corresponding last row from previous day
SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
FROM signals a
WHERE a."DID" = b."DID"
AND a.date_col = b.date_col - 1
ORDER BY time_col DESC
) a;
Needs an index on ("DID", date_col, time_col)
.
More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...
The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.
There is an transient CROSS JOIN
in your query that scales terribly with growing number of rows per ("DID", date_col)
. Example: If a "DID"
has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.
Based on this assumed table definition:
CREATE TABLE signals (
"DID" serial PRIMARY KEY
, date_col date NOT NULL -- ?
, time_col time NOT NULL -- ?
, coords_centroid geography -- ?
);
And assuming there are only few rows per ("DID", date_col)
, this query should be much faster already:
SELECT a."DID", a.date_col, t1, coords1, t2, coords2
FROM ( -- first row per ("DID", date_col)
SELECT DISTINCT ON ("DID", date_col)
"DID", date_col, time_col AS t2, coords_centroid AS coords2
FROM signals
ORDER BY "DID", date_col, time_col
) b
CROSS JOIN LATERAL ( -- corresponding last row from previous day
SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
FROM signals a
WHERE a."DID" = b."DID"
AND a.date_col = b.date_col - 1
ORDER BY time_col DESC
) a;
Needs an index on ("DID", date_col, time_col)
.
More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...
answered Sep 8 at 23:47
Erwin BrandstetterErwin Brandstetter
107k11 gold badges229 silver badges357 bronze badges
107k11 gold badges229 silver badges357 bronze badges
add a comment
|
add a comment
|
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f247275%2frank-used-in-where-returns-invalid-column-but-exists-in-results-set%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.
– Colin 't Hart
Sep 8 at 12:10
Please always provide a table definition as
CREATE TABLE
statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date?timestamp_intersecting_date
? A lot can go wrong right here already ...– Erwin Brandstetter
Sep 8 at 22:49
Data distribution matters a lot here. If each
DID
has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.– Erwin Brandstetter
Sep 8 at 22:56