RANK used in 'where' returns invalid column, but exists in results setUsing column alias in a WHERE clause doesn't workDelete duplicate records with no change in betweenPostgreSQL DELETE FROM fails with `Error: attempted to delete invisible tuple`search_path set and seen but not used!Index not used with many WHERE LOWER(column) IN (…) conditions, works fine with just a fewMulti-Column index not used for index-only scan, but partial index isPostgreSQL 9.5 query performance depends on JOINed column in SELECT clauseWhich data model/schema to apply to a timeseries datastore for data sources with varying fieldspostgresql random function returns duplicate results when order by random used in queryQuery returns results on pgAdmin running on remote client but not directly through psql on db server

Are conjugate vectors unique?

Grammar explanation for ~よし

Isn't any conversation with the US president quid-pro-quo?

Unable to Style the checkboxes in Sitecore Forms

How should I handle a player attacking from the top of a tree?

How to handle a colleague who appears helpful in front of manager but doesn't help in private?

Days in indexed month

Steampunk book about a bounty hunter teen girl in London

Is "montäglich" commonly used?

N-Dimensional Cartesian Product

Can a UK passport valid for two months travel to Germany post-brexit?

Prove that the equation has only one real root.

Twelve Labours #08 - Dark Horse Bookmakers

How to get rid of vertical white lines in a table?

What does Yoda's species eat?

Can the Protection fighting style be used in this way?

Why would prey creatures not hate predator creatures?

How can I improve combat so my players don't always use the strategy of collectively focusing fire on one enemy at a time until it's dead?

If a photon truly goes through both slits (at the same time), then why can't we detect it at both slits (at the same time)?

Is it possible to change paper title after send back from reviewers for revising?

Why are there notes in my guitar chord book that are not part of the chord's formula?

Ran out of space of newly installed HDD?

Is it a mistake to use a password that has previously been used (by anyone ever)?

Stalemate situation with all pieces on the board



RANK used in 'where' returns invalid column, but exists in results set


Using column alias in a WHERE clause doesn't workDelete duplicate records with no change in betweenPostgreSQL DELETE FROM fails with `Error: attempted to delete invisible tuple`search_path set and seen but not used!Index not used with many WHERE LOWER(column) IN (…) conditions, works fine with just a fewMulti-Column index not used for index-only scan, but partial index isPostgreSQL 9.5 query performance depends on JOINed column in SELECT clauseWhich data model/schema to apply to a timeseries datastore for data sources with varying fieldspostgresql random function returns duplicate results when order by random used in queryQuery returns results on pgAdmin running on remote client but not directly through psql on db server






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;









4


















I basically have a table with date, timestamp, DID, coordinates.



I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.



This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":



SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2, 
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc


How to solve this?










share|improve this question



























  • Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.

    – Colin 't Hart
    Sep 8 at 12:10











  • Please always provide a table definition as CREATE TABLE statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date? A lot can go wrong right here already ...

    – Erwin Brandstetter
    Sep 8 at 22:49











  • Data distribution matters a lot here. If each DID has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.

    – Erwin Brandstetter
    Sep 8 at 22:56

















4


















I basically have a table with date, timestamp, DID, coordinates.



I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.



This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":



SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2, 
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc


How to solve this?










share|improve this question



























  • Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.

    – Colin 't Hart
    Sep 8 at 12:10











  • Please always provide a table definition as CREATE TABLE statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date? A lot can go wrong right here already ...

    – Erwin Brandstetter
    Sep 8 at 22:49











  • Data distribution matters a lot here. If each DID has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.

    – Erwin Brandstetter
    Sep 8 at 22:56













4













4









4








I basically have a table with date, timestamp, DID, coordinates.



I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.



This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":



SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2, 
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc


How to solve this?










share|improve this question
















I basically have a table with date, timestamp, DID, coordinates.



I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.



This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":



SELECT a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2, 
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid, b.coords_centroid
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc


How to solve this?







postgresql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Sep 8 at 22:47









Erwin Brandstetter

107k11 gold badges229 silver badges357 bronze badges




107k11 gold badges229 silver badges357 bronze badges










asked Sep 8 at 1:27









roy naufalroy naufal

1674 bronze badges




1674 bronze badges















  • Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.

    – Colin 't Hart
    Sep 8 at 12:10











  • Please always provide a table definition as CREATE TABLE statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date? A lot can go wrong right here already ...

    – Erwin Brandstetter
    Sep 8 at 22:49











  • Data distribution matters a lot here. If each DID has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.

    – Erwin Brandstetter
    Sep 8 at 22:56

















  • Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.

    – Colin 't Hart
    Sep 8 at 12:10











  • Please always provide a table definition as CREATE TABLE statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date? A lot can go wrong right here already ...

    – Erwin Brandstetter
    Sep 8 at 22:49











  • Data distribution matters a lot here. If each DID has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.

    – Erwin Brandstetter
    Sep 8 at 22:56
















Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.

– Colin 't Hart
Sep 8 at 12:10





Yes, you can’t refer to them in the where clause as window functions are evaluated after the where clause. You can put your query into a with clause, or wrap it as a subquery.

– Colin 't Hart
Sep 8 at 12:10













Please always provide a table definition as CREATE TABLE statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date? A lot can go wrong right here already ...

– Erwin Brandstetter
Sep 8 at 22:49





Please always provide a table definition as CREATE TABLE statement showing exact data types and constraints. And your version of Postgres. Helps to avoid misunderstandings and makes it so much easier to help. You have a timestamp and a date? timestamp_intersecting_date? A lot can go wrong right here already ...

– Erwin Brandstetter
Sep 8 at 22:49













Data distribution matters a lot here. If each DID has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.

– Erwin Brandstetter
Sep 8 at 22:56





Data distribution matters a lot here. If each DID has a (at most) a single row per day, you would use a different query than when there can be many. If time zones are involved, the definition of "day" may need some attention.

– Erwin Brandstetter
Sep 8 at 22:56










3 Answers
3






active

oldest

votes


















3



















You cannot use window/ranking functions or their aliases in the WHERE clause because they are evaluated in SELECT, after the WHERE has been evaluated.



You can use a subquery (derived table or CTE) to process a second WHERE clause, after the window functions have been evaluated:



SELECT d1, d2,
"DID",
t1, t2,
coords_centroid_a,
coords_centroid_b
FROM
(
SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
a."DID",
a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
RANK () OVER (
PARTITION BY a.timestamp_intersecting_date
ORDER BY a.timestamp_intersecting_max DESC
) timestamp_d1_rank ,
RANK () OVER (
PARTITION BY b.timestamp_intersecting_date
ORDER BY b.timestamp_intersecting_max ASC
) timestamp_d2_rank,
a.coords_centroid AS coords_centroid_a,
b.coords_centroid AS coords_centroid_b
FROM
signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
) AS t
WHERE t.timestamp_d1_rank = 1
AND t.timestamp_d2_rank = 1
ORDER BY "DID", t1 DESC, t2 ASC ;





share|improve this answer


























  • thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

    – roy naufal
    Sep 8 at 14:23











  • Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

    – ypercubeᵀᴹ
    Sep 8 at 15:36











  • If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

    – ypercubeᵀᴹ
    Sep 8 at 15:37


















3



















You cannot use aliases in WHERE clauses, nor can you use Window functions.



To demonstrate, consider this example (fiddle).



CREATE TABLE payment (amount INTEGER, pay_date DATE);

INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');


Then run the two following queries:



SELECT 
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist


and



SELECT 
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR: window functions are not allowed in WHERE


You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank and timestamp_d2_rank are aliases.



You have two options:



  • First Option: use the aliases/Window functions within a subquery (also in fiddle)




SELECT * FROM
(
SELECT
*,
SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;


Result:



amount pay_date the_sums
54 2019-09-01 110
56 2019-09-01 110


  • Second Option: you can use a CTE (aka the WITH clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.

For a good explanation of this whole area, see here.






share|improve this answer



































    2



















    The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.



    There is an transient CROSS JOIN in your query that scales terribly with growing number of rows per ("DID", date_col). Example: If a "DID" has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.



    Based on this assumed table definition:



    CREATE TABLE signals (
    "DID" serial PRIMARY KEY
    , date_col date NOT NULL -- ?
    , time_col time NOT NULL -- ?
    , coords_centroid geography -- ?
    );


    And assuming there are only few rows per ("DID", date_col), this query should be much faster already:



    SELECT a."DID", a.date_col, t1, coords1, t2, coords2
    FROM ( -- first row per ("DID", date_col)
    SELECT DISTINCT ON ("DID", date_col)
    "DID", date_col, time_col AS t2, coords_centroid AS coords2
    FROM signals
    ORDER BY "DID", date_col, time_col
    ) b
    CROSS JOIN LATERAL ( -- corresponding last row from previous day
    SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
    FROM signals a
    WHERE a."DID" = b."DID"
    AND a.date_col = b.date_col - 1
    ORDER BY time_col DESC
    ) a;


    Needs an index on ("DID", date_col, time_col).



    More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...






    share|improve this answer



























      Your Answer








      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "182"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );














      draft saved

      draft discarded
















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f247275%2frank-used-in-where-returns-invalid-column-but-exists-in-results-set%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown


























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      3



















      You cannot use window/ranking functions or their aliases in the WHERE clause because they are evaluated in SELECT, after the WHERE has been evaluated.



      You can use a subquery (derived table or CTE) to process a second WHERE clause, after the window functions have been evaluated:



      SELECT d1, d2,
      "DID",
      t1, t2,
      coords_centroid_a,
      coords_centroid_b
      FROM
      (
      SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
      a."DID",
      a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
      RANK () OVER (
      PARTITION BY a.timestamp_intersecting_date
      ORDER BY a.timestamp_intersecting_max DESC
      ) timestamp_d1_rank ,
      RANK () OVER (
      PARTITION BY b.timestamp_intersecting_date
      ORDER BY b.timestamp_intersecting_max ASC
      ) timestamp_d2_rank,
      a.coords_centroid AS coords_centroid_a,
      b.coords_centroid AS coords_centroid_b
      FROM
      signals a
      INNER JOIN signals b ON (a."DID" = b."DID")
      WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
      AND a."DID" = b."DID"
      ) AS t
      WHERE t.timestamp_d1_rank = 1
      AND t.timestamp_d2_rank = 1
      ORDER BY "DID", t1 DESC, t2 ASC ;





      share|improve this answer


























      • thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

        – roy naufal
        Sep 8 at 14:23











      • Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

        – ypercubeᵀᴹ
        Sep 8 at 15:36











      • If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

        – ypercubeᵀᴹ
        Sep 8 at 15:37















      3



















      You cannot use window/ranking functions or their aliases in the WHERE clause because they are evaluated in SELECT, after the WHERE has been evaluated.



      You can use a subquery (derived table or CTE) to process a second WHERE clause, after the window functions have been evaluated:



      SELECT d1, d2,
      "DID",
      t1, t2,
      coords_centroid_a,
      coords_centroid_b
      FROM
      (
      SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
      a."DID",
      a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
      RANK () OVER (
      PARTITION BY a.timestamp_intersecting_date
      ORDER BY a.timestamp_intersecting_max DESC
      ) timestamp_d1_rank ,
      RANK () OVER (
      PARTITION BY b.timestamp_intersecting_date
      ORDER BY b.timestamp_intersecting_max ASC
      ) timestamp_d2_rank,
      a.coords_centroid AS coords_centroid_a,
      b.coords_centroid AS coords_centroid_b
      FROM
      signals a
      INNER JOIN signals b ON (a."DID" = b."DID")
      WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
      AND a."DID" = b."DID"
      ) AS t
      WHERE t.timestamp_d1_rank = 1
      AND t.timestamp_d2_rank = 1
      ORDER BY "DID", t1 DESC, t2 ASC ;





      share|improve this answer


























      • thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

        – roy naufal
        Sep 8 at 14:23











      • Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

        – ypercubeᵀᴹ
        Sep 8 at 15:36











      • If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

        – ypercubeᵀᴹ
        Sep 8 at 15:37













      3















      3











      3









      You cannot use window/ranking functions or their aliases in the WHERE clause because they are evaluated in SELECT, after the WHERE has been evaluated.



      You can use a subquery (derived table or CTE) to process a second WHERE clause, after the window functions have been evaluated:



      SELECT d1, d2,
      "DID",
      t1, t2,
      coords_centroid_a,
      coords_centroid_b
      FROM
      (
      SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
      a."DID",
      a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
      RANK () OVER (
      PARTITION BY a.timestamp_intersecting_date
      ORDER BY a.timestamp_intersecting_max DESC
      ) timestamp_d1_rank ,
      RANK () OVER (
      PARTITION BY b.timestamp_intersecting_date
      ORDER BY b.timestamp_intersecting_max ASC
      ) timestamp_d2_rank,
      a.coords_centroid AS coords_centroid_a,
      b.coords_centroid AS coords_centroid_b
      FROM
      signals a
      INNER JOIN signals b ON (a."DID" = b."DID")
      WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
      AND a."DID" = b."DID"
      ) AS t
      WHERE t.timestamp_d1_rank = 1
      AND t.timestamp_d2_rank = 1
      ORDER BY "DID", t1 DESC, t2 ASC ;





      share|improve this answer














      You cannot use window/ranking functions or their aliases in the WHERE clause because they are evaluated in SELECT, after the WHERE has been evaluated.



      You can use a subquery (derived table or CTE) to process a second WHERE clause, after the window functions have been evaluated:



      SELECT d1, d2,
      "DID",
      t1, t2,
      coords_centroid_a,
      coords_centroid_b
      FROM
      (
      SELECT a.timestamp_intersecting_date d1, b.timestamp_intersecting_date d2,
      a."DID",
      a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
      RANK () OVER (
      PARTITION BY a.timestamp_intersecting_date
      ORDER BY a.timestamp_intersecting_max DESC
      ) timestamp_d1_rank ,
      RANK () OVER (
      PARTITION BY b.timestamp_intersecting_date
      ORDER BY b.timestamp_intersecting_max ASC
      ) timestamp_d2_rank,
      a.coords_centroid AS coords_centroid_a,
      b.coords_centroid AS coords_centroid_b
      FROM
      signals a
      INNER JOIN signals b ON (a."DID" = b."DID")
      WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
      AND a."DID" = b."DID"
      ) AS t
      WHERE t.timestamp_d1_rank = 1
      AND t.timestamp_d2_rank = 1
      ORDER BY "DID", t1 DESC, t2 ASC ;






      share|improve this answer













      share|improve this answer




      share|improve this answer










      answered Sep 8 at 11:28









      ypercubeᵀᴹypercubeᵀᴹ

      82.8k11 gold badges140 silver badges233 bronze badges




      82.8k11 gold badges140 silver badges233 bronze badges















      • thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

        – roy naufal
        Sep 8 at 14:23











      • Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

        – ypercubeᵀᴹ
        Sep 8 at 15:36











      • If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

        – ypercubeᵀᴹ
        Sep 8 at 15:37

















      • thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

        – roy naufal
        Sep 8 at 14:23











      • Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

        – ypercubeᵀᴹ
        Sep 8 at 15:36











      • If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

        – ypercubeᵀᴹ
        Sep 8 at 15:37
















      thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

      – roy naufal
      Sep 8 at 14:23





      thanks, that worked...i tried something similar but using WITH t AS ( slect ..., what you put in th from statement) select t.*, from t then joins where ranks=1/2, it was forcing me to add the columsn to group by or use aggregates, it ended up looking like this, mind telling me my mess up ? thanks pastebin.com/5aN3E29k

      – roy naufal
      Sep 8 at 14:23













      Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

      – ypercubeᵀᴹ
      Sep 8 at 15:36





      Ah, you have both GROUP BY a."DID" to get MAX/MIN and window functions there. There are ways that that's valid but it's usually complicated.

      – ypercubeᵀᴹ
      Sep 8 at 15:36













      If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

      – ypercubeᵀᴹ
      Sep 8 at 15:37





      If my query above solved the issue, use it. You cab use a CTE of course but that may result in different performance, as Postgres materializes CTE results (at least in versions less than 12).

      – ypercubeᵀᴹ
      Sep 8 at 15:37













      3



















      You cannot use aliases in WHERE clauses, nor can you use Window functions.



      To demonstrate, consider this example (fiddle).



      CREATE TABLE payment (amount INTEGER, pay_date DATE);

      INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
      (156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');


      Then run the two following queries:



      SELECT 
      *,
      SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
      FROM payment
      WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist


      and



      SELECT 
      *,
      SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
      FROM payment
      WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
      -- ERROR: window functions are not allowed in WHERE


      You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank and timestamp_d2_rank are aliases.



      You have two options:



      • First Option: use the aliases/Window functions within a subquery (also in fiddle)




      SELECT * FROM
      (
      SELECT
      *,
      SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
      FROM payment
      ) AS tab
      WHERE the_sums < 200;


      Result:



      amount pay_date the_sums
      54 2019-09-01 110
      56 2019-09-01 110


      • Second Option: you can use a CTE (aka the WITH clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.

      For a good explanation of this whole area, see here.






      share|improve this answer
































        3



















        You cannot use aliases in WHERE clauses, nor can you use Window functions.



        To demonstrate, consider this example (fiddle).



        CREATE TABLE payment (amount INTEGER, pay_date DATE);

        INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
        (156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');


        Then run the two following queries:



        SELECT 
        *,
        SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
        FROM payment
        WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist


        and



        SELECT 
        *,
        SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
        FROM payment
        WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
        -- ERROR: window functions are not allowed in WHERE


        You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank and timestamp_d2_rank are aliases.



        You have two options:



        • First Option: use the aliases/Window functions within a subquery (also in fiddle)




        SELECT * FROM
        (
        SELECT
        *,
        SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
        FROM payment
        ) AS tab
        WHERE the_sums < 200;


        Result:



        amount pay_date the_sums
        54 2019-09-01 110
        56 2019-09-01 110


        • Second Option: you can use a CTE (aka the WITH clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.

        For a good explanation of this whole area, see here.






        share|improve this answer






























          3















          3











          3









          You cannot use aliases in WHERE clauses, nor can you use Window functions.



          To demonstrate, consider this example (fiddle).



          CREATE TABLE payment (amount INTEGER, pay_date DATE);

          INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
          (156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');


          Then run the two following queries:



          SELECT 
          *,
          SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
          FROM payment
          WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist


          and



          SELECT 
          *,
          SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
          FROM payment
          WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
          -- ERROR: window functions are not allowed in WHERE


          You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank and timestamp_d2_rank are aliases.



          You have two options:



          • First Option: use the aliases/Window functions within a subquery (also in fiddle)




          SELECT * FROM
          (
          SELECT
          *,
          SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
          FROM payment
          ) AS tab
          WHERE the_sums < 200;


          Result:



          amount pay_date the_sums
          54 2019-09-01 110
          56 2019-09-01 110


          • Second Option: you can use a CTE (aka the WITH clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.

          For a good explanation of this whole area, see here.






          share|improve this answer
















          You cannot use aliases in WHERE clauses, nor can you use Window functions.



          To demonstrate, consider this example (fiddle).



          CREATE TABLE payment (amount INTEGER, pay_date DATE);

          INSERT INTO payment VALUES (54, '2019-09-01'), (56, '2019-09-01'), (154, '2019-09-02'),
          (156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');


          Then run the two following queries:



          SELECT 
          *,
          SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
          FROM payment
          WHERE the_sums > 200; -- ERROR: column "the_sums" does not exist


          and



          SELECT 
          *,
          SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
          FROM payment
          WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
          -- ERROR: window functions are not allowed in WHERE


          You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank and timestamp_d2_rank are aliases.



          You have two options:



          • First Option: use the aliases/Window functions within a subquery (also in fiddle)




          SELECT * FROM
          (
          SELECT
          *,
          SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
          FROM payment
          ) AS tab
          WHERE the_sums < 200;


          Result:



          amount pay_date the_sums
          54 2019-09-01 110
          56 2019-09-01 110


          • Second Option: you can use a CTE (aka the WITH clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.

          For a good explanation of this whole area, see here.







          share|improve this answer















          share|improve this answer




          share|improve this answer








          edited Sep 9 at 13:41

























          answered Sep 8 at 6:30









          VéraceVérace

          18k3 gold badges39 silver badges58 bronze badges




          18k3 gold badges39 silver badges58 bronze badges
























              2



















              The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.



              There is an transient CROSS JOIN in your query that scales terribly with growing number of rows per ("DID", date_col). Example: If a "DID" has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.



              Based on this assumed table definition:



              CREATE TABLE signals (
              "DID" serial PRIMARY KEY
              , date_col date NOT NULL -- ?
              , time_col time NOT NULL -- ?
              , coords_centroid geography -- ?
              );


              And assuming there are only few rows per ("DID", date_col), this query should be much faster already:



              SELECT a."DID", a.date_col, t1, coords1, t2, coords2
              FROM ( -- first row per ("DID", date_col)
              SELECT DISTINCT ON ("DID", date_col)
              "DID", date_col, time_col AS t2, coords_centroid AS coords2
              FROM signals
              ORDER BY "DID", date_col, time_col
              ) b
              CROSS JOIN LATERAL ( -- corresponding last row from previous day
              SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
              FROM signals a
              WHERE a."DID" = b."DID"
              AND a.date_col = b.date_col - 1
              ORDER BY time_col DESC
              ) a;


              Needs an index on ("DID", date_col, time_col).



              More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...






              share|improve this answer






























                2



















                The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.



                There is an transient CROSS JOIN in your query that scales terribly with growing number of rows per ("DID", date_col). Example: If a "DID" has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.



                Based on this assumed table definition:



                CREATE TABLE signals (
                "DID" serial PRIMARY KEY
                , date_col date NOT NULL -- ?
                , time_col time NOT NULL -- ?
                , coords_centroid geography -- ?
                );


                And assuming there are only few rows per ("DID", date_col), this query should be much faster already:



                SELECT a."DID", a.date_col, t1, coords1, t2, coords2
                FROM ( -- first row per ("DID", date_col)
                SELECT DISTINCT ON ("DID", date_col)
                "DID", date_col, time_col AS t2, coords_centroid AS coords2
                FROM signals
                ORDER BY "DID", date_col, time_col
                ) b
                CROSS JOIN LATERAL ( -- corresponding last row from previous day
                SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
                FROM signals a
                WHERE a."DID" = b."DID"
                AND a.date_col = b.date_col - 1
                ORDER BY time_col DESC
                ) a;


                Needs an index on ("DID", date_col, time_col).



                More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...






                share|improve this answer




























                  2















                  2











                  2









                  The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.



                  There is an transient CROSS JOIN in your query that scales terribly with growing number of rows per ("DID", date_col). Example: If a "DID" has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.



                  Based on this assumed table definition:



                  CREATE TABLE signals (
                  "DID" serial PRIMARY KEY
                  , date_col date NOT NULL -- ?
                  , time_col time NOT NULL -- ?
                  , coords_centroid geography -- ?
                  );


                  And assuming there are only few rows per ("DID", date_col), this query should be much faster already:



                  SELECT a."DID", a.date_col, t1, coords1, t2, coords2
                  FROM ( -- first row per ("DID", date_col)
                  SELECT DISTINCT ON ("DID", date_col)
                  "DID", date_col, time_col AS t2, coords_centroid AS coords2
                  FROM signals
                  ORDER BY "DID", date_col, time_col
                  ) b
                  CROSS JOIN LATERAL ( -- corresponding last row from previous day
                  SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
                  FROM signals a
                  WHERE a."DID" = b."DID"
                  AND a.date_col = b.date_col - 1
                  ORDER BY time_col DESC
                  ) a;


                  Needs an index on ("DID", date_col, time_col).



                  More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...






                  share|improve this answer














                  The confusion with syntax rules concerning input and output column names has been addressed properly by existing answers. This is about a lurking performance problem.



                  There is an transient CROSS JOIN in your query that scales terribly with growing number of rows per ("DID", date_col). Example: If a "DID" has 100 coordinates per day, the query ends up processing 10.000 (!) combinations with the next day.



                  Based on this assumed table definition:



                  CREATE TABLE signals (
                  "DID" serial PRIMARY KEY
                  , date_col date NOT NULL -- ?
                  , time_col time NOT NULL -- ?
                  , coords_centroid geography -- ?
                  );


                  And assuming there are only few rows per ("DID", date_col), this query should be much faster already:



                  SELECT a."DID", a.date_col, t1, coords1, t2, coords2
                  FROM ( -- first row per ("DID", date_col)
                  SELECT DISTINCT ON ("DID", date_col)
                  "DID", date_col, time_col AS t2, coords_centroid AS coords2
                  FROM signals
                  ORDER BY "DID", date_col, time_col
                  ) b
                  CROSS JOIN LATERAL ( -- corresponding last row from previous day
                  SELECT "DID", date_col, time_col AS t1, coords_centroid AS coords1
                  FROM signals a
                  WHERE a."DID" = b."DID"
                  AND a.date_col = b.date_col - 1
                  ORDER BY time_col DESC
                  ) a;


                  Needs an index on ("DID", date_col, time_col).



                  More might be done here, depending on your actual setup and actual data distribution. I'll elaborate if you are interested and provide missing information ...







                  share|improve this answer













                  share|improve this answer




                  share|improve this answer










                  answered Sep 8 at 23:47









                  Erwin BrandstetterErwin Brandstetter

                  107k11 gold badges229 silver badges357 bronze badges




                  107k11 gold badges229 silver badges357 bronze badges































                      draft saved

                      draft discarded















































                      Thanks for contributing an answer to Database Administrators Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f247275%2frank-used-in-where-returns-invalid-column-but-exists-in-results-set%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown









                      Popular posts from this blog

                      Tamil (spriik) Luke uk diar | Nawigatjuun

                      Align equal signs while including text over equalitiesAMS align: left aligned text/math plus multicolumn alignmentMultiple alignmentsAligning equations in multiple placesNumbering and aligning an equation with multiple columnsHow to align one equation with another multline equationUsing \ in environments inside the begintabularxNumber equations and preserving alignment of equal signsHow can I align equations to the left and to the right?Double equation alignment problem within align enviromentAligned within align: Why are they right-aligned?

                      Training a classifier when some of the features are unknownWhy does Gradient Boosting regression predict negative values when there are no negative y-values in my training set?How to improve an existing (trained) classifier?What is effect when I set up some self defined predisctor variables?Why Matlab neural network classification returns decimal values on prediction dataset?Fitting and transforming text data in training, testing, and validation setsHow to quantify the performance of the classifier (multi-class SVM) using the test data?How do I control for some patients providing multiple samples in my training data?Training and Test setTraining a convolutional neural network for image denoising in MatlabShouldn't an autoencoder with #(neurons in hidden layer) = #(neurons in input layer) be “perfect”?