skip to Main Content

In a Postgres DB, given the following simplified dataset of orders:

ID    STORE_ID    UPDATED_AT
1     "store-1"   2021-01-01
2     "store-2"   2021-01-02
3     "store-3"   2021-01-03
4     "store-1"   2021-01-04
5     "store-2"   2021-01-05
6     "store-3"   2021-01-06
...
1000  "store-1"    2021-02-05

How can I write a query to select the the first 5 unique store id’s, and the first 10 orders per store ordered by oldest to newest, resulting in a maximum of 50 rows returned?
For example:

ID    STORE_ID    UPDATED_AT
1     "store-1"   2021-01-01
4     "store-1"   2021-01-04
...
13    "store-1"   2021-01-12

2     "store-2"   2021-01-02
5     "store-2"   2021-01-05
...
18    "store-2"   2021-01-08

3     "store-3"   2021-01-03
8     "store-3"   2021-01-12
...
22    "store-3"   2021-01-22

My goal is to process orders from oldest to newest, but process the oldest 10 orders per store as I can batch them with the Shopify API which would be more efficient.

For example, in my code I will combine it to something like:

{
  "store-1": [{ /* order */ }, { /* order */ }, { /* order */ }, ...], // Array of 10 orders
  "store-2": [{ /* order */ }, { /* order */ }, { /* order */ }, ...], // Array of 10 orders
  "store-3": [{ /* order */ }, { /* order */ }, { /* order */ }, ...], // Array of 10 orders
  "store-4": [{ /* order */ }, { /* order */ }, { /* order */ }, ...], // Array of 10 orders
  "store-5": [{ /* order */ }, { /* order */ }, { /* order */ }, ...], // Array of 10 orders
}

So I can run 5 API calls in parallel for each store.

I’ve tried multiple queries, including the following:

SELECT a.store_id FROM orders a
    INNER JOIN
        (SELECT store_id FROM orders GROUP BY store_id ORDER BY MIN(updated_at) LIMIT 5) b
    ON a.store_id = b.store_id
    ORDER BY a.updated_at ASC;

But I cannot limit the rows to 10 per store ID

3

Answers


  1. with st as
    (
     select distinct store_id 
     from orders 
     order by <your business rule here> 
     limit 5
    )
    select st.store_id, l.updated_at 
    from st 
    cross join lateral
    (
     select updated_at
     from orders o
     where o.store_id = st.store_id
     order by updated_at desc
     limit 10
    ) l;
    
    Login or Signup to reply.
  2. demo:db<>fiddle

    SELECT
        o.*
    FROM (
        SELECT
            store_id
        FROM (
            SELECT DISTINCT ON (store_id)    -- 1a
                store_id, updated_at
            FROM orders
            ORDER BY store_id, updated_at
        ) s
        ORDER BY updated_at                  -- 1b
        LIMIT 5
    ) s
    CROSS JOIN LATERAL (                     -- 2
        SELECT 
            *
        FROM orders o
        WHERE o.store_id = s.store_id
        ORDER BY updated_at
        LIMIT 10
    ) o
    

    1 a) Return only one record per store using DISTINCT ON b) Identify the top 5 stores ordering by the date.
    2. These stores can be used in a lateral join to filter the orders per store, again using the updated_at order.

    Login or Signup to reply.
  3. One more possible query:

    select ID,    STORE_ID,    UPDATED_AT
    from (
      select 
          orders.* ,
          row_number() over (partition by STORE_ID order by UPDATED_AT desc) rn_order,
          dense_rank() over(order by STORE_ID) store_rank
      from orders
      order by STORE_ID
    ) ranked 
    where store_rank <= [count srores] and rn_order <= [count orders per store];
    

    PostgreSQL fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search