skip to Main Content

I have an input table like this:

Load_date projected_date current stock consumed demand
2020-01-01 2020-01-01 100 15
2020-01-01 2020-01-02 150 80
2020-01-01 2020-01-03 200 25
2020-01-01 2020-01-04 300 70
2020-01-01 2020-01-05 400 65

My intermediate output will be like this:

For projected date: 2020-01-01

Load_date projected_date current stock consumed demand rolling consumed demand Finished projected date
2020-01-01 2020-01-01 100 15 2020-01-03
2020-01-01 2020-01-02 150 80 80
2020-01-01 2020-01-03 200 25 105
2020-01-01 2020-01-04 300 70 175
2020-01-01 2020-01-05 400 65 240

For projected date: 2020-01-02

Load_date projected_date current stock consumed demand rolling consumed demand Finished projected date
2020-01-01 2020-01-01 100 15
2020-01-01 2020-01-02 150 80 2020-01-05
2020-01-01 2020-01-03 200 25 25
2020-01-01 2020-01-04 300 70 95
2020-01-01 2020-01-05 400 65 160

Similarly we have to derive the finished projected date for other projected dates

Final output will be:

Load_date projected_date current stock consumed demand Finished projected date
2020-01-01 2020-01-01 100 15 2020-01-03
2020-01-01 2020-01-02 150 80 2020-01-05
2020-01-01 2020-01-03 200 25
2020-01-01 2020-01-04 300 70
2020-01-01 2020-01-05 400 65

Can you please help me with the SQL query? I have tried to write query in Databricks SQL, but not getting this output. Actually it is a small part of my entire requirement.

The logic here is to calculate column Finished projected date, whenever the rolling consumed demand will exceed the current stock for a particular projected date, it will return that rolling consumed demand’s corresponding projected date.

Like in the above table, the 2020-01-01 projected date has current stock of 100. The rolling consumed demand calculation starts 1 day after actual projected date i.e. from 2020-01-02 onwards. The rolling consumed demand reaches to 105 on projected date 2020-01-03. So, for 1st row it will return the finished projected date as 2020-01-03.

Similarly for projected date 2020-01-02, the rolling consumed demand exceeds the total demand on 2020-01-05. As 160 > 150. So the finished projected date would be 2020-01-05.

2

Answers


  1. I haven’t try it yet.

    But should be look like this using sample table. You could this as an initial guide.

    SELECT Date,
           Sales,
           SUM(Sales) OVER (ORDER BY Date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as RollingCumulativeSales
    FROM daily_sales
    ORDER BY Date;
    

    You could find refence and sample on this site. SQL Cumulative Sum: A Comprehensive Guide

    Login or Signup to reply.
  2. To achieve the desired output, you can use a SQL query that involves window functions to compute the rolling consumed demand and then determine the finished projected date based on the logic you provided.

    Here is the SQL query that should work in Databricks SQL:

    WITH CTE_RollingDemand AS (
        SELECT
            Load_date,
            projected_date,
            current_stock,
            consumed_demand,
            SUM(consumed_demand) OVER (PARTITION BY Load_date ORDER BY projected_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rolling_consumed_demand
        FROM
            input_table
    ),
    CTE_FinishedProjectedDate AS (
        SELECT
            a.Load_date,
            a.projected_date,
            a.current_stock,
            a.consumed_demand,
            MIN(b.projected_date) AS Finished_projected_date
        FROM
            CTE_RollingDemand a
            LEFT JOIN CTE_RollingDemand b ON a.Load_date = b.Load_date AND b.projected_date > a.projected_date
        WHERE
            b.rolling_consumed_demand > a.current_stock
        GROUP BY
            a.Load_date,
            a.projected_date,
            a.current_stock,
            a.consumed_demand
    )
    SELECT
        t.Load_date,
        t.projected_date,
        t.current_stock,
        t.consumed_demand,
        c.Finished_projected_date
    FROM
        input_table t
        LEFT JOIN CTE_FinishedProjectedDate c ON t.Load_date = c.Load_date AND t.projected_date = c.projected_date
    ORDER BY
        t.Load_date,
        t.projected_date;
    

    Explanation:

    1. CTE_RollingDemand: This common table expression calculates the rolling consumed demand for each projected date.

      • SUM(consumed_demand) OVER (PARTITION BY Load_date ORDER BY projected_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) computes the cumulative sum of consumed demand.
    2. CTE_FinishedProjectedDate: This common table expression finds the first projected date where the rolling consumed demand exceeds the current stock.

      • MIN(b.projected_date) AS Finished_projected_date identifies the earliest projected date where the rolling consumed demand surpasses the current stock.
    3. Final Select: The final select statement joins the original input table with the CTE_FinishedProjectedDate to get the Finished_projected_date for each row, ordering the results by Load_date and projected_date.

    Assumptions:

    • The input table name is input_table. Please replace it with the actual name of your table.

    This query should give you the final output with the finished projected date calculated as per your requirements.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search