skip to Main Content

I have a query gets opening and closing balances for each month of customers from their transactions and then get a company total for the month. but if a customer does not have any transactions for multiple months i need to get the last closing and use it to fill all empty months.. right now im using LAG to get the last closing balance before any empty but when i carry that forward i can only carry it forward into 1 empty month which results in incorect calculation of the company’s total… any help will be appreciated.

2

Answers


  1. Chosen as BEST ANSWER

    found that this is line made my query working for anybody that care.

    customer_month_balances AS ( -- Combine customer-month combinations with their balances SELECT pm."customer_id", pm."month", COALESCE(mb."month_opening_balance", NULL) AS "month_opening_balance", COALESCE(mb."month_closing_balance", NULL) AS "month_closing_balance" FROM customer_months pm LEFT JOIN monthly_balances mb ON pm."customer_id" = mb."customer_id" AND pm."month" = mb."month" ), propagated_balances AS (

    SELECT
        customer_id,
        month,
        month_opening_balance,
        month_closing_balance,
        LAST_VALUE(month_closing_balance IGNORE NULLS) OVER (
            PARTITION BY customer_id
            ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS propagated_closing_balance
    FROM
        customer_month_balances
    

    )


  2. I have created sample data and output based on my understanding of the question, but can edit as required.

    INSERT INTO transactions (customer_id, transaction_date, balance) VALUES
    (1, '2024-01-01', 1000),
    (1, '2024-03-01', 500),
    (1, '2024-05-01', 1000),
    (2, '2024-02-01', 500),
    (2, '2024-04-01', 500);
    

    This shows transactions are missing for customer id 1 for months like February,April etc.

    I then created a CTE (month_series) which would create all months based on current date for all customers, you can just use a date dimension (if you have one) and generate all the months.

    The important logic lies in CTE balances_with_gaps, which will populate the closing balance for the latest previous month where it was available, that is the reason for using lt.month <= ms.month and ORDER BY and LIMIT

    WITH month_series AS (
        -- Generate all months for each customer up to the max month
        SELECT DISTINCT customer_id,
                        DATE_FORMAT(transaction_date, '%Y-%m-01') AS month
        FROM transactions
        UNION
        SELECT DISTINCT customer_id, 
                        DATE_FORMAT(CURRENT_DATE - INTERVAL seq MONTH, '%Y-%m-01') AS month
        FROM transactions
        JOIN (SELECT 0 AS seq UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 
              UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 
              UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12) AS seq
        WHERE CURRENT_DATE - INTERVAL seq MONTH >= '2024-01-01'  -- Adjust the range as needed
    )
    ,
    latest_transactions AS (
        -- Get the most recent transaction per customer per month
        SELECT t.customer_id,
               DATE_FORMAT(t.transaction_date, '%Y-%m-01') AS month,
               t.balance
        FROM transactions t
    )
    
    
    ,balances_with_gaps AS (
        -- For each customer and month, get the most recent transaction balance before that month
        SELECT ms.customer_id,
               ms.month,
               COALESCE(
                   (SELECT lt.balance
                    FROM latest_transactions lt
                    WHERE lt.customer_id = ms.customer_id
                      AND lt.month <= ms.month
                    ORDER BY lt.month DESC
                    LIMIT 1), 0) AS closing_balance  -- 0 is the initial balance for months with no transactions
        FROM month_series ms
    )
    SELECT customer_id, month, closing_balance
    FROM balances_with_gaps
    ORDER BY customer_id, month;
    

    Fiddle

    You should be mostly able to convert balances_with_gaps CTE in redshift as it supports limit, I used mysql as an example.

    Output

    enter image description here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search