skip to Main Content

I’m learning how to query records in sql and i got stuck at one problem that i can’t figure out myself.
So basically i have data base called sql_test and i have to group users based on the intervals that are -1 year, -2 years from 2016-12-31 down and based on revenue that is bigger then 2000.

I got users sorted and grouped but when i check how were they grouped by looking at order creation date it looks like my query didn’t followed my conditions about interval of years.

It looks like CASE WHEN is only using SUM(revenue) > 2000 and completely ignoring the interval condition.

I’m new in the SQL world, and I’m a self-learner, so don’t be too harsh 😉

I work in PSQL, pdAdmin4 and server is based on PostgreSQL 15

What i have to achieve:

  • Group 1: Customers with >2000 total revenue in last 1 year

  • Group 2: Customers with >2000 total revenue in last 2 years

email_id order_number order_creation_date order_type customer_segment product_category_id product_sku quantity revenue
194826 7f598909bbd4144ab23a78c3a0bda9b6 2016-05-28 09:47:52 Web Consumer 8358ab6af9a7542d380e75e839d215a0 1 178.18

Code:

SELECT email_id ,
       COUNT(email_id) as num_of_orders ,
       SUM(revenue)    AS total_money_spent ,
       CASE
         WHEN order_creation_date BETWEEN '2016-01-01' and '2016-12-31'
              AND SUM(revenue) > 2000
                THEN 'Group_1'
         END
FROM sql_test
GROUP BY email_id,
         order_creation_date
ORDER BY num_of_orders DESC ;

Problem is that i want only users from past year from 2016-12-31 which have at least 2000 in revenue

enter image description here

P.S., If you recognize my post, sorry, I made another one because I implemented the wrong code and photo.

3

Answers


  1. Your condition can be set on GROUP BY and COUNT CASE clause as follows :

    SELECT email_id, 
          COUNT(CASE
             WHEN order_creation_date BETWEEN '2016-01-01' and '2016-12-31'
                AND revenue > 2000
                    THEN 1
             END) as num_of_orders_group1, 
          SUM(CASE
             WHEN order_creation_date BETWEEN '2016-01-01' and '2016-12-31'
                AND revenue > 2000
                    THEN revenue
             END) AS total_money_spent_group1,
          COUNT(CASE
             WHEN order_creation_date BETWEEN '2015-01-01' and '2016-12-31'
                AND revenue > 2000
                    THEN 1
             END) as num_of_orders_group1, 
          SUM(CASE
             WHEN order_creation_date BETWEEN '2015-01-01' and '2016-12-31'
                AND revenue > 2000
                    THEN revenue
             END) AS total_money_spent_group1
    FROM sql_test 
    GROUP BY email_id
    
    Login or Signup to reply.
  2. The [conceptual] model for how a SQL query gets executed is this:

    1. Form the Cartesian product of all tables in the from clause.
    2. Apply any applicable join criteria (the on bit in the from clause).
    3. Filter the results by the criteria in the where clause.
    4. Partition the results by the criteria in the group by clause.
    5. Collapse each partition into a single row, evaluating aggregate functions.
    6. Filter that into the results set by applying the criteria in the having clause.
    7. Finally, order the results set by the criteria specified in the order by clause.

    Note that if you are using group by, the columns in the result set may only consist of

    • Columns specified in group by,
    • constants, or
    • aggregate functions

    So, for a single year’s worth of results, you could just say

    select email_id ,
           count(*)     as num_of_orders ,
           sum(revenue) as total_money_spent
    from     sql_test
    where    date_part(year, order_creation_date) = 2016
    group by email_id
    having   sum(revenue) > 2000
    order by num_of_orders desc ;
    
    Login or Signup to reply.
  3. SELECT email_id,
           COUNT(*) as num_of_orders,
           SUM(revenue) AS total_money_spent,
           CASE WHEN SUM(
                CASE WHEN order_creation_date BETWEEN '2016-01-01' and '2016-12-31'
                     THEN revenue END) > 2000
                THEN 'Group_1'
                WHEN SUM(revenue) > 2000
                THEN 'Group_2'
           END as Groups
    FROM sql_test
    WHERE order_creation_date >= `2015-01-01`
    GROUP BY email_id
    ORDER BY num_of_orders DESC
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search