skip to Main Content

I’m trying to formulate a query to get an answer from my data. The data is sales data where the customer may make several orders on a day and each invoice may consist of multiple lines. The data looks roughly like this:

Customer | Invoice Date | Invoice Number | Stock | Total Value
---------------------------------------------------------------
Acme     | 01/01/2023   | 1234           | Cod   | £20
Acme     | 01/01/2023   | 1234           | Hake  | £15
Acme     | 01/01/2023   | 2468           | Cod   | £10
Acme     | 01/01/2023   | 2468           | Hake  | £12
Acme     | 02/01/2023   | 3699           | Cod   | £20
Acme     | 02/01/2023   | 4567           | Hake  | £15
Acme     | 03/01/2023   | 9876           | Cod   | £10
Acme     | 03/01/2023   | 9876           | Hake  | £1
Beta     | 01/01/2023   | 8976           | Cod   | £10
Beta     | 01/01/2023   | 8976           | Hake  | £15
Beta     | 01/01/2023   | 5432           | Cod   | £5
Beta     | 01/01/2023   | 5432           | Hake  | £12
Beta     | 02/01/2023   | 2233           | Cod   | £20
Beta     | 02/01/2023   | 2233           | Hake  | £15
Beta     | 02/01/2023   | 1590           | Cod   | £10
Beta     | 02/01/2023   | 1590           | Hake  | £15

What I’m trying to find is the total sales, the average spend per day and per invoice and a count of the invoices which fall below a given threshold (say £25), something like this

Customer | Sales Total | Days Ordered Count | Invoice Count | Average per Invoice |  Invoice < £25
-------------------------------------------------------------------------------------------------
Acme     |     £103    |           3        |      5        |          £20.60     |      4
Beta     |     £102    |           2        |      4        |          £25.50     |      1

I can get the first five columns ok, but I’m stumped about how to approach the last column. This is the SQL I use for the first five columns, which works.

select Customer,sum(Sales_Total) AS Sales Total,
    count(distinct Invoice_Date) AS Days ordered Count,
    COUNT(distinct Invoice_Number) AS Invoice Count,
    sum(Sales_Total)/count(distinct Invoice_Number) AS Average per Invoice,
from sales_2023
group by Customer
order by Customer;

I’ve tried window functions, but I got rather unexpected results, certainly not what I was looking for. I’ve also tried a subquery, but I don’t know how to aggregate the different lines on an invoice together in the subquery, so that doesn’t work either.

Any pointers on how to approach this would be very welcome.

3

Answers


  1. You can use a subquery to calculate the sales total and invoice total for each invoice by grouping it under Customer and Invoice_Number. Then, in the outer query, it sums up the count of invoices where the Invoice_Total is less than £25 for each customer.

    SELECT 
        Customer,
        SUM(Sales_Total) AS Sales_Total,
        COUNT(DISTINCT Invoice_Date) AS Days_Ordered_Count,
        COUNT(DISTINCT Invoice_Number) AS Invoice_Count,
        SUM(Sales_Total) / COUNT(DISTINCT Invoice_Number) AS Average_per_Invoice,
        SUM(CASE WHEN Invoice_Total < 25 THEN 1 ELSE 0 END) AS Invoice_Less_Than_25
    FROM (
        SELECT 
            Customer,
            Invoice_Number,
            Invoice_Date,
            SUM(Total_Value) AS Sales_Total,
            SUM(Total_Value) AS Invoice_Total
        FROM sales_2023
        GROUP BY Customer, Invoice_Number, Invoice_Date
    ) AS subquery
    GROUP BY Customer
    ORDER BY Customer;
    
    Login or Signup to reply.
  2. Mysql code with two sub queries for better readability. You can merge them into one if needed.

    select subset_one.customer,subset_one.Sales_Total,
           subset_one.Days_ordered_Count,
           subset_one.Invoice_Count,
           subset_one.Average_per_Invoice,
           subset_two.count_less_25
        from
        (
            select  customer,sum(total_value) AS Sales_Total,
                    count(distinct Invoice_Date) AS Days_ordered_Count,
                    COUNT(distinct Invoice_Number) AS Invoice_Count,
                    sum(total_value)/count(distinct Invoice_Number) AS Average_per_Invoice
            from sales_2023 group by Customer
        ) subset_one
            inner join
        (
        select subset_lt.customer, sum(subset_lt.less_than_25) as count_less_25
        from
            (
                select customer,Invoice_Number, (case when sum(total_value) < 25 then 1 else 0 end) as less_than_25
                from    sales_2023 group by Customer,Invoice_Number 
            ) subset_lt group by subset_lt.Customer
        ) subset_two
        on
        subset_one.customer = subset_two.customer
    

    Adding DB fiddle link
    Less than 25

    Login or Signup to reply.
  3. This can be done by using a subquery to first calculate the total value per invoice and then using a conditional aggregation in your main query to count the invoices with a value below the given threshold. Something like:

    SELECT 
        Customer,
        SUM(Invoice_Total) AS Sales_Total,
        COUNT(DISTINCT Invoice_Date) AS Days_Ordered_Count,
        COUNT(DISTINCT Invoice_Number) AS Invoice_Count,
        ROUND(SUM(Invoice_Total) / COUNT(DISTINCT Invoice_Number), 2) AS Average_per_Invoice,
        COUNT(CASE WHEN Invoice_Total < 25 THEN Invoice_Number ELSE NULL END) AS Invoice_Less_Than_25
    FROM (
        SELECT 
            Customer, 
            Invoice_Date, 
            Invoice_Number, 
            SUM(Total_Value) AS Invoice_Total
        FROM sales_2023
        GROUP BY Customer, Invoice_Date, Invoice_Number
    ) AS invoice_totals
    GROUP BY Customer
    ORDER BY Customer;
    

    DBFIDDLE

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search