skip to Main Content

In order to conduct a Market Basket Analysis on the sales data in magento, i need to retrieve the products that are included in each invoice from magento. So far this is what i got.

SELECT 
    tblInvoice.increment_id AS orderId,
    tblLine.product_id AS productId,
    tblLine.sku as productSku,
    tblLine.qty as qty
FROM 
    sales_flat_invoice AS tblInvoice
RIGHT JOIN 
    sales_flat_invoice_item AS tblLine 
        ON tblInvoice.entity_id = tblLine.parent_id

However, the problem is that for the products that consists of both a configuable and a simple product, I get both. I only need the simple product if it is alone.

 orderId    productId   productSku  qty     
100000004   456         mpd00338    1.0000 <-- The simple product in this pair should be removed
100000004   476         mpd00338    1.0000 <-- The simple product in this pair should be removed
100000006   374         abl004      4.0000 <-- This simple product is alone, and should therefore stay
100000006   417         wbk002      1.0000 <-- Once again the simple product in this pair should be removed
100000006   284         wbk002      1.0000 <-- Once again the simple product in this pair should be removed

My initial though would be to use the visibility to remove the redundant entities, however I am not sure if this would work in all situations or if there could be cases where both are either visible or hidden?

So my questions are

  1. How should I alter my sql statement in order to only select one product in case of configuable products ( without remowing those simple products that stands alone?

  2. Is this solution universal for both version 1.7, 1.8 and 1.9?

  3. Are there scenarios where the solution does not work?

NB. For Grouped products and Bundled products I should off course get all the simple products as they each actually represents a real product.

2

Answers


  1. Chosen as BEST ANSWER

    The solution I found was to join the data with the catalog_product_entity and then order the data by catalog_product_entity.type_id.. This way all the simple products was at the bottom, and when using GROUP BY i can then ensure that the only the simple products are selected when they are alone.


  2. Here is the query you need to use to exclude those simple products whose corresponding configurable product is already included in the list

    Just append the below part to your original query

    JOIN
     sales_flat_order_item as sfoi
         ON sfoi.item_id = tblLine.order_item_id 
    WHERE sfoi.parent_item_id IS NULL 
    

    So, your complete query should be this

    SELECT 
            tblInvoice.increment_id AS orderId,
            tblLine.product_id AS productId,
            tblLine.sku as productSku,
            tblLine.qty as qty
        FROM 
            sales_flat_invoice AS tblInvoice
        RIGHT JOIN 
            sales_flat_invoice_item AS tblLine 
                ON tblInvoice.entity_id = tblLine.parent_id
        JOIN
         sales_flat_order_item as sfoi
             ON sfoi.item_id = tblLine.order_item_id 
        WHERE sfoi.parent_item_id IS NULL
    

    The sales_flat_order_item database table has this parent_item_id column which says that it is a child of some parent product ( usually configurable product). All those records in the table sales_flat_order_item with parent_item_id empty are either the parent products or individual products ( simple/grouped/bundled ).

    This MySQL query works across all versions and for all the scenarios.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search