skip to Main Content

I have the following query:

SELECT
    cpe.entity_id AS configurable_product_id,
    cpe.sku AS configurable_product_sku,
    GROUP_CONCAT(DISTINCT cpsa.attribute_id ORDER BY cpsa.attribute_id SEPARATOR ',') AS configurable_super_attribute_ids,
    cp.entity_id AS simple_product_id,
    cp.sku AS simple_product_sku,
    GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values
FROM
    catalog_product_entity AS cpe
INNER JOIN catalog_product_super_link AS cpsl ON cpsl.parent_id = cpe.entity_id
INNER JOIN catalog_product_entity AS cp ON cp.entity_id = cpsl.product_id
INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
GROUP BY cpe.entity_id, cp.entity_id
LIMIT 100;

Which results in something like this:

configurable_product_id configurable_product_sku    configurable_super_attribute_ids    simple_product_id   simple_product_sku  simple_product_super_attribute_values
90086   Iphone  93,165  119 093 93_5730,165_7070
90086   Iphone  93,165  124 104 93_5730,165_7071
90086   Iphone  93,165  128 114 93_5707,165_7072
90086   Iphone  93,165  169 156 93_5727,165_7072
90086   Iphone  93,165  181 163 93_5730,165_7070
90086   Iphone  93,165  186 194 93_5727,165_7071
90086   Iphone  93,165  146 023 93_5730,165_7071

What I want my query to do is only show the records with duplicates in the simple_product_super_attributes_values column.

So in this case ditch the 3rd, 4th and 6th row and keep the rest:

configurable_product_id configurable_product_sku    configurable_super_attribute_ids    simple_product_id   simple_product_sku  simple_product_super_attribute_values
90086   Iphone  93,165  119 093 93_5730,165_7070
90086   Iphone  93,165  124 104 93_5730,165_7071
90086   Iphone  93,165  181 163 93_5730,165_7070
90086   Iphone  93,165  146 023 93_5730,165_7071

How to modify my query to get to my result? I tried a count on the group_concat but that is not allowed.

I tried to add a count for that column via a Having statement:

HAVING COUNT(*) > 1 
AND simple_product_super_attribute_values IN (
    SELECT simple_product_super_attribute_values
    FROM (
        SELECT
            cpe.entity_id AS configurable_product_id,
            GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values
        FROM
            catalog_product_entity AS cp
        INNER JOIN catalog_product_super_link AS cpsl ON cpsl.product_id = cp.entity_id
        INNER JOIN catalog_product_entity AS cpe ON cpe.entity_id = cpsl.parent_id
        INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
        INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
        GROUP BY cp.entity_id, cpe.entity_id
        HAVING COUNT(*) > 1
    ) AS subquery
    WHERE subquery.configurable_product_id = cpe.entity_id
)

But this makes no sense.

3

Answers


  1. HAVING is a clause like WHERE, but occurs after it and after group by. But testing count(*) there isn’t going to help.

    You want to use the count window function, partitioning by simple_product_super_attribute_values. It isn’t clear to me if you want your duplicate check to apply before or after the LIMIT 100; assuming before, you would do:

    select
        configurable_product_id,
        configurable_product_sku,
        configurable_super_attribute_ids,
        simple_product_id,
        simple_product_sku,
        simple_product_super_attribute_values
    from (
        SELECT
            cpe.entity_id AS configurable_product_id,
            cpe.sku AS configurable_product_sku,
            GROUP_CONCAT(DISTINCT cpsa.attribute_id ORDER BY cpsa.attribute_id SEPARATOR ',') AS configurable_super_attribute_ids,
            cp.entity_id AS simple_product_id,
            cp.sku AS simple_product_sku,
            GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values,
            count(*) over (partition by GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',')) > 1 duplicate_simple_product_super_attribute_values
        FROM
            catalog_product_entity AS cpe
        INNER JOIN catalog_product_super_link AS cpsl ON cpsl.parent_id = cpe.entity_id
        INNER JOIN catalog_product_entity AS cp ON cp.entity_id = cpsl.product_id
        INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
        INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
        GROUP BY cpe.entity_id, cp.entity_id
    ) as products
    where duplicate_simple_product_super_attributes_values
    limit 100
    
    Login or Signup to reply.
  2. The HAVING clause to do what you need is this:

    HAVING COUNT(DISTINCT whateverid) > 1
    

    This is one of several cool things you can do in the HAVING clause.

    Login or Signup to reply.
  3. You can use your query inside a CTE then in a second CTE use the window function count() over simple_product_super_attribute_values to check for duplicates :

    WITH cte as (
      SELECT
        cpe.entity_id AS configurable_product_id,
        cpe.sku AS configurable_product_sku,
        GROUP_CONCAT(DISTINCT cpsa.attribute_id ORDER BY cpsa.attribute_id SEPARATOR ',') AS configurable_super_attribute_ids,
        cp.entity_id AS simple_product_id,
        cp.sku AS simple_product_sku,
        GROUP_CONCAT(cpei.attribute_id, '_', cpei.value ORDER BY cpei.attribute_id SEPARATOR ',') AS simple_product_super_attribute_values
      FROM
        catalog_product_entity AS cpe
      INNER JOIN catalog_product_super_link AS cpsl ON cpsl.parent_id = cpe.entity_id
      INNER JOIN catalog_product_entity AS cp ON cp.entity_id = cpsl.product_id
      INNER JOIN catalog_product_super_attribute AS cpsa ON cpsa.product_id = cpe.entity_id
      INNER JOIN catalog_product_entity_int AS cpei ON cpei.attribute_id = cpsa.attribute_id AND cpei.`entity_id` = cp.`entity_id`
      GROUP BY cpe.entity_id, cp.entity_id
      LIMIT 100
    ),
    cte2 as (
      select configurable_product_id,
             configurable_product_sku,
             configurable_super_attribute_ids,
             simple_product_super_attribute_values,
             count(1) over( partition by simple_product_super_attribute_values) as count_dups
             from mytable
    )
    select *
    from cte2
    where count_dups > 1
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search