
I need to filter out duplicate rows in my 2D array and, in each retained unique row, append an element that contains the number of times that row occurred in the original array.

I wanted to use array_unique($array, SORT_REGULAR), but removing duplicates is not enough; I also need to store the count of duplicates with each unique row.

I have tried array_search() and loops, but none of my attempts yield the correct results. My project data has upwards of 500,000 entries, but here’s a basic example:

Input:

[
    ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'true'],
    ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'true'],
    ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'false'],
]

Output:

[
    ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'true', 'count' => 2],
    ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'false', 'count' => 1],
]

2 Answers


  1. If I understand you correctly, this should help:

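    function getUniqWithCounts(array $data): array
    {
        $result = [];
        foreach ($data as $item) {
            // Hash the serialized row so identical rows share one key.
            $hash = md5(serialize($item));

            // Row seen before: bump its counter and move on.
            if (isset($result[$hash])) {
                $result[$hash]['count']++;
                continue;
            }
            // First occurrence: store the row with a count of 1.
            $item['count'] = 1;
            $result[$hash] = $item;
        }

        // Drop the hash keys so the result is a plain indexed array.
        return array_values($result);
    }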
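
    Note that serialize() is sensitive to key order, so two rows holding the same columns in a different order would be counted separately. If that matters for your data, a minimal sketch of one possible variation follows. The function name `getUniqWithCountsIgnoringKeyOrder` and the reordered sample row are hypothetical, not part of the answer above; the only change is a ksort() call before hashing.

    ```php
    <?php
    // Hypothetical variant (not from the answer): sort each row's keys before
    // hashing so that column order does not affect grouping.
    function getUniqWithCountsIgnoringKeyOrder(array $data): array
    {
        $result = [];
        foreach ($data as $item) {
            ksort($item); // normalize column order
            $hash = md5(serialize($item));

            if (isset($result[$hash])) {
                $result[$hash]['count']++;
                continue;
            }
            $item['count'] = 1;
            $result[$hash] = $item;
        }

        return array_values($result);
    }

    // Assumed sample data: the second row has the same columns as the first,
    // but in a different order.
    $rows = [
        ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'true'],
        ['brand' => 'ABC', 'used' => 'true', 'manufacturer' => 'KInd'],
        ['manufacturer' => 'KInd', 'brand' => 'ABC', 'used' => 'false'],
    ];
    $counted = getUniqWithCountsIgnoringKeyOrder($rows);
    ```

    The trade-off is that the returned rows have their columns in sorted key order, followed by `count`, rather than in the original order.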
    
  2. You don’t need to use any elaborate serialization or encoding to create composite keys for grouping. Just implode each row’s values (assuming they all contain the same columns in the same order) to create an identifying key for the result array.

    On the first encounter, store the row’s data in the group and set the group’s count to 1; on any subsequent encounter, increment the group’s counter.

    Code:

    $result = [];
    foreach ($array as $row) {
        $compositeKey = implode('_', $row);
        if (!isset($result[$compositeKey])) {
            $result[$compositeKey] = $row + ['count' => 1];
        } else {
            ++$result[$compositeKey]['count'];
        }
    }
    var_export(array_values($result));
    

    Output:

    array (
      0 => 
      array (
        'manufacturer' => 'KInd',
        'brand' => 'ABC',
        'used' => 'true',
        'count' => 2,
      ),
      1 => 
      array (
        'manufacturer' => 'KInd',
        'brand' => 'ABC',
        'used' => 'false',
        'count' => 1,
      ),
    )
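
    One caveat with implode-based keys, offered as a hedged sketch rather than a correction: if a value can itself contain the delimiter, two different rows can produce the same composite key and be merged into one group. The rows below are hypothetical and chosen to trigger that case. Using json_encode() for the key is a swapped-in technique, not the method shown above; it keeps column boundaries intact at the cost of slightly longer keys.

    ```php
    <?php
    // Hypothetical rows: the values contain the '_' delimiter used by implode().
    $array = [
        ['brand' => 'a_b', 'model' => 'c'],
        ['brand' => 'a', 'model' => 'b_c'],
    ];

    // Both rows collapse to the same implode() key, 'a_b_c'.
    $keyA = implode('_', $array[0]);
    $keyB = implode('_', $array[1]);

    // json_encode() keeps the column names and boundaries, so keys stay distinct.
    $result = [];
    foreach ($array as $row) {
        $compositeKey = json_encode($row);
        if (!isset($result[$compositeKey])) {
            $result[$compositeKey] = $row + ['count' => 1];
        } else {
            ++$result[$compositeKey]['count'];
        }
    }
    $result = array_values($result);
    ```

    If your values can never contain the delimiter, the plain implode() key is sufficient.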
    
