
I’ve spent hours trying array_diff, array_unique, foreach loops and anything else I can find, and I can’t get anything to work: the result is always wrong.

Say I have these two arrays:

$arr_a = ['mary', 'mary', 'mary', 'jack', 'jack', 'jack', 'jack', 'fred', 'fred'];
$arr_b = ['mary', 'mary', 'jack', 'jack'];

I need returned:

['mary', 'jack', 'jack', 'fred', 'fred'];

I need the ‘leftover’ values after the matching values are canceled out, along with those that are completely unique. In $arr_a, we have 3 mary and 4 jack and 2 fred. If you subtract the 2 mary and 2 jack from that, you’re left with 1 mary, 2 jack and 2 fred.

My actual use case is comparing thousands of product IDs against thousands of other product IDs. Since each product ID can be duplicated anywhere from 2 to 100+ times, functions like array_diff and array_unique don’t work.

I have tried the following:

array_diff($arr_a, $arr_b);

Doesn’t work. It removes all matching values regardless of how many times they occur.
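
To make the behaviour described above concrete, here is a minimal check using the two sample arrays from the question. It only illustrates why array_diff is not a count-aware subtraction; the variable name $diff is introduced here for illustration.

```php
<?php
$arr_a = ['mary', 'mary', 'mary', 'jack', 'jack', 'jack', 'jack', 'fred', 'fred'];
$arr_b = ['mary', 'mary', 'jack', 'jack'];

// array_diff() keeps only the values of $arr_a that do not appear in $arr_b at all,
// so every 'mary' and every 'jack' is dropped, however often each one occurs.
$diff = array_diff($arr_a, $arr_b);

// Only the two 'fred' entries remain, still under their original keys 7 and 8.
print_r($diff);
```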

I have tried:

foreach ($arr_a as $a) {
    $key = array_search($a, $arr_b);
    if ($key) unset($arr_b[$key]);
}
return $arr_b;

When I attempt to use this on very large arrays, the results are never perfect. I’ve tried a series of checks to weed out false positives and negatives but even those don’t always work.

2 Answers


  1. There are many ways to do this. One is to walk through the source ($arr_a) and, whenever array_search finds a match in the target ($arr_b), unset (i.e. delete) that element from both arrays.

    The code is:

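    <?php

    $arr_a = ['mary', 'mary', 'mary', 'jack', 'jack', 'jack', 'jack', 'fred', 'fred'];
    $arr_b = ['mary', 'mary', 'jack', 'jack'];

    foreach ($arr_a as $index => $a) {
        // array_search() returns the matching key, or false when there is no match
        $key = array_search($a, $arr_b);
        if ($key !== false) {
            unset($arr_b[$key]);   // consume one matching value from the target
            unset($arr_a[$index]); // and drop the matched value from the source
        }
    }
    var_dump(array_values($arr_a)); // reindex the leftover values from 0
    ?>


    Please note that if ($key) { } alone is not a valid test: when the match is at the first position (index 0), $key is 0, which PHP treats as false. Comparing strictly against false ($key !== false) handles that case correctly.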

    You can see the result in this sandbox.
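
    For reuse, the same search-and-unset idea can be wrapped in a function. This is only a sketch: the name subtract_by_search is hypothetical, and it passes true as the third argument of array_search so that values are compared strictly, which is an extra precaution rather than something the loop above requires.

    ```php
    <?php
    // Hypothetical helper, for illustration only.
    function subtract_by_search(array $source, array $remove): array
    {
        // foreach iterates over a copy, so unsetting inside the loop is safe.
        foreach ($source as $index => $value) {
            // Strict search: returns the matching key, or false if nothing matches.
            $key = array_search($value, $remove, true);
            if ($key !== false) {
                // Cancel one occurrence on each side.
                unset($remove[$key], $source[$index]);
            }
        }
        // Reindex so the result is a plain list.
        return array_values($source);
    }

    $arr_a = ['mary', 'mary', 'mary', 'jack', 'jack', 'jack', 'jack', 'fred', 'fred'];
    $arr_b = ['mary', 'mary', 'jack', 'jack'];

    $leftover = subtract_by_search($arr_a, $arr_b);
    print_r($leftover); // mary, jack, jack, fred, fred
    ```

    Each array_search call scans $remove linearly, so with thousands of IDs on both sides the total work grows roughly with the product of the two array sizes. That may be acceptable, but it is worth measuring on real data.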

  2. If you count the occurrences of each value in both arrays with array_count_values(), you get, for each array, a list mapping every name to the number of times it occurs.

    You can then loop through the first list, compare its counts with the second, and build a result array, using array_pad() to repeat each key as many times as are left over:

    $counta = array_count_values($arr_a);
    $countb = array_count_values($arr_b);
    
    $result = [];
    foreach ($counta as $key => $count) {
        $result = array_merge($result, array_pad([], abs(($countb[$key] ?? 0) - $count), $key));
    }
    
    print_r($result);
    

    This means you process each list once and output a result through a quick loop.

    Breaking down that one line (with the brackets in the right place)…

    array_merge($result, array_pad([], abs(($countb[$key] ?? 0) - $count), $key));
    

    The $countb[$key] ?? 0 part looks up the name in the second count array (?? 0 gives 0 if it is not found). The count from the first array is then subtracted from it, and abs() turns the difference into a positive number.

    The array_pad([], $count, $key) part then creates a new array with that many items, each holding the name (the key from the array_count_values() result).

    Finally, this new list of names is added to the result list being built.

    You could break it down to

    foreach ($counta as $key => $count) {
        $countToAdd = abs(($countb[$key] ?? 0) - $count);
        $listToAdd = array_pad([], $countToAdd, $key);
        $result = array_merge($result, $listToAdd);
    }
    

    which is more readable.
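
    As a variation on the loop above, here is a sketch wrapped in a function. The name multiset_diff is hypothetical, and it swaps abs() for max(0, ...): this assumes that when $arr_b contains more copies of a value than $arr_a, nothing should be left over for that value, whereas abs() would return the surplus from $arr_b.

    ```php
    <?php
    // Hypothetical helper, for illustration only.
    function multiset_diff(array $a, array $b): array
    {
        $countB = array_count_values($b);
        $result = [];
        foreach (array_count_values($a) as $value => $count) {
            // Copies left after cancelling matches; never below zero.
            $left = max(0, $count - ($countB[$value] ?? 0));
            for ($i = 0; $i < $left; $i++) {
                $result[] = $value;
            }
        }
        return $result;
    }

    $arr_a = ['mary', 'mary', 'mary', 'jack', 'jack', 'jack', 'jack', 'fred', 'fred'];
    $arr_b = ['mary', 'mary', 'jack', 'jack'];

    $leftover = multiset_diff($arr_a, $arr_b);
    print_r($leftover); // mary, jack, jack, fred, fred
    ```

    Two things to keep in mind: array_count_values() only counts string and integer values, and because the values become array keys, numeric strings such as '123' come back as integers in the result.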
