
I have an array with more than 2000 elements like this:

Array
(
    [0] => Array
        (
            [name] => LILI
            [actual_start] => 2021-11-10T18:34:00+00:00
            [actual_end] => 2021-11-10T21:32:00+00:00
        )

    [1] => Array
        (
            [name] => MILI
            [actual_start] => 2021-11-18T17:33:00+00:00
            [actual_end] => 2022-03-18T19:36:00+00:00
        )

    ...
)

My goal is to find the top 3 elements (their names) based on the duration between actual_start and actual_end.

First, I wanted to convert the difference between actual_start and actual_end to a single number and then use that to get the 3 longest. I tried this:

foreach ($array as $data) {
    $date1 = new DateTime($data['actual_start']);
    $date2 = new DateTime($data['actual_end']);
    $interval = $date1->diff($date2);
    echo "difference " . $interval->y ;
}

This works, but it only gives me one component of the difference, such as years, hours, or minutes (if I change y to h or i), and I cannot use that value to calculate the top 3. Any ideas?

4 Answers


  1. You can pass a user-defined comparison function to usort() so that the array is sorted by time span, something like:

    $data = [
        [
            'name' => 'LILI',
            'actual_start' => '2021-11-10T18:34:00+00:00',
            'actual_end' => '2021-11-10T21:32:00+00:00',
        ],
        [
            'name' => 'MILI',
            'actual_start' => '2021-11-18T17:33:00+00:00',
            'actual_end' => '2022-03-18T19:36:00+00:00',
        ],
    
        // ... and more ...
    
    ];
        
    usort($data, function($a, $b){

        $startA = new DateTime($a['actual_start']);
        $endA = new DateTime($a['actual_end']);
        $lengthA = $endA->getTimestamp() - $startA->getTimestamp();

        $startB = new DateTime($b['actual_start']);
        $endB = new DateTime($b['actual_end']);
        $lengthB = $endB->getTimestamp() - $startB->getTimestamp();

        // DateInterval objects cannot be compared, and ->s is only the
        // seconds component (0-59), so compare total seconds instead.
        // Putting $lengthB first sorts descending (longest first).
        return $lengthB <=> $lengthA;
    });

    $topThree = array_slice($data, 0, 3);   // three longest
    $bottomThree = array_slice($data, -3);  // three shortest
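
    A note on the comparison value: the s property of a DateInterval holds only the seconds component of the difference (0 to 59), not its total length, so two very different spans can look equal. Below is a minimal sketch, using the two sample rows from the question, of how that component differs from the total number of seconds. The helper name totalSeconds is hypothetical and chosen only for illustration.

    ```php
    <?php

    // Hypothetical helper: total length of a span in seconds.
    function totalSeconds(string $start, string $end): int
    {
        return (new DateTime($end))->getTimestamp()
             - (new DateTime($start))->getTimestamp();
    }

    // The two sample rows from the question, as [start, end] pairs.
    $lili = ['2021-11-10T18:34:00+00:00', '2021-11-10T21:32:00+00:00'];
    $mili = ['2021-11-18T17:33:00+00:00', '2022-03-18T19:36:00+00:00'];

    $liliInterval = (new DateTime($lili[0]))->diff(new DateTime($lili[1]));
    $miliInterval = (new DateTime($mili[0]))->diff(new DateTime($mili[1]));

    // Seconds component only: identical for both rows.
    $liliComponent = $liliInterval->s; // 0
    $miliComponent = $miliInterval->s; // 0

    // Total seconds: clearly different.
    $liliSeconds = totalSeconds(...$lili); // 10680 (2 h 58 min)
    $miliSeconds = totalSeconds(...$mili); // 10375380 (120 days, 2 h 3 min)

    echo $liliSeconds, ' ', $miliSeconds, PHP_EOL;
    ```

    Sorting on the total keeps MILI ahead of LILI, whereas sorting on the s component would treat them as equal.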
    
  2. I’d probably sort the whole thing and grab the top three records for this.

    Something like:

    
    function span($record) {
      $start = new DateTime($record['actual_start']);
      $end = new DateTime($record['actual_end']);
      // total seconds; DateInterval objects cannot be compared directly
      return $end->getTimestamp() - $start->getTimestamp();
    }
    
    function byDuration($a, $b) {
      // $b first sorts descending, longest duration first
      return span($b) <=> span($a);
    }
    
    // usort() sorts in place and returns a bool, not the sorted array
    usort($data, 'byDuration');
    $top3 = array_slice($data, 0, 3);
    

    Disclaimer: typed in the SO text box; I haven’t touched PHP since version 5.4 was the hot new thing.

  3. So, just keep 3 numbers, one each for the max, 2nd max and 3rd max, which together make up your top 3.

    Take the difference of actual_end and actual_start for each element using strtotime() (although there are many ways to get a diff).

    Compare each difference with those max variables and reassign them as needed. This retrieves the answer in a single pass, which makes it efficient in terms of time.

    Snippet:

    <?php
    
    $max = $secondMax = $thirdMax = 0;
    $maxName = $secondMaxName = $thirdMaxName = "";
    
    foreach($data as $d){
        $diff = strtotime($d['actual_end']) - strtotime($d['actual_start']);
        if($diff > $max){
            $thirdMax = $secondMax;
            $secondMax = $max;
            $max = $diff;
            $thirdMaxName = $secondMaxName;
            $secondMaxName = $maxName;
            $maxName = $d['name'];
        }elseif($diff > $secondMax){
            $thirdMax = $secondMax;
            $secondMax = $diff;
            $thirdMaxName = $secondMaxName;
            $secondMaxName = $d['name'];
        }elseif($diff > $thirdMax){
            $thirdMax = $diff;
            $thirdMaxName = $d['name'];
        }
    }
    
    echo $maxName, " " , $secondMaxName," " ,$thirdMaxName;
    

    Online Fiddle

  4. I fully endorse the efficiency of @nice_dev’s linear solution. Here is an altered version that is easier to adjust: only the $size variable needs to change for larger or smaller result sets, instead of hardcoding many individual variables.

    The unset() calls are safe to make even when the targeted element doesn’t exist and serve to ensure that the temporary and result arrays never grow beyond the declared limit.

    Code: (Demo)

    $size = 3;
    $maxes = array_fill(0, $size, 0);
    $names = [];
    
    foreach ($array as $row) {
        $diff = strtotime($row['actual_end']) - strtotime($row['actual_start']);
        foreach ($maxes as $i => $max) {
            if ($diff > $max) {
                 array_splice($maxes, $i, 0, $diff);
                 array_splice($names, $i, 0, $row['name']);
                 unset($maxes[$size], $names[$size]);
                 break;
            }
        }
    }
    
    echo implode(' ', $names);
    
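
    To show how the $size knob behaves, here is a sketch that wraps the same single-pass loop in a function and runs it on four rows. The function name topNamesByDuration and the rows EXTRA_1 and EXTRA_2 are invented for illustration; only LILI and MILI come from the question.

    ```php
    <?php

    // Hypothetical wrapper around the single-pass loop shown above.
    function topNamesByDuration(array $rows, int $size): array
    {
        $maxes = array_fill(0, $size, 0);
        $names = [];

        foreach ($rows as $row) {
            $diff = strtotime($row['actual_end']) - strtotime($row['actual_start']);
            foreach ($maxes as $i => $max) {
                if ($diff > $max) {
                    // Insert at position $i, then drop whatever fell off the end.
                    array_splice($maxes, $i, 0, $diff);
                    array_splice($names, $i, 0, $row['name']);
                    unset($maxes[$size], $names[$size]);
                    break;
                }
            }
        }

        return $names;
    }

    $rows = [
        ['name' => 'LILI', 'actual_start' => '2021-11-10T18:34:00+00:00', 'actual_end' => '2021-11-10T21:32:00+00:00'],
        // Invented row: exactly one hour long.
        ['name' => 'EXTRA_1', 'actual_start' => '2021-12-01T10:00:00+00:00', 'actual_end' => '2021-12-01T11:00:00+00:00'],
        ['name' => 'MILI', 'actual_start' => '2021-11-18T17:33:00+00:00', 'actual_end' => '2022-03-18T19:36:00+00:00'],
        // Invented row: exactly one day long.
        ['name' => 'EXTRA_2', 'actual_start' => '2021-12-05T08:00:00+00:00', 'actual_end' => '2021-12-06T08:00:00+00:00'],
    ];

    $top3 = topNamesByDuration($rows, 3);
    $top2 = topNamesByDuration($rows, 2);

    echo implode(' ', $top3), PHP_EOL; // MILI EXTRA_2 LILI
    echo implode(' ', $top2), PHP_EOL; // MILI EXTRA_2
    ```

    Rows whose duration is zero or negative never beat the initial 0 entries, so they are left out of the result.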

    A direct, functional approach with slightly worse time complexity is to sort the entire array, then slice off the row data that you desire.

    Code: (Demo)

    array_multisort(
        array_map(
            fn($row) => (new DateTime($row['actual_end']))->getTimestamp()
                        - (new DateTime($row['actual_start']))->getTimestamp(),
            $array
        ),
        SORT_DESC,
        $array
    );
    var_export(
        array_column(
            array_slice($array, 0, 3),
            'name'
        )
    );
    