Php - Finding any substring of a given lenght inside another string

Gajowy
August 8, 2023
305 views
0 votes
2 Answers

Let assume we have 2 strings ($strA and $strB) of some MBs lenght and we want to find (if exists) any 8-bytes substring of $strA inside $strB. My current approach works however is unacceptable slow:

function FindAnySubstring($strA,$strB)
{
    for( $i=0; $i<strlen($strA)-8; $i++ )
    {
        $toFind=subStr($strA,$i,8);
        $pos=strpos($strB,$toFind);
        if( $pos === FALSE )
            continue;
        return [$i,$pos];
    }
    return FALSE;
}

Can it be done in a better way?

EDIT:
Sample data – well, probably I’m not able to add big files, but let assume that we can use shorter samples just for visualisation:

$strA=’abcdefghijKLMNOPQRstuvxyz’;
$strB=’ldsf32vKLMNOPQR4fa156232543′;

Expected result [11,8] (at 11th pos in $strA and 8th pos in $strB)

Current execution time for 3,321,792 bytes long $strA and $strB it takes between 0 and 2530 seconds to find a solution (!) – depending on position of the substring. I would expect 2 orders faster algorithm.

Considered ideas – I have some ideas:

find shorter substring (substring of the substring, starting from 1 byte) and if found – check the rest; tested with no improvements
use pararelling; always work, however harder to code

Answers

- manout
- August 8, 2023 at 1:34 am
- 0 votes
0
There are a few improvements you can make to potentially optimize its performance:

Early Exit: You can exit the loop as soon as a match is found, rather than continuing to search for additional matches. This can significantly reduce the number of unnecessary iterations.

Precompute Lengths: Instead of repeatedly calling strlen($strA) inside the loop, you can precompute the length of $strA before the loop starts to improve efficiency.

Here’s an updated version of your function with these improvements:
```
function FindAnySubstring($strA, $strB)
{
   $lenA = strlen($strA);
   $lenB = strlen($strB);

   for ($i = 0; $i <= $lenA - 8; $i++)
   {
      $toFind = substr($strA, $i, 8);
      $pos = strpos($strB, $toFind);

      if ($pos !== false)
      {
          return [$i, $pos];
      }
   }

  return false;
}
```
Login or Signup to reply.

- Amir
- August 8, 2023 at 3:03 am
- 0 votes
0
I wrote this , Also you can change it to find all substrings :
```
function FindAnySubstring($strA,$strB) {
    $a2 = str_split($strB);
    foreach (str_split($strA) as $first => $firstw) :
        foreach ($a2 as $second => $secw) :
            if ($first === $secw) :
                return [$firstm+1,$second+1];
            endif;
        endforeach;
    endforeach;
}
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Php – Finding any substring of a given lenght inside another string

Answers