I need functionality to "dehydrate" some user input, replacing it with placeholders, with the ultimate goal to "rehydrate" it elsewhere. For example:

Visit the page [my page](http://example.com/posts/)

Every occurrence of $search = 'http://example.com' needs to be replaced with a placeholder, like so:

Visit the page [my page](%WEBSITE_URL%/posts/)

This will be saved to a file (or similar) and transferred to a different website. Then, at the other end, it can be "rehydrated" with an arbitrary WEBSITE_URL. If $replace = 'http://another-site.net', then I need this to be turned into:

Visit the page [my page](http://another-site.net/posts/)

The naive solution is to do something like this:

$search = 'http://example.com';
$dehydrated = str_replace($search, '%WEBSITE_URL%', $text);

// then just do it backwards:
$replace = 'http://another-site.net';
$rehydrated = str_replace('%WEBSITE_URL%', $replace, $dehydrated);

The problem is $text is user input, which can contain anything, including the literal string %WEBSITE_URL%. For example, if:

$text = 'Visit [my page](http://example.com/posts/). Placeholders are %WEBSITE_URL%';

// Would be turned into

$rehydrated = 'Visit [my page](http://another-site.net/posts/). Placeholders are http://another-site.net';

// instead of the correct:

$rehydrated = 'Visit [my page](http://another-site.net/posts/). Placeholders are %WEBSITE_URL%';

An improvement would be something like this:

// replace existing % with %% as well to help guard against this:
$search = 'http://example.com';
$dehydrated = str_replace(['%', $search], ['%%', '%WEBSITE_URL%'], $text);

// then we use preg_replace with a negative lookahead, eg:
$replace = 'http://another-site.net';
$rehydrated = preg_replace('/%WEBSITE_URL%(?!%)/', $replace, $dehydrated);
$rehydrated = str_replace('%%', '%', $rehydrated);

This is better and should work for 99.99% of cases, but it can be "defeated" if we had something like:

$text = 'Visit [my page](http://example.com/posts/), %http://example.com%';
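
Tracing this input through the escaped version above shows where it breaks. The sketch below only replays the code from the question; the intermediate values in the comments are what each step produces for this particular $text.

```php
<?php
$search  = 'http://example.com';
$replace = 'http://another-site.net';
$text    = 'Visit [my page](http://example.com/posts/), %http://example.com%';

// Dehydrate: double every %, then swap the URL for the placeholder.
$dehydrated = str_replace(['%', $search], ['%%', '%WEBSITE_URL%'], $text);
// Visit [my page](%WEBSITE_URL%/posts/), %%%WEBSITE_URL%%%

// Rehydrate: the second placeholder is followed by %, so the lookahead skips it.
$rehydrated = preg_replace('/%WEBSITE_URL%(?!%)/', $replace, $dehydrated);
// Visit [my page](http://another-site.net/posts/), %%%WEBSITE_URL%%%

// Collapsing %% back to % leaves the skipped placeholder in the output.
$rehydrated = str_replace('%%', '%', $rehydrated);
// got:    Visit [my page](http://another-site.net/posts/), %%WEBSITE_URL%%
// wanted: Visit [my page](http://another-site.net/posts/), %http://another-site.net%

echo $rehydrated, PHP_EOL;
```

The lookahead cannot tell a real placeholder that happens to be followed by an escaped % from an escaped literal, because the three replacements are applied as separate passes rather than as one left-to-right parse.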

How can I make sure this will always work, regardless of what the input might be?

2 Answers


  1. One solution could be to temporarily replace the tags that are present before dehydration with a custom unique tag:

    $text = 'Visit [my page](http://example.com/posts/), Placeholders are %WEBSITE_URL% %http://example.com%';
    
    // Replace hardcoded tag with a temporary one
    $tag = '%WEBSITE_URL%';
    $tempTag = '%SOME_INTERNAL_KEY_' . time() . '%';
    $escaped = str_replace($tag, $tempTag, $text);
    // Visit [my page](http://example.com/posts/), Placeholders are %SOME_INTERNAL_KEY_1686299285% %http://example.com%
    
    // Dehydration
    $search = 'http://example.com';
    $dehydrated = str_replace($search, $tag, $escaped);
    // Visit [my page](%WEBSITE_URL%/posts/), Placeholders are %SOME_INTERNAL_KEY_1686299285% %%WEBSITE_URL%%
    
    // Rehydration
    $replace = 'http://another-site.net';
    $rehydrated = str_replace($tag, $replace, $dehydrated);
    // Visit [my page](http://another-site.net/posts/), Placeholders are %SOME_INTERNAL_KEY_1686299285% %http://another-site.net%
    
    // Remove temporary tag
    $clean = str_replace($tempTag, $tag, $rehydrated);
    // Visit [my page](http://another-site.net/posts/), Placeholders are %WEBSITE_URL% %http://another-site.net%
    

    The only possible collision would occur if the input already contained a string of the form %SOME_INTERNAL_KEY_00000% whose timestamp matches the dehydration time, which is very unlikely.

    If dehydration and rehydration run in different PHP processes (for example, different apps or servers), you should only need to transfer the temporary tag(s) along with the dehydrated text.
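
    If even an unlikely collision is a concern, or transferring an extra tag is inconvenient, one possible alternative is to escape the % character and substitute the URL in a single left-to-right pass. The sketch below is not the method described in this answer; it assumes PHP's array form of strtr(), which tries the longest key first and never rescans text it has already replaced. The names PLACEHOLDER, dehydrate() and rehydrate() are hypothetical, and $search is assumed to be non-empty.

    ```php
    <?php
    // Hypothetical names for illustration; the marker itself comes from the question.
    const PLACEHOLDER = '%WEBSITE_URL%';

    function dehydrate(string $text, string $search): string
    {
        // One pass over the input: the URL and literal % are handled together,
        // so a % produced by one replacement is never seen by the other.
        return strtr($text, [
            $search => PLACEHOLDER, // site URL becomes the placeholder
            '%'     => '%%',        // every literal percent sign is doubled
        ]);
    }

    function rehydrate(string $dehydrated, string $replace): string
    {
        // Also one pass: at each % the input holds either a doubled % or a placeholder.
        return strtr($dehydrated, [
            PLACEHOLDER => $replace, // placeholder becomes the new site URL
            '%%'        => '%',      // doubled percent signs collapse back
        ]);
    }

    $text = 'Visit [my page](http://example.com/posts/), Placeholders are %WEBSITE_URL% %http://example.com%';
    $dehydrated = dehydrate($text, 'http://example.com');
    // Visit [my page](%WEBSITE_URL%/posts/), Placeholders are %%WEBSITE_URL%% %%%WEBSITE_URL%%%

    echo rehydrate($dehydrated, 'http://another-site.net'), PHP_EOL;
    // Visit [my page](http://another-site.net/posts/), Placeholders are %WEBSITE_URL% %http://another-site.net%
    ```

    Because a placeholder always starts with %W and an escaped literal always starts with %%, reading the dehydrated text from the left is unambiguous, so nothing besides the text itself has to be transferred.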

  2. This does not answer the question, but would simply making the links relative not solve your scenario?

    $text = "Visit the page [my page](http://example.com/posts/)";
    $search = 'http://example.com';
    $dehydrated = str_replace($search, '', $text);
    
    var_export($dehydrated);
    // 'Visit the page [my page](/posts/)'
    

    When you hover over the rendered markdown link:
    Visit the page [my page](/posts/)
    you see that it links to "https://stackoverflow.com/posts/" (it even matches the current protocol).
