I need functionality to "dehydrate" some user input, replacing parts of it with placeholders, with the ultimate goal of "rehydrating" it elsewhere. For example:
Visit [my page](http://example.com/posts/)
needs to have $search = 'http://example.com' replaced with a placeholder, like so:
Visit [my page](%WEBSITE_URL%/posts/)
This will be saved in a file or something similar, and transferred to a different website. Then, at the other end, it can be "rehydrated" with an arbitrary WEBSITE_URL. If $replace = 'http://another-site.net', then I need this to be turned into:
Visit [my page](http://another-site.net/posts/)
The naive solution is to do something like this:
$search = 'http://example.com';
$dehydrated = str_replace($search, '%WEBSITE_URL%', $text);
// then just do it backwards:
$replace = 'http://another-site.net';
$rehydrated = str_replace('%WEBSITE_URL%', $replace, $dehydrated);
The problem is that $text is user input, which can contain anything, including the literal string %WEBSITE_URL%. For example:
$text = 'Visit [my page](http://example.com/posts/). Placeholders are %WEBSITE_URL%';
// Would be turned into
$rehydrated = 'Visit [my page](http://another-site.net/posts/). Placeholders are http://another-site.net';
// instead of the correct:
$rehydrated = 'Visit [my page](http://another-site.net/posts/). Placeholders are %WEBSITE_URL%';
An improvement would be something like this:
// replace existing % with %% as well to help guard against this:
$search = 'http://example.com';
$dehydrated = str_replace(['%', $search], ['%%', '%WEBSITE_URL%'], $text);
// then we use preg_replace with a negative lookahead, eg:
$replace = 'http://another-site.net';
$rehydrated = preg_replace('/%WEBSITE_URL%(?!%)/', $replace, $dehydrated);
$rehydrated = str_replace('%%', '%', $rehydrated);
This is better and should work for 99.99% of cases, but it can be "defeated" if the input contains something like:
$text = 'Visit [my page](http://example.com/posts/), %http://example.com%';
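To make the failure concrete, here is a minimal trace of the escaping scheme above on that input. It reuses the same variables; the commented values are what each step produces.

```php
<?php
$search  = 'http://example.com';
$replace = 'http://another-site.net';
$text    = 'Visit [my page](http://example.com/posts/), %http://example.com%';

// Dehydrate: str_replace applies the pairs in order, so every '%' is
// doubled first and only then is the URL swapped for the placeholder.
$dehydrated = str_replace(['%', $search], ['%%', '%WEBSITE_URL%'], $text);
// 'Visit [my page](%WEBSITE_URL%/posts/), %%%WEBSITE_URL%%%'

// Rehydrate: the second placeholder is followed by '%', so the negative
// lookahead rejects it and it is never replaced.
$rehydrated = preg_replace('/%WEBSITE_URL%(?!%)/', $replace, $dehydrated);
$rehydrated = str_replace('%%', '%', $rehydrated);
// 'Visit [my page](http://another-site.net/posts/), %%WEBSITE_URL%%'
// wanted: 'Visit [my page](http://another-site.net/posts/), %http://another-site.net%'
```

The escaped percent signs and the placeholder's own delimiters become indistinguishable once they sit next to each other.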
How can I make sure this will always work, regardless of what the input might be?
2 Answers
One solution could be to temporarily replace the tags that are present before dehydration with a custom unique tag:
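A minimal sketch of this idea follows. The function names (makeTemporaryTag, dehydrate, rehydrate) are hypothetical, and the tag format is assumed to be %SOME_INTERNAL_KEY_<timestamp>%.

```php
<?php
// Hypothetical helper: builds a temporary tag from the current timestamp.
function makeTemporaryTag(): string
{
    return '%SOME_INTERNAL_KEY_' . time() . '%';
}

function dehydrate(string $text, string $search, string $tag): string
{
    // 1. Hide any literal placeholder already present in the user input.
    $text = str_replace('%WEBSITE_URL%', $tag, $text);
    // 2. Swap the site URL for the placeholder.
    return str_replace($search, '%WEBSITE_URL%', $text);
}

function rehydrate(string $dehydrated, string $replace, string $tag): string
{
    // 1. Swap the placeholder for the new site URL.
    $text = str_replace('%WEBSITE_URL%', $replace, $dehydrated);
    // 2. Restore the literal placeholders hidden during dehydration.
    return str_replace($tag, '%WEBSITE_URL%', $text);
}

$tag        = makeTemporaryTag();
$text       = 'Visit [my page](http://example.com/posts/). Placeholders are %WEBSITE_URL%';
$dehydrated = dehydrate($text, 'http://example.com', $tag);
$rehydrated = rehydrate($dehydrated, 'http://another-site.net', $tag);
// 'Visit [my page](http://another-site.net/posts/). Placeholders are %WEBSITE_URL%'
```

This sketch does not escape '%' at all, so it also handles the %http://example.com% input from the question. It is not bulletproof, though: input that puts a partial placeholder directly in front of the URL, such as %WEBSITE_URLhttp://example.com, would still be misread on rehydration.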
The only possible collision would occur if someone entered %SOME_INTERNAL_KEY_00000% with a timestamp matching the dehydration time, which is very unlikely. If dehydration and rehydration run in different PHP processes (different apps or servers), you only need to transfer the temporary tag(s) along with the dehydrated text.
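One way to transfer the tag with the text is to bundle both into a single payload. The sketch below assumes JSON as the transport format and uses hypothetical key names ('tag', 'text'); the tag value is only an example.

```php
<?php
// Sending side: bundle the temporary tag with the dehydrated text.
$tag     = '%SOME_INTERNAL_KEY_1700000000%'; // example value only
$payload = json_encode([
    'tag'  => $tag,
    'text' => 'Visit [my page](%WEBSITE_URL%/posts/). Placeholders are ' . $tag,
], JSON_THROW_ON_ERROR);

// Receiving side: decode, rehydrate, then restore the literal placeholders.
$data       = json_decode($payload, true, 512, JSON_THROW_ON_ERROR);
$text       = str_replace('%WEBSITE_URL%', 'http://another-site.net', $data['text']);
$rehydrated = str_replace($data['tag'], '%WEBSITE_URL%', $text);
// 'Visit [my page](http://another-site.net/posts/). Placeholders are %WEBSITE_URL%'
```

JSON_THROW_ON_ERROR requires PHP 7.3 or later; on older versions, check json_last_error() instead.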
This does not answer the question directly, but would simply making the links relative not solve your scenario?
When you hover over the markdown link:
Visit the page my page
You see that it links to "https://stackoverflow.com/posts/" (it even matches the current protocol).
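If the stored text already contains absolute links, they could be converted to root-relative ones before saving. This is a rough sketch, assuming simple inline Markdown links without titles; makeLinksRelative is a hypothetical helper, not an existing function.

```php
<?php
// Hypothetical helper: strips $baseUrl from inline Markdown link targets,
// leaving root-relative paths that resolve against whatever site renders them.
function makeLinksRelative(string $markdown, string $baseUrl): string
{
    // Matches "](<baseUrl>)" or "](<baseUrl>/path)"; the URL must be followed
    // by '/' or ')', so "http://example.community" is left alone.
    $pattern = '/\]\(' . preg_quote($baseUrl, '/') . '(\/[^)\s]*)?\)/';

    return preg_replace_callback($pattern, function (array $m): string {
        $path = $m[1] ?? '/'; // a bare base URL becomes the site root
        return '](' . $path . ')';
    }, $markdown);
}

$relative = makeLinksRelative('Visit [my page](http://example.com/posts/)', 'http://example.com');
// 'Visit [my page](/posts/)'
```

Only link targets are rewritten, so the URL appearing elsewhere in the text (for example as plain prose) is left unchanged.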