Hello, I have code that copies the HTML from an external URL and echoes it on my page.
Some of the HTML contains links and/or picture SRC attributes.
I need some help converting them from absolute URLs to relative URLs inside $data.
For example, the HTML contains an href:
<a href="https://www.trade-ideas.com/products/score-vs-ibd/" >
or an SRC:
<img src="http://static.trade-ideas.com/Filters/MinDUp1.gif">
I would like to keep only the subdirectory:
/products/score-vs-ibd/
/Filters/MinDUp1.gif
Maybe with preg_replace, but I'm not familiar with regular expressions.
This is my original code, which works very well, but now I'm stuck on truncating the links.
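<?php
// Use the first tag of the current WordPress post as the ticker symbol.
$post_tags = get_the_tags();
$tag = '';
if ( is_array( $post_tags ) && $post_tags ) {
    $tag = $post_tags[0]->name;
}
// Fetch the remote page; urlencode() keeps the query string valid.
$html = file_get_contents('https://www.trade-ideas.com/ticky/ticky.html?symbol=' . urlencode($tag));
// Cut out the block between the opening div and the closing marker comment.
$start = strpos($html, '<div class="span3 height-325"');
$end = strpos($html, '<!-- /span -->', $start);
$data = substr($html, $start, $end - $start);
echo $data;
?>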
2 Answers
Here is the code:
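A minimal sketch of a function with the behavior shown in the example below. The body is an assumption built on PHP's parse_url(), not necessarily the implementation this answer used; it returns the path of an absolute URL without the leading slash.

```php
<?php
// Hypothetical reconstruction: only the name getUrlPaths and the example
// input/output come from the answer; the body is an assumption.
function getUrlPaths(string $url): string
{
    // parse_url() with PHP_URL_PATH returns the path component,
    // null when the URL has no path, or false for a malformed URL.
    $path = parse_url($url, PHP_URL_PATH);

    // Drop the leading slash so the result matches the example output.
    return is_string($path) ? ltrim($path, '/') : '';
}

echo getUrlPaths("http://myassets.com:80/files/images/image.gif");
// prints: files/images/image.gif
```

If the leading slash should be kept (as in /Filters/MinDUp1.gif from the question), the ltrim() call can simply be removed.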
Example: getUrlPaths("http://myassets.com:80/files/images/image.gif") returns files/images/image.gif

You can locate all the URLs in the HTML string with a regex using preg_match_all(). The regex will capture both the entire URL and the path/query string for every occurrence of ="http://domain/path" or ='https://domain/path?query' (http/https, single or double quotes, with/without query string). Then you can just use str_replace() to update the HTML string.

Note, this will change all absolute URLs enclosed in quotes immediately following an =.
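The approach described above can be sketched roughly as follows. The pattern and the helper name absoluteToRelative are assumptions for illustration, not the answer's original code. The sketch replaces each whole ="URL" match rather than the bare URL, so identical URLs appearing in plain text are left alone.

```php
<?php
// Hypothetical helper: rewrites quoted absolute http/https URLs that
// directly follow "=" so that only the path and query string remain.
function absoluteToRelative(string $html): string
{
    // Assumed pattern (not the answer's original regex):
    //   group 1 = opening quote, group 2 = entire URL,
    //   group 3 = path plus optional query string.
    // URLs without any path (e.g. ="http://domain") are not matched.
    $pattern = '~=(["\'])(https?://[^/"\'?#\s]+(/[^"\']*))\1~i';

    if (!preg_match_all($pattern, $html, $matches)) {
        return $html; // nothing to rewrite, or the regex failed
    }

    // Build one replacement per match: ="path" with the same quote style.
    $replace = [];
    foreach ($matches[0] as $i => $whole) {
        $quote = $matches[1][$i];
        $replace[] = '=' . $quote . $matches[3][$i] . $quote;
    }

    return str_replace($matches[0], $replace, $html);
}

echo absoluteToRelative('<img src="http://static.trade-ideas.com/Filters/MinDUp1.gif">');
// prints: <img src="/Filters/MinDUp1.gif">
```

In the question's script this would be called as echo absoluteToRelative($data); in place of echo $data;. To limit the rewrite to links and images only, the leading = in the pattern could be narrowed to something like (?:href|src)=, which is again an assumption and not part of the original answer.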