
Here is an example URL:
https://www.finn.no/car/used/search.html?orgId=3553552&sort=PUBLISHED_DESC

On this page the advertisements are stored in HTML tags. I need to collect them each time a page on my site is loaded and display them to the visitor, while also changing some style options such as the background and how they appear in general on my website.
There is also a pagination option, so I would need to transfer that too.

The only option this marketplace offers is an iframe, which looks very bad in 2023.

Address of the site where this will be posted: https://bbvest.no

I tried this code with no success:

<?php
    $url="https://www.finn.no/car/used/search.html?orgId=3553552&sort=PUBLISHED_DESC";
    $html=file_get_contents($url);
        $doc = new DOMDocument();
    $doc->loadHTML($html);
    $div=$doc->getElementsByClassName("ads__unit");
        
        
?>

<div><?php echo $div; ?></div>

Thanks for any help.

2

Answers


  1. Chosen as BEST ANSWER

    I used this syntax. It takes less time to load, grabs all the content, and places it nicely. By importing the CSS I was able to get everything I wanted. Now I can go ahead and make editable CSS styles and other settings.

    The plan is to turn this into a working WP/Joomla plugin.

    <?php
    $merchantID = '3553552';
    $finn_link = 'https://www.finn.no/car/used/search.html?orgId=' . $merchantID;
    $finnTagName = 'article';
    $finnAttrName = 'class';
    $finnAttrValue = 'ads__unit';
    
    $finnDom = new DOMDocument;
    $finnDom->preserveWhiteSpace = false;
    @$finnDom->loadHTMLFile($finn_link);
    
    $finnHtml = getTags( $finnDom, $finnTagName, $finnAttrName, $finnAttrValue );
    
    function getTags( $finnDom, $finnTagName, $finnAttrName, $finnAttrValue ){
        $finnHtml = '';
        $domxpath = new DOMXPath($finnDom);
        $newDom = new DOMDocument;
        $newDom->formatOutput = true;
    
        $filtered = $domxpath->query("//$finnTagName" . '[@' . $finnAttrName . "='$finnAttrValue']");
        // $filtered =  $domxpath->query('//div[@class="className"]');
        // '//' when you don't know 'absolute' path
    
        // since above returns DomNodeList Object
        // I use following routine to convert it to string(html); copied it from someone's post in this site. Thank you.
        $i = 0;
        while( $myItem = $filtered->item($i++) ){
            $node = $newDom->importNode( $myItem, true );    // import node
            $newDom->appendChild($node);                    // append node
        }
        $finnHtml = $newDom->saveHTML();
        return $finnHtml;
    }
    
    ?>
    <?php echo $finnHtml; ?>
    

  2. The DOMDocument class has no getElementsByClassName method.
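
    Since DOMDocument offers no lookup by class, one common workaround is an XPath query through DOMXPath. Below is a minimal sketch of that idea, not a definitive implementation. The helper name findByClass and the sample markup are made up for illustration; only the ads__unit class and the article tag come from the question. It matches the class as a whole token, so an element carrying several classes is still found.

    ```php
    <?php
    // Hypothetical helper (not part of DOMDocument): returns the outer HTML of every
    // <$tag> element whose class attribute contains $class as a whole token.
    function findByClass(DOMDocument $doc, string $tag, string $class): array
    {
        $xpath = new DOMXPath($doc);
        // Pad the class list with spaces so "ads__unit" does not match "ads__unit-large".
        $query = sprintf(
            "//%s[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
            $tag,
            $class
        );
        $found = [];
        foreach ($xpath->query($query) as $node) {
            $found[] = $doc->saveHTML($node); // serialize just this node
        }
        return $found;
    }

    // Made-up sample markup standing in for the fetched page.
    $sample = '<div><article class="ads__unit ads__unit--wide"><h2>Car A</h2></article>'
            . '<article class="other"><h2>Not an ad</h2></article>'
            . '<article class="ads__unit"><h2>Car B</h2></article></div>';

    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // HTML5 tags such as <article> trigger libxml warnings
    $doc->loadHTML($sample);
    libxml_clear_errors();

    $ads = findByClass($doc, 'article', 'ads__unit');
    echo count($ads), " ads found\n";
    ```

    With a real page, the $sample string would be replaced by the HTML returned from file_get_contents($url).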

    To get the text and image:

    <?php
    $url = "https://www.finn.no/car/used/search.html?orgId=3553552&sort=PUBLISHED_DESC";
    $html = file_get_contents($url);
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // suppresses warnings such as "DOMDocument::loadHTML(): Tag finn-topbar invalid in Entity"
    $doc->loadHTML($html);
    
    $arts = $doc->getElementsByTagName('article'); // every <article> element
    $display = "";
    foreach ($arts as $art) {
        // text of the article, escaped because it is echoed into HTML below
        $display .= htmlspecialchars(trim($art->textContent)) . "<br>";
        // first <img> inside this article (not inside the whole document)
        $img = $art->getElementsByTagName('img')->item(0);
        if ($img !== null) {
            $display .= htmlspecialchars($img->getAttribute('src')) . "<br>"; // its src attribute
        }
    }
    ?>
    
    <div><?php echo $display; ?></div>
    

    Or try a regex with preg_match_all:

    <?php
        $str = file_get_contents('https://www.finn.no/car/used/search.html?orgId=3553552&sort=PUBLISHED_DESC');
        // The "s" modifier lets . match newlines; (?: [^"]*)? allows extra classes after ads__unit
        preg_match_all('#<article class="ads__unit(?: [^"]*)?">.*?</article>#s', $str, $matches);
        $div = "";
        // $matches[0] holds the full matches, one per article
        foreach($matches[0] as $match){
            $div .= $match;
        }
    ?>
    <div><?php echo $div ?></div>
    

    or

    <?php
        $str = file_get_contents('https://www.finn.no/car/used/search.html?orgId=3553552&sort=PUBLISHED_DESC');
        
        $div = "";
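        // The "s" modifier lets . match newlines; the greedy (.*) runs to the last </div> on the page, so trim the result as needed
        if(preg_match('#<div class="ads(?: [^"]*)?">(.*)</div>#s', $str, $m)){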
            $div .= $m[0];
        } else {
            echo 'Regex syntax has to be improved to your search criteria'.PHP_EOL;
        }
    ?>
    
    <div><?php echo $div; ?></div>
    