
I want to load a CSV file, extract the URLs from it, fetch the title tag for each URL, and write each URL with its corresponding title to a new CSV. When I write the data, all the URLs are listed, but every row shows only the title of the last URL. I have tried several approaches but have not been able to fix this.

Here is my code:

<?php
ini_set('max_execution_time', '300'); //300 seconds = 5 minutes
ini_set('max_execution_time', '0');

include('simple_html_dom.php');

// if (isset($_POST['resurl'])) {
//     $url = $_POST['resurl'];


if (($csv_file = fopen("old.csv", "r", 'a')) !== FALSE) {

    $arraydata = array();
    while (($read_data = fgetcsv($csv_file, 1000, ",")) !== FALSE) {
        $column_count = count($read_data);

        for ($c = 0; $c < $column_count; $c++) {

            array_push($arraydata, $read_data[$c]);
        }
    }


    fclose($csv_file);
}
$title = [];

foreach ($arraydata as $ad) {
    $ard = [];
    $ard = $ad;
    $html = file_get_html($ard);

    if ($html) {
        $title = $html->find('title', 0)->plaintext;
        // echo '<pre>';
        // print_r($title);


    }
}


$ncsv = fopen("updated.csv", "a");
$head = "Url,Title";

fwrite($ncsv, "\n" . $head);

foreach ($arraydata as $value) {
    // $ar[]=$value;
    $csvdata = "$value,$title";
    fwrite($ncsv, "\n" . $csvdata);
}

fclose($ncsv);

2 Answers


  1. Chosen as BEST ANSWER

    I was able to solve it finally.

    Here is the updated code:

    <?php
    ini_set('max_execution_time', '300'); //300 seconds = 5 minutes
    ini_set('max_execution_time', '0');
    
    include('simple_html_dom.php');
    
    // if (isset($_POST['resurl'])) {
    //     $url = $_POST['resurl'];
    
    
    if (($csv_file = fopen("ntsurl.csv", "r")) !== FALSE) {
    
        $arraydata = array();
        while (($read_data = fgetcsv($csv_file, 1000, ",")) !== FALSE) {
            $column_count = count($read_data);
    
            for ($c = 0; $c < $column_count; $c++) {
    
                array_push($arraydata, $read_data[$c]);
            }
        }
    
    
        fclose($csv_file);
    }
    
    
    // Open the output file once and write the header row
    $ncsv = fopen("ntsnew.csv", "a");
    $head = "Website Url,title";
    fwrite($ncsv, "\n" . $head);
    
    foreach ($arraydata as $ad) {
        $html = file_get_html($ad);
    
        if ($html) {
            $title = $html->find('title', 0)->plaintext;
            echo '<pre>';
            print_r($title);
    
            // Write each URL together with its own title inside the loop
            $csvdata = "$ad,$title";
            fwrite($ncsv, "\n" . $csvdata);
        }
    }
    
    fclose($ncsv);
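    
    For anyone wondering why the first version showed only the last title: as far as I can tell, $title was overwritten on every pass of the first loop, so the second loop saw only the final value. Here is a minimal sketch of the difference. The $pages array and its URLs are made up and stand in for the real file_get_html() calls.

    ```php
    <?php
    // Hypothetical stand-in for file_get_html(): maps each URL to a title.
    $pages = [
        'https://example.com/a' => 'Page A',
        'https://example.com/b' => 'Page B',
    ];

    // Original pattern: $title is overwritten on every pass,
    // so only the last value is left when the loop ends.
    $title = '';
    foreach ($pages as $url => $pageTitle) {
        $title = $pageTitle;
    }
    $overwritten = [];
    foreach (array_keys($pages) as $url) {
        $overwritten[] = [$url, $title];
    }

    // Fixed pattern: pair each URL with its own title inside the same loop.
    $paired = [];
    foreach ($pages as $url => $pageTitle) {
        $paired[] = [$url, $pageTitle];
    }

    print_r($overwritten);
    print_r($paired);
    ```

    In the first pattern every row ends up with the same title, which matches what I was seeing in the CSV.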
    

  2. I’ve changed the code so that you write the CSV file as you read the HTML pages. This saves having another loop and an extra array of titles.

    I’ve also changed it to use fputcsv to write the data out, as it sorts out things like escaping values.

    // Open file, using w to clear the old file down
    $ncsv = fopen('updated.csv', 'w');
    $head = 'Url,Title';
    
    // Write the header row once, followed by a line break
    fwrite($ncsv, $head . PHP_EOL);
    foreach ($arraydata as $ad) {
        $html = file_get_html($ad);
    
        // Fetch title, or set to blank if html is not loaded
        if ($html) {
            $title = $html->find('title', 0)->plaintext;
        } else {
            $title = '';
        }
        // Write record out
        fputcsv($ncsv, [$ad, $title]);
    }
    
    fclose($ncsv);
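    
    To show what the escaping point means in practice, here is a small sketch that writes a title containing a comma and double quotes with fputcsv and reads it back with fgetcsv. The URLs and titles are invented for illustration, and it writes to an in-memory stream rather than updated.csv.

    ```php
    <?php
    // Hypothetical rows: the second title contains a comma and double quotes.
    $rows = [
        ['https://example.com/a', 'Plain title'],
        ['https://example.com/b', 'Title, with "quotes"'],
    ];

    // Write to an in-memory stream so the sketch does not touch the disk.
    $out = fopen('php://memory', 'w+');
    fputcsv($out, ['Url', 'Title'], ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($out, $row, ',', '"', '\\');
    }

    // Read the rows back to confirm each value survives the round trip.
    rewind($out);
    $parsed = [];
    while (($fields = fgetcsv($out, 0, ',', '"', '\\')) !== false) {
        $parsed[] = $fields;
    }
    fclose($out);

    print_r($parsed);
    ```

    Building the line by hand as "$url,$title" would split that second title into two columns, which is why fputcsv is the safer choice here.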
    