
This is my first time using curl_multi_init() so I’m probably misunderstanding something. Learning to use it properly is more important to me than solving my problem because this particular function will solve a lot of my problems in future.

This particular call is for uploading Etsy photos. Etsy documentation for this call here.

It works fine in Postman. The code snippet Postman generates for "PHP – cURL" works fine. It keeps working fine even after my edits to it.

Trouble is, I’ve got well over a thousand high resolution images to upload, so running the entire snippet from start to finish, then looping it a thousand times will time out no matter how generous my php.ini settings.

So, line by line, I merged the existing code with a synchronous snippet, and I must have done something wrong. This example is almost exactly the live code. I’ve just deleted/simplified irrelevant things and redacted personal information. (Hopefully I didn’t delete/simplify the bug.):

Edit

This code works when limited to 7 calls. This is a very recent discovery, but absolutely critical to solving the question overall.

<?php
include_once 'databaseStuff.php';
include_once 'EtsyTokenStuff.php';
$result = mysqli_query($conn, "SELECT product, listing_id, alt_text, dataStuff;");
$multiCurl = [];
$multiResult = [];
$multiHandle = curl_multi_init();
if (mysqli_num_rows($result) > 0){
    while ($row = mysqli_fetch_assoc($result)){
        for($image = 1; $image <=2; $image++){
            $multiCurl[$row['product'] . "_" . $image] = curl_init();
            curl_setopt_array($multiCurl[$row['product'] . "_" . $image], 
                array(
                    CURLOPT_URL => "https://openapi.etsy.com/v3/application/shops/$myShopNumber/listings/" . $row['listing_id'] . "/images",
                    CURLOPT_RETURNTRANSFER => true,
                    CURLOPT_ENCODING => '',
                    CURLOPT_MAXREDIRS => 10,
                    CURLOPT_TIMEOUT => 0,
                    CURLOPT_FOLLOWLOCATION => true,
                    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
                    CURLOPT_CUSTOMREQUEST => 'POST',
                    CURLOPT_POSTFIELDS => array(
                        "image" => new CURLFILE(
                            [
                                1 => "img/imagePathStuff/" . $row['product'] . ".jpg",
                                2 => "img/differentImagePathStuff/" . $row['product'] . ".jpg"
                            ][$image]
                        ),
                        // "listing_image_id" =>,
                        "rank" => $image,
                        "overwrite" => true,
                        // "is_watermarked" =>,
                        "alt_text" => $row['alt_text']
                    ),
                    CURLOPT_HTTPHEADER => array(
                        "x-api-key: $myAPIKey",
                        "authorization: Bearer {$etsyAccessToken}"
                    ),
                )
            );
            curl_multi_add_handle($multiHandle, $multiCurl[$row['product'] . "_" . $image]);
        }
    }
    $index = null;
    do {
        curl_multi_exec($multiHandle, $index);
    } while($index > 0);
    foreach($multiCurl as $k => $curlHandle){
        $multiResult[$k] = curl_multi_getcontent($curlHandle);
        curl_multi_remove_handle($multiHandle, $curlHandle);
    }
    curl_multi_close($multiHandle);
}

Once it starts working I’ll probably block it out into functions, but I prefer to edit broken code in this format and add the function calls later.

Newer Insights

Having never worked with these functions before, I’m not sure how they’re supposed to behave but the behaviour I’ve noticed:

  • If I limit the number of images uploaded to 7, everything works as intended. But if I run this code, no limit, even the first 7 images won’t connect with the server. When I limit to 8 or higher, I hit an internal service error, but I suspect that might be an issue with my sloppy code. I need to look over it a few more times to see why it always crashes at the exact same point.
  • No, it wasn’t sloppy code. Commenting out curl_multi_exec removes the error. Commenting out everything below except curl_multi_exec and its loop does not remove the error. Max calls seems to be at 7, no matter which code snippet I borrow and replace. I can’t even cause it to reduce to 6 with deliberately sloppy snippets. It’s always 7.
  • Opening php.ini and changing memory_limit = 256M to memory_limit = 512M not only fails to fix the problem, but makes the problem worse. Sending 7 results in an Internal Service Error. This was tested in the live environment, so I quickly reverted back to memory_limit = 256M. All damage caused was instantly repaired. I won’t be testing that much further if I don’t have to.

Older insights

  • The number of loops for the do-while loop varies from hundreds of thousands to millions while trying to upload 4 images. I suspect this is the correct number of loops, since everything else seems to work when it behaves this way. So now I know.
  • This exact code has an Etsy specific problem. Ignore this if you aren’t developing code for Etsy’s API, but Etsy doesn’t like it when you upload two photos to the same listing at the same time. Photos to different listings at the same time, however, is okay. So a loop that covers a single listing will not work.
  • Following the advice of @Kazz, while (false !== ($info = curl_multi_info_read($multiHandle))) { print_r($info); } returns Array ( [msg] => 1 [result] => 7 [handle] => Resource id #1009 ) for each item (with +1 to each Resource id for each result following). 7 corresponds with the error "CURLE_COULDNT_CONNECT".
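The Etsy-specific constraint noted above (no two simultaneous uploads to the same listing) can be handled by planning the batches before any handle is created. This is only a minimal sketch: planBatches and the job shape (an array with listing_id and rank keys) are hypothetical names for illustration, not part of Etsy's API or of the original code.

```php
<?php
/**
 * Greedy first-fit planner: place each job in the first batch that still has
 * room and does not already contain a job for the same listing.
 * Jobs for one listing keep their input order across batches.
 */
function planBatches(array $jobs, int $maxPerBatch): array
{
    if ($maxPerBatch < 1) {
        throw new InvalidArgumentException('maxPerBatch must be at least 1');
    }
    $batches    = []; // list of batches, each a list of jobs
    $listingsIn = []; // per batch: listing_id => true
    foreach ($jobs as $job) {
        $placed = false;
        foreach ($batches as $i => $batch) {
            $hasRoom    = count($batch) < $maxPerBatch;
            $sameListing = isset($listingsIn[$i][$job['listing_id']]);
            if ($hasRoom && !$sameListing) {
                $batches[$i][] = $job;
                $listingsIn[$i][$job['listing_id']] = true;
                $placed = true;
                break;
            }
        }
        if (!$placed) {
            // No suitable batch yet: open a new one.
            $batches[]    = [$job];
            $listingsIn[] = [$job['listing_id'] => true];
        }
    }
    return $batches;
}
```

Each returned batch can then be turned into curl handles and run on its own multi handle, so two images for one listing are never in flight together.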

Earlier insights

  • Although almost every change seems inconsequential, changing the URL to https://google.com causes everything to time out. Therefore, my code at least has access to the internet.
  • Visiting the correct url in browser gives an authentication error, as I’d expect.
  • All of the code executes, start to finish, no fatal errors.
  • The do-while loop executes once then loops once more. (Maybe it’s supposed to or maybe it’s supposed to loop once per photo. Couldn’t get that clarified anywhere.)
  • It’s supposed to update photos. Unfortunately the first test was on very minor edits, but trying again including a deliberately wrong photo I at least know that that particular photo didn’t update, so probably none of them updated.
  • curl_multi_getcontent($curlHandle) always returns an empty string
  • curl_multi_exec($multiHandle, $index) always returns 0 (previous claim that it was 1002 was incorrect. 1002 was actually the value of the second argument $index after running the function.)
  • This particular call normally has very detailed responses for 201 and at least returns the error for 400, 401, 403, 404, 409, and 500, but I don’t think my code is even going far enough to make the call. I haven’t even figured out how to get the response codes at all.
  • For a script that transfers well over one thousand high resolution images from my server to Etsy’s server, it certainly executes very fast.
  • The $multiHandle seems to work as intended. At the very least, a var_dump($multiHandle) reveals all the correct file names in there.
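On the open question above of how to get the response codes: one possible approach is to drain curl_multi_info_read() after the exec loop and ask each finished handle for its HTTP status with curl_getinfo(). The sketch below assumes $handles is the same key-to-handle array used when adding handles (like $multiCurl); collectResults is a hypothetical helper name.

```php
<?php
/**
 * Collect, for every finished transfer on $mh, the curl error code,
 * a readable error string, the HTTP status and the response body.
 * $handles maps the caller's keys to easy handles that were added to $mh.
 */
function collectResults($mh, array $handles): array
{
    $results = [];
    while (($info = curl_multi_info_read($mh)) !== false) {
        if ($info['msg'] !== CURLMSG_DONE) {
            continue; // only completed transfers carry a result
        }
        $ch  = $info['handle'];
        $key = array_search($ch, $handles, true);
        if ($key === false) {
            continue; // handle not known to the caller
        }
        $results[$key] = [
            'curl_errno' => $info['result'],               // 0 is CURLE_OK
            'curl_error' => curl_strerror($info['result']),
            'http_code'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
            'body'       => curl_multi_getcontent($ch),
        ];
    }
    return $results;
}
```

Called as collectResults($multiHandle, $multiCurl) before the handles are removed, this would show a curl_errno of 7 (CURLE_COULDNT_CONNECT) with an http_code of 0 when no connection was made, which explains an empty body.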

Here is a list of diagnostic functions I’ve tried and their outputs, again thanks to @Kazz for the functions.

  • while (false !== ($info = curl_multi_info_read($multiHandle))) { print_r($info); } returns Array ( [msg] => 1 [result] => 7 [handle] => Resource id #1009 ) for each item (with +1 to each Resource id for each result following)
  • print_r(dns_get_record('openapi.etsy.com', DNS_A)); returns Array ( [0] => Array ( [host] => e8520.b.akamaiedge.net [class] => IN [ttl] => 0 [type] => A [ip] => 104.127.77.191 ) )
  • var_dump(exec('ping -c 3 openapi.etsy.com')); returns string(0) ""
  • exec('ping -c 3 openapi.etsy.com', $output); var_dump($output); returns array(0) { }
  • exec('ping -n 3 openapi.etsy.com', $output); var_dump($output); returns array(0) { }
  • this whole thing returns "TCP/IP Connection OK. Attempting to connect to ‘104.127.77.191’ on port ’80’…OK. Sending HTTP GET request…OK. Reading response: HTTP/1.1 301 Moved Permanently Server: AkamaiGHost Content-Length: 0 Location: openapi.etsy.com Date: Tue, 21 Feb 2023 02:34:15 GMT Connection: close Closing socket…OK."

It wouldn’t surprise me if it’s a minor typo causing this. What is it?

4 Answers


  1. Chosen as BEST ANSWER

    This isn't the best answer, but it meets the minimum requirements of "an answer" and I don't want to keep shifting back the goal posts.

    The code runs fine 7 calls at a time. So with a few loops, I should be able to get a 7x improvement on the original code. (Will update if this doesn't work.)
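The "7 calls at a time, with a few loops" idea could look roughly like the sketch below. It assumes the easy handles are already configured and keyed like $multiCurl; runCurlBatch and runInBatches are hypothetical names, and the batch runner is passed in as a callable so the chunking can be exercised without touching the network.

```php
<?php
/**
 * Run one batch of easy handles to completion on its own multi handle
 * and return the response bodies, keyed like the input array.
 */
function runCurlBatch(array $handles): array
{
    $mh = curl_multi_init();
    foreach ($handles as $ch) {
        curl_multi_add_handle($mh, $ch);
    }
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            // Wait for socket activity (up to 1 second) instead of spinning.
            curl_multi_select($mh, 1.0);
        }
    } while ($active && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $key => $ch) {
        $bodies[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $bodies;
}

/**
 * Split $items into chunks of $batchSize (keys preserved) and pass each
 * chunk to $runBatch, merging the keyed results.
 */
function runInBatches(array $items, int $batchSize, callable $runBatch): array
{
    $all = [];
    foreach (array_chunk($items, $batchSize, true) as $chunk) {
        $all += $runBatch($chunk);
    }
    return $all;
}
```

With the original variables this would be called as runInBatches($multiCurl, 7, 'runCurlBatch'). Creating the handles per chunk rather than all up front would also keep fewer files and handles open at once.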

    Running with over 1000 calls at a time, curl_multi_exec behaves as if it never executed at all.

    Running with exactly 8 calls, I hit an Internal Service Error: "double free or corruption (out)", according to the CGI Error Log.

    The above error is reproduced each time I refresh the page, but I only seem to have gotten this one once: php: malloc.c:3722: _int_malloc: Assertion '(unsigned long) (size) >= (unsigned long) (nb)' failed. (The first ' was actually a `, but that causes a formatting error on this site, so I changed it.)

    Although only appearing once, it did appear while I was testing, so I'm certain it's relevant.

    It seems I've found a gap between "well behaving success" and "well behaving failure", but I'm not brave enough to guess why the gap exists or speculate whether or not someone will patch it out.

    I hope someone can explain to me why it only works for 7 and not 8. (Maybe it has something to do with php.ini, or CURLOPT.) I've updated the title accordingly to attract someone who might know the answer. But "only run it for 7 calls" is a valid answer to the question of how to get it working at all.


  2. This is just a bit of a guess, but if your problem is that you are timing out, then it seems that the following loop you coded may be the problem:

        $index = null;
        do {
            curl_multi_exec($multiHandle, $index);
        } while($index > 0);
    

    You are making repeated calls to curl_multi_exec, which is burning up CPU all the while you are waiting for all of your uploads to complete. You should instead only periodically be checking the status of your uploads and going into a wait state in between. This should reduce your total CPU time:

        while (TRUE) {
            $status = curl_multi_exec($multiHandle, $activeCount);
            if ($status == CURLM_OK && $activeCount) {
                // Wait some time before checking again:
                curl_multi_select($multiHandle, 1.0);
            }
            else {
                break;
            }
        }
    
  3. You’re probably tripping a rate limit, connection limit, anti-DDoS firewall, or something similar beyond 7 concurrent connections. Keep it under 8, with something like this (untested):

    <?php
    include_once 'databaseStuff.php';
    include_once 'EtsyTokenStuff.php';
    $result = mysqli_query($conn, "SELECT product, listing_id, alt_text, dataStuff;");
    $multiCurl = [];
    $multiResult = [];
    $multiHandle = curl_multi_init();
    $unemployed_workers = array();
    $employed_workers = array();
    $curl_responses = array();
    $max_workers = 7;
    for($i = 0; $i < $max_workers; ++$i){
        $unemployed_workers[] = curl_init();
    }
    $work = function()use(&$unemployed_workers, &$employed_workers, &$multiHandle, &$curl_responses){
        if(empty($employed_workers)){
            // nobody working, nothing to do..
            return;
        }
        for(;;){
            do{
                $err = curl_multi_exec($multiHandle, $running);
            } while($err == CURLM_CALL_MULTI_PERFORM);
            if($err != CURLM_OK){
                throw new RuntimeException("curl_multi_exec error {$err}: ". curl_multi_strerror($err));
            }
            if(count($employed_workers) > $running){
                // at least 1 worker has finished, process it
                break;
            } else {
                // no workers have finished, wait for activity
                curl_multi_select($multiHandle, 1);
            }
        }
        while($msg = curl_multi_info_read($multiHandle)){
            if($msg['msg'] != CURLMSG_DONE){
                // unknown message, ignore?
                continue;
            }   
            $result = $msg['result'];
            if($result != CURLE_OK){
                throw new Exception("curl error {$result}: ". curl_error($msg['handle']));
            }
            $url = curl_getinfo($msg['handle'], CURLINFO_EFFECTIVE_URL);
            // $msg holds the finished transfer
            $response = curl_multi_getcontent($msg['handle']);
            $curl_responses[$url] = $response;
            $key = array_search($msg['handle'], $employed_workers, true);
            if($key === false){
                throw new LogicException("Could not find worker");
            }
            // return the worker to the idle pool so it can be reused
            $unemployed_workers[] = $employed_workers[$key];
            unset($employed_workers[$key]);
            curl_multi_remove_handle($multiHandle, $msg['handle']);
        }
    };
    if (mysqli_num_rows($result) > 0){
        while ($row = mysqli_fetch_assoc($result)){
            for($image = 1; $image <=2; $image++){
                while(empty($unemployed_workers)){
                    $work();
                }
                $worker = array_pop($unemployed_workers);
                $employed_workers[] = $worker;
                curl_setopt_array($worker, 
                    array(
                        CURLOPT_URL => "https://openapi.etsy.com/v3/application/shops/$myShopNumber/listings/" . $row['listing_id'] . "/images",
                        CURLOPT_RETURNTRANSFER => true,
                        CURLOPT_ENCODING => '',
                        CURLOPT_MAXREDIRS => 10,
                        CURLOPT_TIMEOUT => 0,
                        CURLOPT_FOLLOWLOCATION => true,
                        CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
                        CURLOPT_POST => true,
                        CURLOPT_POSTFIELDS => array(
                            "image" => new CURLFILE(
                                [
                                    1 => "img/imagePathStuff/" . $row['product'] . ".jpg",
                                    2 => "img/differentImagePathStuff/" . $row['product'] . ".jpg"
                                ][$image] // selects the path for image 1 or image 2
                            ),
                            // "listing_image_id" =>,
                            "rank" => $image,
                            "overwrite" => true,
                            // "is_watermarked" =>,
                            "alt_text" => $row['alt_text']
                        ),
                        CURLOPT_HTTPHEADER => array(
                            "x-api-key: $myAPIKey",
                            "authorization: Bearer {$etsyAccessToken}"
                        ),
                    )
                );
                curl_multi_add_handle($multiHandle, $worker);
            }
        }
        while(!empty($employed_workers)){
            $work();
        }
        foreach($unemployed_workers as $worker){
            curl_close($worker);
        }
        curl_multi_close($multiHandle);
    }
    
  4. So, the first answer I posted improved things from 0 to 7, but now things are improved from 7 to… 90ish? I don’t think it consistently fails on the same number, and I’m not investigating further because only the first 10 calls per second are received by the server anyway.

    The second improvement came from updating PHP from version 7 to version 7.4.

    So, limiting curl_multi_exec to 10 calls, I can now loop out all the calls until it eventually hits a 503 error.

    Luckily, @Booboo’s solution fixed it.
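Since only about 10 calls per second were reaching the server, one way to pace batches of 10 is to make each batch take at least one second. This is a rough sketch under that assumption; throttleDelayMicros is a hypothetical helper, and the 10-per-second figure comes from the observation above, not from Etsy's published limits.

```php
<?php
/**
 * Return how many microseconds to sleep so that at least $minInterval
 * seconds pass between the start of one batch and the start of the next.
 * Timestamps are floats as returned by microtime(true).
 */
function throttleDelayMicros(float $batchStartedAt, float $now, float $minInterval = 1.0): int
{
    $remaining = $minInterval - ($now - $batchStartedAt);
    return $remaining > 0 ? (int) ceil($remaining * 1000000) : 0;
}

// Intended use around each batch (not executed here):
//   $start = microtime(true);
//   ... run one batch of up to 10 uploads ...
//   usleep(throttleDelayMicros($start, microtime(true)));
```

A batch that already took longer than a second gets no extra delay, so slow uploads are not penalised twice.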

    So, in summary, the correct solutions are:

    • Limit the number of calls manually
    • Update PHP to at least 7.4. (More recent versions are unavailable for me to test, but more recent may be better.)
    • Use @Booboo’s snippet to fix 503 errors. (Give it an upvote if you use it. It’s only fair.)
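Because the PHP upgrade made such a difference, it may be worth recording which PHP and libcurl builds a server is actually running before debugging further. A small sketch, assuming the curl extension is loaded; curlEnvironmentReport is a hypothetical name.

```php
<?php
/**
 * Report the PHP and libcurl versions in use, plus whether PHP is at
 * least 7.4 (the version that helped in this case).
 */
function curlEnvironmentReport(): array
{
    $curl = curl_version();
    return [
        'php'      => PHP_VERSION,
        'libcurl'  => $curl['version'],
        'ssl'      => $curl['ssl_version'],
        'meets_74' => version_compare(PHP_VERSION, '7.4.0', '>='),
    ];
}

// Example: print_r(curlEnvironmentReport());
```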