$ch = curl_init();
$post_field = 'ajax=SearchArticulo&cntrSgn=DeExMEkRabGEO396gOLDMqUZiXe2BibRjqgUXwZlQmMgrw4jJmdAwbUD11%2BddBhn&srcInicio=false&isSimple=false&codMarca=0&field=nombre&value=&oferta=false&pvpSubido=False&detallada=false&codPedido=';
$post_field .= '&cat1=5&cat2=95&cat3=0&token=&User=user&Pwd=password';
curl_setopt($ch, CURLOPT_URL, 'https://actibios.com/WebForms/Clientes/GenerarPedidosVentas_new.aspx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_field);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36');
curl_setopt ($ch, CURLOPT_REFERER, 'https://actibios.com/WebForms/Clientes/GenerarPedidosVentas_new.aspx');
curl_setopt($ch, CURLOPT_USERPWD, 'user:password');
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
curl_setopt($ch, CURLOPT_COOKIE, 'grpAct@Session=sessionhere;grpAct@CodEmpresa=1; grpAct@CodDelegacion=1; grpAct@Year=; grpAct@Version=2024.01.25.005; grpAct@User=user;grpAct@Pwd=password;');
curl_setopt($ch, CURLOPT_COOKIEJAR, $_SERVER['DOCUMENT_ROOT'].'/cookie_actibios.txt');
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Connection: Keep-Alive'
));
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$result = curl_exec($ch);
if (curl_errno($ch)) {
echo 'Error:' . curl_error($ch);
}
//curl_close($ch);
$dom = new DOMDocument();
$dom->loadHTML($result);
$tables = $dom->getElementsByTagName('table');
$table_array = array();
foreach ($tables as $table) {
$rows = $table->getElementsByTagName('tr');
foreach ($rows as $row) {
$cols = $row->getElementsByTagName('td');
$row_array = array();
foreach ($cols as $col) {
$row_array[] = $col->nodeValue;
}
$table_array[] = $row_array;
}
}
$products = [];
foreach ($table_array as &$item) {
// get description
curl_setopt($ch, CURLOPT_URL, 'https://actibios.com/WebForms/Clientes/Indicacion.aspx?cp='.$item[0]);
curl_setopt($ch, CURLOPT_POSTFIELDS, "ajax=GetIndicacion&codArticulo=".$item[0]);
curl_setopt($ch, CURLOPT_COOKIEJAR, $_SERVER['DOCUMENT_ROOT'].'/cookie_actibios.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, $_SERVER['DOCUMENT_ROOT'].'/cookie_actibios.txt');
curl_setopt ($ch, CURLOPT_REFERER, 'https://actibios.com/WebForms/Clientes/Indicacion.aspx');
curl_setopt($ch, CURLOPT_USERPWD, 'user:password');
curl_setopt($ch, CURLOPT_VERBOSE, true);
$page_content = preg_replace('/<(pre)(?:(?!<\/\1).)*?<\/\1>/s','',curl_exec($ch));
$desc = explode(':',$page_content);
// get description
$description = str_replace(",'marca'","",$desc[3]);
$description = str_replace("'", "", $description);
$name = str_replace("\xc2\xa0",' ',$item[1]);
$name = trim($name);
$products[] = [
'ref' => $item[0],
'name' => $name,
'cat1' => 'cat1',
'cat2' => 'cat2',
'image' => 'https://actibios.com/WebForms/Controls/imgArticulo.aspx?ca='.$item[0],
'stock' => $item[5],
'desc' => $description,
'brand' => $item[2],
'price1' => floatval($item[6]),
'price2' => floatval($item[7])
];
unset($item[8]);
unset($item[9]);
unset($item[10]);
}
//var_dump($products);
echo json_encode($products);
If I use this code, everything works fine, but the next day it no longer works.
How can I fix this?
I need to log in first and then parse the HTML table with the products.
I also tried writing the cookies to a file and reading them back from that file, but that doesn't work either:
curl_setopt($ch, CURLOPT_COOKIEJAR, $_SERVER['DOCUMENT_ROOT'].'/cookie_actibios.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, $_SERVER['DOCUMENT_ROOT'].'/cookie_actibios.txt');
The file is created and saved with this content:
actibios.com FALSE / FALSE 0 SERVERUSED srv2|Ziwe0|Ziwez
But it still didn't work.
I also tried adding the user and password, but the problem is the same.
If I change the Cookie value grpAct@Session=sessionhere, it works again.
2 Answers
OK, I guess I've solved the problem. My cookie.txt file was missing some keys and values, e.g. grpAct@CnfCookies=
I added the initial keys and values; they now get updated and I can access the data I need.
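The missing keys could also be seeded through cURL itself instead of editing cookie.txt by hand. This is only a minimal sketch: it assumes a Netscape cookie-file line is passed to CURLOPT_COOKIELIST, and both the helper name seedCookie and the value 'placeholder' are hypothetical (the real value is not given above).

```php
<?php
// Hypothetical helper: adds one cookie to the handle's cookie engine.
function seedCookie($ch, string $domain, string $name, string $value): void
{
    // Netscape cookie-file fields, tab separated:
    // domain, include-subdomains flag, path, secure flag, expiry (0 = session), name, value
    $line = implode("\t", [$domain, 'FALSE', '/', 'FALSE', '0', $name, $value]);
    curl_setopt($ch, CURLOPT_COOKIELIST, $line);
}

$ch = curl_init();
// An empty file name turns the cookie engine on without loading any file.
curl_setopt($ch, CURLOPT_COOKIEFILE, '');
// 'placeholder' is an assumption; use whatever value the server originally set.
seedCookie($ch, 'actibios.com', 'grpAct@CnfCookies', 'placeholder');

// List every cookie the engine currently knows about (no request is sent).
$known = curl_getinfo($ch, CURLINFO_COOKIELIST);
foreach ($known as $cookieLine) {
    echo $cookieLine, PHP_EOL;
}
```

If CURLOPT_COOKIEJAR is also set on the handle, cookies added this way are written to the jar file together with the ones the server sends.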
Let cURL handle the cookies for you: use CURLOPT_COOKIEFILE to load cookies and CURLOPT_COOKIEJAR to store them. Otherwise you have to parse the response headers manually and update the cookie values yourself.
Edit: cURL has something called the cookie engine, which basically follows the server's instructions about cookie management. Without it you are blindly sending the same hardcoded cookies, which are no longer valid. Two options enable the cookie engine: CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR. You can interact with the engine through the file content and through CURLOPT_COOKIELIST (and CURLOPT_COOKIESESSION, but let's leave that for now), nothing else. CURLOPT_COOKIE doesn't interact with the cookie engine at all. When you use both, it's like having a hardcoded cookie in the request plus the cookie engine data: the hardcoded cookies are not added to the cookie data file. The server also will not respond with Set-Cookie for them, because from its perspective your client is already aware of them, so they never reach the engine and ultimately never reach the cookie data file. CURLOPT_COOKIELIST, on the other hand, does interact with the engine: through this option you can permanently add, edit, or remove cookies in the cookie data file (and in requests, of course). I mention all this because it could be the cause of "it didn't work".
Usually the easy way is to log in with the CURLOPT_COOKIEJAR option set and then reuse the cookie data file for future requests; by logging in you get the exact cookie state the server expects. Only when you understand how the server manages the cookies can you skip this step (and hope it stays the same) by keeping prearranged cookies that the server expects. That is not really "fair use", though, because cookies are owned by the server: you are not supposed to manipulate them on your own or ignore the server's cookie management. That is why the cookie engine will save the day.
It is worth noting that the server has to support these long sessions (accepting old cookies as authorization); that is entirely the server's decision. When old cookies are not accepted, your only option is to log in again. Automatically detecting that you are not logged in and logging in when needed could be the only solution in that case.
I expect that some token or ID gets regenerated by the next day and your app doesn't adapt; that is why I suggested turning the cookie engine on.
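One way to make the app adapt is to detect a logged-out response and log in again before retrying. The sketch below is only an illustration: fetchWithRelogin and looksLoggedOut are hypothetical names, and the check "a logged-out page contains no table" is an assumption that would need to be verified against the real responses.

```php
<?php
// Assumption: a logged-out response contains no <table> at all.
function looksLoggedOut(string $html): bool
{
    return stripos($html, '<table') === false;
}

// Fetch once; if the session looks expired, log in and fetch one more time.
function fetchWithRelogin(callable $fetch, callable $login, callable $isLoggedOut): string
{
    $body = $fetch();
    if ($isLoggedOut($body)) {
        $login();          // refreshes the cookies in the cookie data file
        $body = $fetch();  // retry with the new session
        if ($isLoggedOut($body)) {
            throw new RuntimeException('Still not logged in after a fresh login');
        }
    }
    return $body;
}
```

Here $fetch and $login would wrap the actual cURL requests, both using the same cookie data file.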
You could use a tool like Wireshark or an intercepting proxy to inspect individual requests and responses and find the mismatch.
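A lighter-weight alternative to an external tool is to let cURL report the headers it sent and received. This is a rough sketch with hypothetical helper names (setCookieNames, fetchWithHeaders); comparing the Cookie header of a working day with that of a failing day may reveal which value changed.

```php
<?php
// Returns the cookie names found in Set-Cookie response header lines.
function setCookieNames(array $headerLines): array
{
    $names = [];
    foreach ($headerLines as $line) {
        if (stripos($line, 'Set-Cookie:') === 0) {
            $pair = trim(substr($line, strlen('Set-Cookie:')));
            $names[] = explode('=', $pair, 2)[0];
        }
    }
    return $names;
}

// Performs a GET and returns the request headers, response headers and body.
function fetchWithHeaders(string $url, string $cookieFile): array
{
    $received = [];
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLINFO_HEADER_OUT    => true, // remember the request headers that were sent
        CURLOPT_HEADERFUNCTION => function ($ch, string $line) use (&$received): int {
            $received[] = rtrim($line);
            return strlen($line); // tell cURL the whole line was consumed
        },
    ]);
    $body = curl_exec($ch);
    $sent = curl_getinfo($ch, CURLINFO_HEADER_OUT);
    unset($ch);
    return ['sent' => $sent, 'received' => $received, 'body' => $body];
}

// Usage (not executed here):
// $info = fetchWithHeaders('https://actibios.com/WebForms/Clientes/GenerarPedidosVentas_new.aspx', __DIR__ . '/cookie_actibios.txt');
// echo $info['sent'];
// print_r(setCookieNames($info['received']));
```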