I want to parse data from the site https://csfloat.com/search?def_index=4727, and I'm using Puppeteer.
const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto("https://csfloat.com/search?def_index=4727");
  const arr = await page.evaluate(() => {
    const elements = document.getElementsByClassName("price ng-star-inserted");
    const array = [];
    for (let i = 0; i < elements.length; i++) {
      array.push(elements[i].innerText);
    }
    return array; // must be returned; console.log here goes to the page console, not Node
  });
  console.log(arr);
  await browser.close();
})();
But the problem is that when I run this script, it opens its own browser and loads the page, where I am not authorized, so I can't parse the data. Even if I enter my login and password, I have to confirm the login in Steam. How can I do this from my own browser, where I am already authorized, or how else can I fix this problem? Maybe with another library?
2 Answers
You can always use your beloved browser's developer tools: open the Console tab and write your own script there. Or you can use the Recorder tab when you want to automate a routine task daily or hourly; you can access it by selecting the double chevron arrow on the tab bar. There you can automate clicks, scrolling, and even waiting for an element to exist and be visible. You can then export the recording as a Puppeteer script, if you like.
I hope this helps.
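As a rough sketch of the Console approach above, assuming the price elements still carry the class name from the question: paste something like this into the DevTools Console. The helper extractPrices is hypothetical, not part of any library.

```javascript
// Hypothetical helper: collect trimmed text from any array-like of elements.
function extractPrices(nodes) {
  return Array.from(nodes, (el) => el.innerText.trim());
}

// In the DevTools Console on the csfloat search page, you would then run:
// extractPrices(document.getElementsByClassName("price ng-star-inserted"));
```

Because you run it in your own browser, you are already logged in, so no automation of the Steam confirmation is needed.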
Edi gives some good suggestions, but to supplement those, here are a few other approaches. There’s no silver bullet in web scraping, so you’ll need to experiment to see what works for a particular site (I don’t have a Steam account).
- Launch Puppeteer with the userDataDir option pointing to a persistent profile directory, then run it once headfully with a long timeout or a REPL and log in manually. Kill the script without trying to automate the site yet. The session should be saved, so on subsequent runs you'll be pre-authorized and can automate as normal.
- Skip browser automation and replicate the site's underlying fetch call. A simple example of this strategy is here. The same caveats as above apply.
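A minimal sketch of the userDataDir approach, assuming a profile directory of "./csfloat-profile" (any writable path works; Puppeteer creates it if it doesn't exist):

```javascript
// Launch options that persist cookies and local storage between runs.
// The profile path "./csfloat-profile" is an arbitrary choice for this sketch.
function buildLaunchOptions(profileDir = "./csfloat-profile") {
  return {
    headless: false,         // headful so you can complete the Steam login manually on the first run
    userDataDir: profileDir, // session data is saved here and reused on later runs
  };
}

async function openCsfloat() {
  // require() is deferred so buildLaunchOptions stays usable without Puppeteer installed
  const puppeteer = require("puppeteer");
  const browser = await puppeteer.launch(buildLaunchOptions());
  const page = await browser.newPage();
  await page.goto("https://csfloat.com/search?def_index=4727");
  return { browser, page };
}
```

On the first run, complete the login (including the Steam confirmation) and exit; on later runs the saved profile should keep you authorized, and you can automate as normal.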