I am working with Node.js and the Puppeteer library to load a website and then check whether a certain text is displayed on the page. I would like to count the number of occurrences of this text, and I would like the search to behave the same way the Ctrl+F function does in Chrome or Firefox.
Here’s the code I have so far:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // How do I count the occurrences of the specific text here?
  await browser.close();
})();
Can someone please help me with a solution on how to achieve this? Any help would be greatly appreciated. Thanks in advance!
3 Answers
You can get all of the page's text and then run a regex or a simple string search over it.
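A minimal sketch of that idea, assuming that document.body.innerText is a close enough stand-in for "the text on the page" and that matches should be case-insensitive and non-overlapping (which is how Ctrl+F appears to behave by default, though that is an assumption). The helper names countOccurrences and countOnPage are made up for illustration:

```javascript
// Counts case-insensitive, non-overlapping occurrences of `needle` in `haystack`.
function countOccurrences(haystack, needle) {
  if (!needle) return 0; // an empty search term matches nothing
  const text = haystack.toLowerCase();
  const term = needle.toLowerCase();
  let count = 0;
  let index = text.indexOf(term);
  while (index !== -1) {
    count += 1;
    // Skip past the whole match so overlapping hits are not counted twice.
    index = text.indexOf(term, index + term.length);
  }
  return count;
}

// Loads `url` and counts `needle` in the rendered text of <body>.
async function countOnPage(url, needle) {
  // Required lazily so countOccurrences can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // innerText reflects rendered text, so hidden elements and scripts are left out.
    const text = await page.evaluate(() => document.body.innerText);
    return countOccurrences(text, needle);
  } finally {
    await browser.close();
  }
}

// Usage: countOnPage('https://example.com', 'example').then(console.log);
```

This will not see text inside form field values, iframes, or shadow roots, so it may undercount compared with the browser's own find bar.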
As I mentioned in a comment, the Ctrl+F algorithm may not be as simple as you presume, but you may be able to approximate it by making a list of all visible values and text contents, leaving out anything that belongs to style, script, or metadata elements.
Here’s a simple proof of concept:
Output:
Undoubtedly, this has some false positives and negatives, and I’ve only tested it on google.com. Feel free to post a counterexample and I’ll see if I can toss it in.
Also, since we run two separate queries and then combine and dedupe the results, the text isn't ordered the way it appears on the page. If this matters, you could query by the single selector "*, [value]" and use conditions to figure out which kind of element you're working with. I've assumed your final goal is just a true/false "does some text exist?" semantic.
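For readers who want something concrete, here is a rough sketch of the approach described in this answer. It is not the answerer's original snippet. It walks the elements in document order (the single-query variant), and the visibility test, the list of skipped tags, and the helper names collectVisibleStrings, findMatches, and pageContainsText are all assumptions made for illustration:

```javascript
// Tags whose text is never rendered (assumption: this list is not exhaustive).
const SKIPPED_TAGS = ['SCRIPT', 'STYLE', 'NOSCRIPT', 'TEMPLATE', 'HEAD', 'TITLE', 'META', 'LINK'];

// Runs inside the page via page.evaluate, so it only uses its argument and DOM globals.
function collectVisibleStrings(skippedTags) {
  const skip = new Set(skippedTags);
  const isVisible = (el) => {
    const style = getComputedStyle(el);
    return style.display !== 'none' &&
      style.visibility !== 'hidden' &&
      el.getClientRects().length > 0;
  };
  const results = [];
  for (const el of document.querySelectorAll('*')) {
    const tag = el.tagName.toUpperCase();
    if (skip.has(tag) || !isVisible(el)) continue;
    if (tag === 'INPUT' || tag === 'TEXTAREA') {
      // Form fields contribute their current value rather than child text.
      if (typeof el.value === 'string' && el.value.trim()) results.push(el.value.trim());
      continue;
    }
    // Direct text-node children only, so nested text is not collected twice.
    for (const node of el.childNodes) {
      const text = node.nodeType === Node.TEXT_NODE ? node.textContent.trim() : '';
      if (text) results.push(text);
    }
  }
  return results;
}

// Dedupes the collected strings and keeps those containing `needle` (case-insensitive).
function findMatches(strings, needle) {
  const term = needle.toLowerCase();
  if (!term) return [];
  return [...new Set(strings)].filter((s) => s.toLowerCase().includes(term));
}

// True/false "does some text exist?" check for a page.
async function pageContainsText(url, needle) {
  const puppeteer = require('puppeteer'); // required lazily; only needed for this function
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    const strings = await page.evaluate(collectVisibleStrings, SKIPPED_TAGS);
    return findMatches(strings, needle).length > 0;
  } finally {
    await browser.close();
  }
}

// Usage: pageContainsText('https://example.com', 'domain').then(console.log);
```

Because each text node is matched on its own, a phrase that spans several elements (for example, one word in bold) will be missed. That is one of the false negatives such an approximation has to live with.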