So I’m creating this scraping application that essentially uses a REST API for several entities on the server. The main functionality is just:
let buffer = []
for (const entity of entities) { // for...of iterates the values, assuming entities is an array
    let data = await scrapingProcessForAnEntity(entity);
    buffer.push(data)
}
I oversimplified the script because the scraping process and how the data is stored are not relevant. The point is that I have this function, scrapingProcessForAnEntity, that gets all the information I need and returns it in a Promise.
The thing is, since there are a lot of entities, I want to run the process for multiple entities at a time, and once one of the processes finishes, a new one takes its place. I ran some tests using an array of Promises and Promise.race(), but I can’t figure out how to remove the finished Promise from the array. I also can’t just run the process for all entities at once with Promise.all(), because the server can’t handle too many requests at once. It should be OK if I limit it to 3~5 entities.
My current implementation is:
let promises = []
let buffer = []
async function run() {
    for (const entity of entities) { // assuming entities is an array
        // await here so the next process only starts once there is a free slot
        await addPromise(() => scrapingProcessForAnEntity(entity))
    }
}
async function addPromise(prom) {
    if (promises.length >= 3) { // In this example I'm trying to make it run for up to 3 at a time
        await moveQueue()
    }
    promises.push(prom())
}
async function moveQueue() {
    if (promises.length < 3) {
        return
    }
    let data = await Promise.race(promises)
    buffer.push(data)
    promises.splice(promises.indexOf(data), 1) //<---- This is where I'm struggling
    // how can I remove the finished promise from the array?
}
Adding the data to the buffer is not done inside the Promise itself because there’s processing involved, and I’m not sure whether adding the data from 2 promises at the same time might cause an issue.
I also implemented a way to clear all the promises at the end. My only struggle is how to find which promise in the array has finished so that it can be replaced.
2 Answers
You can use the map method to create and start multiple promises at once. For example,
you can chunk the entities into groups of whatever size you want before creating the promises.
With this approach, an allEntities array holds the promises for one chunk, and Promise.all waits for them while they run in parallel.
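The chunking idea described above could be sketched roughly as follows. This is only an illustration, not the answer's original code: the helper names chunk and scrapeInChunks, the task parameter, and the default chunk size of 3 are assumptions made for the example.

```javascript
// Hypothetical helper: split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: run `task` for every entity, one chunk after another.
// `task` is assumed to be a function that returns a Promise
// (for example, scrapingProcessForAnEntity from the question).
async function scrapeInChunks(entities, task, size = 3) {
  const buffer = [];
  for (const group of chunk(entities, size)) {
    // map starts one promise per entity in this chunk
    const allEntities = group.map((entity) => task(entity));
    // Promise.all resolves with the results in the same order as `group`
    const results = await Promise.all(allEntities);
    buffer.push(...results);
  }
  return buffer;
}
```

It could then be called as `const buffer = await scrapeInChunks(entities, scrapingProcessForAnEntity, 3)`. Note that each chunk waits for its slowest request before the next chunk starts, so this limits concurrency but does not replace a finished process right away.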
Use Promise.all() instead of Promise.race(). It returns a single Promise that resolves when all of the input Promises have resolved.
This way the Promises in a batch all run at the same time, and the array is only spliced after every one of them has finished. It also removes the need to track which individual Promise has finished.
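Applied to the code from the question, this suggestion might look something like the sketch below. It is an illustration under assumptions, not the answerer's code: LIMIT, flushBatch, and the entities and scrape parameters of run are hypothetical names introduced here, and scrape stands in for scrapingProcessForAnEntity.

```javascript
const LIMIT = 3; // assumed batch size, matching the "up to 3 at a time" example
const promises = [];
const buffer = [];

// Start a new promise, first waiting for the current batch if it is full.
async function addPromise(prom) {
  if (promises.length >= LIMIT) {
    await flushBatch();
  }
  promises.push(prom());
}

// Wait for every promise in the current batch, store the results, then empty the array.
async function flushBatch() {
  const results = await Promise.all(promises);
  buffer.push(...results);
  promises.splice(0); // remove all finished promises in one go
}

// `scrape` is assumed to return a Promise for a single entity.
async function run(entities, scrape) {
  for (const entity of entities) {
    await addPromise(() => scrape(entity));
  }
  await flushBatch(); // collect the last, possibly partial, batch
  return buffer;
}
```

Because results are pushed to the buffer only after Promise.all resolves, no two promises write to it at the same time. The trade-off is that a new process starts only once the whole batch has finished, not as soon as a single one does.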