
I am working on a project where a Firestore Cloud Function creates a batch of documents from the results of an API endpoint we do not control. The endpoint returns IDs, which we use as the document IDs, but no timestamp information. Since we want a createdDate in our documents, we set it with admin.firestore.Timestamp.now().
On subsequent runs of the function, some of the documents will already exist, so a batch that uses create will fail on commit. If we use update in the batch instead, we either cannot include a timestamp or the existing timestamp will be overwritten. As a final requirement, we also update these documents from a web application to set properties such as a state, so we can’t restrict permissions on the documents to disallow updates completely.

What would be the best way to achieve this?
I am currently calling .create on each document without a batch, but I feel this is less performant, and I occasionally get Error: 4 DEADLINE_EXCEEDED in the function.

First prize would be a batch that can create or update the documents, but does not edit the createdDate field. I’m also hoping to avoid reading the documents first to save a read, but I’d be happy to add it in if it’s the best solution.
Thanks!

Current code is something like this:

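  // docRef is a DocumentReference; newDoc is the document data (including createdDate).
  const createDocPromise = docRef
    .create(newDoc)
    .catch(err => {
      // gRPC status 6 is ALREADY_EXISTS; the details check is kept as a fallback.
      const alreadyExists =
        err.code === 6 ||
        (err.details && err.details.includes('Document already exists'));
      if (!alreadyExists) {
        console.error('Error creating doc', err);
      }
      // Otherwise the doc already exists, so the error is ignored.
    });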

2 Answers


  1. Since you prefer to keep the batch and you want to avoid reading the documents, a possible solution would be to store the timestamps in a field of type Array. So, you don’t overwrite the createdDate field but save all the values corresponding to the different writes.

    When you read one of the documents, sort this array and take the oldest value: it is the very first timestamp that was saved and corresponds to the document's creation.

    This approach needs no extra writes or reads.
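
    A minimal sketch of this idea, assuming the Admin SDK: batch.set() with { merge: true } creates missing documents and leaves fields that are not in the payload (such as state) untouched, while FieldValue.arrayUnion() appends the current timestamp. The field name createdDates and both helper names are hypothetical, and the batch, document reference and FieldValue are passed in rather than imported.

    ```javascript
    // Hypothetical helper: queue a create-or-merge write on an existing batch.
    // `batch` is expected to be db.batch(), `docRef` a DocumentReference, and
    // `FieldValue` admin.firestore.FieldValue. `createdDates` is an assumed field name.
    function queueUpsert(batch, docRef, data, now, FieldValue) {
      batch.set(
        docRef,
        // arrayUnion appends `now` unless an equal value is already in the array.
        { ...data, createdDates: FieldValue.arrayUnion(now) },
        // merge: true keeps existing fields that are not part of this payload.
        { merge: true }
      );
    }

    // Hypothetical helper: pick the oldest entry of a non-empty array of
    // Firestore Timestamps (each exposes toMillis()).
    function earliestCreatedDate(createdDates) {
      return createdDates.reduce((oldest, current) =>
        current.toMillis() < oldest.toMillis() ? current : oldest
      );
    }

    // Usage inside the Cloud Function (not executed here):
    //   const batch = db.batch();
    //   const now = admin.firestore.Timestamp.now();
    //   results.forEach((r) =>
    //     queueUpsert(batch, db.doc(`col/${r.id}`), r, now, admin.firestore.FieldValue));
    //   await batch.commit();
    ```

    Note that the array gains one entry per run, and that any field present in the API payload still overwrites the stored value of that field.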

  2. This might not be possible with batched writes: set() will overwrite the existing document, update() will overwrite the timestamp, and create() will throw an error, as you’ve mentioned. One workaround is to call create() for each document and wait on them with Promise.allSettled(), which does not reject when some of the promises fail but reports each outcome instead.

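    // Assumes db = admin.firestore() and that this runs inside an async function.
    const results = []; // results from the API
    
    // Start one create() per document; each rejects if its document already exists.
    const promises = results.map((r) => db.doc(`col/${r.id}`).create(r));
    
    // allSettled waits for every promise and never rejects.
    const newDocs = await Promise.allSettled(promises);
    
    // Each status is either "fulfilled" or "rejected".
    newDocs.forEach((result) => console.log(result.status));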
    

    If a document already exists, create() rejects for it and the corresponding status is "rejected". This way you don’t have to read the documents first.
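
    Since a rejection can also come from a real failure (for example DEADLINE_EXCEEDED), it may help to separate "already exists" from everything else. The sketch below is one way to do that; summarizeCreates is a hypothetical helper, and it assumes each rejection reason carries the numeric gRPC status in a code property, with 6 meaning ALREADY_EXISTS.

    ```javascript
    // gRPC status code for ALREADY_EXISTS. Assumption: the error that create()
    // rejects with exposes this number as `code`.
    const ALREADY_EXISTS = 6;

    // Hypothetical helper: classify the array returned by Promise.allSettled().
    function summarizeCreates(settled) {
      const summary = { created: 0, skipped: 0, failed: [] };
      for (const result of settled) {
        if (result.status === 'fulfilled') {
          summary.created += 1; // document was newly created
        } else if (result.reason && result.reason.code === ALREADY_EXISTS) {
          summary.skipped += 1; // document existed, createdDate left untouched
        } else {
          summary.failed.push(result.reason); // genuine error, worth logging or retrying
        }
      }
      return summary;
    }

    // Usage (not executed here):
    //   const summary = summarizeCreates(await Promise.allSettled(promises));
    //   summary.failed.forEach((err) => console.error('Error creating doc', err));
    ```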


    Alternatively, you could store all the IDs in a single document or in the Realtime Database, filter out the ones you have already seen (this should cost only 1 read per invocation), and then add only the new data.
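
    A rough sketch of the filtering step, under stated assumptions: the index document path meta/seenIds, its ids field, and the selectUnseen helper are all hypothetical. Because only unseen IDs remain, a batch with create() becomes usable again. A single document is limited in size, so this only suits a bounded number of IDs.

    ```javascript
    // Hypothetical helper: keep only API results whose id is not yet known,
    // also dropping duplicates within the same API response.
    function selectUnseen(results, seenIds) {
      const seen = new Set(seenIds);
      const fresh = [];
      for (const r of results) {
        if (!seen.has(r.id)) {
          seen.add(r.id);
          fresh.push(r);
        }
      }
      return fresh;
    }

    // Usage inside the Cloud Function (not executed here; names are assumptions):
    //   const indexRef = db.doc('meta/seenIds');
    //   const snap = await indexRef.get();                  // the single read
    //   const fresh = selectUnseen(results, snap.exists ? snap.data().ids : []);
    //   if (fresh.length > 0) {
    //     const batch = db.batch();
    //     const now = admin.firestore.Timestamp.now();
    //     fresh.forEach((r) =>
    //       batch.create(db.doc(`col/${r.id}`), { ...r, createdDate: now }));
    //     batch.set(
    //       indexRef,
    //       { ids: admin.firestore.FieldValue.arrayUnion(...fresh.map((r) => r.id)) },
    //       { merge: true }
    //     );
    //     await batch.commit();
    //   }
    ```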
