Transactions and batched writes can be used to write multiple documents as a single atomic operation.
The documentation says that, using the Cloud Firestore client libraries, you can group multiple operations into a single transaction.
I cannot understand what "client libraries" means here, and whether it is correct to use transactions and batched writes within a Cloud Function.
For example: suppose the database already contains 3 documents (with doc IDs A, B, C), and I now need to insert 3 more (with doc IDs C, D, E). The Cloud Function should add only the new ones and send a push notification to the user telling them that 2 new documents are available.
A doc ID may already exist, and since I need to count how many documents are actually new (the ones that will be inserted), I need a way to read each doc ID first and check whether it exists. Hence, I'm wondering whether transactions fit Cloud Functions or not.
Also, each transaction or batch of writes can write to a maximum of 500 documents. Is there any way to overcome this limit within a Cloud Function?
2 Answers
Firestore transaction behaviour differs between the client SDKs (JS SDK, iOS SDK, Android SDK, …) and the Admin SDK (a set of server libraries), which is the SDK used in a Cloud Function. There are more explanations on the differences in the documentation.
Because of the type of data contention used in the Admin SDK, you can, with the getAll() method, retrieve multiple documents from Firestore and hold a pessimistic lock on all returned documents. So this is exactly the method you need to call in your transaction: you use getAll() to fetch documents C, D and E, detect that only C already exists, and therefore know that you only need to add D and E. Concretely, it could be something along the following lines:
To test it: create one of the C, D or E docs in the coltest collection, then create a doc in the tempo collection (just a simple way to trigger this test Cloud Function): the CF is triggered. Then look at the coltest collection: the two missing docs were created; and look at the CF log: counter = 2.

Concerning the 500-document limit: AFAIK the answer is no.
There also used to be a required one-second delay between 500-record chunks. I wrote this a couple of years ago. The script below reads the CSV file line by line, adding a set operation to a batch object for each line. A counter starts a new batch write every 500 objects, and async/await is used to rate-limit the writes to one commit per second. Finally, we notify the user of the write progress with console logging. I published an article on this here: https://hightekk.com/articles/firebase-admin-sdk-bulk-import
NOTE: In my case I am reading a huge flat text file (a manufacturer's part number catalog) for import. You can use this as a working template, though, and modify it to suit your data source. Also, you may need to increase the memory allocated to Node for this to run:
node --max_old_space_size=8000 app.js
The script looks like:
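A minimal sketch of the approach described above, assuming a local catalog.csv file with one part number per line and a hypothetical parts collection as the target (file name, collection, and fields are placeholders):

```javascript
const fs = require('fs');
const readline = require('readline');
const admin = require('firebase-admin');

admin.initializeApp();
const db = admin.firestore();

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function importCatalog() {
  const rl = readline.createInterface({
    input: fs.createReadStream('catalog.csv'),
    crlfDelay: Infinity,
  });

  let batch = db.batch();
  let counter = 0; // writes queued in the current batch
  let total = 0;   // total lines processed

  for await (const line of rl) {
    const partNumber = line.trim();
    if (!partNumber) continue;

    batch.set(db.collection('parts').doc(partNumber), { partNumber });
    counter++;
    total++;

    // Commit every 500 writes (the batch limit), then wait about one second
    // before starting the next chunk to rate-limit the import.
    if (counter === 500) {
      await batch.commit();
      console.log(`Committed ${total} records so far...`);
      await sleep(1000);
      batch = db.batch();
      counter = 0;
    }
  }

  // Commit any remaining writes in the final, partial batch.
  if (counter > 0) {
    await batch.commit();
  }
  console.log(`Done. Imported ${total} records.`);
}

importCatalog().catch(console.error);
```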