I am new to Firestore and building an event planning app, but I am unsure what the best way to structure the data is, taking into account the speed of queries and Firestore costs based on reads etc. In both options I can think of, I have a users collection and an events collection.
Option 1:
In the users collection, each user has an array of eventIds for events they are hosting and also events they are attending. Then I query the events collection for those eventIds of that user so I can list the appropriate events to the user.
Option 2:
For each event in the events collection, there is a hostId and an array of attendeeIds. So I would query the events collection for events where hostId === user.id and where attendeeIds.includes(user.id).
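To make the two options concrete, here is a minimal sketch of the document shapes and of what each lookup selects. The interface and function names are hypothetical, and the lookups run against an in-memory array purely for illustration; in Firestore they would be queries backed by indexes, not scans over every event.

```typescript
// Hypothetical document shapes; field names follow the question.
interface UserDoc {
  id: string;
  eventIds: string[]; // option 1: events the user hosts or attends
}

interface EventDoc {
  id: string;
  hostId: string; // option 2: the hosting user's ID
  attendeeIds: string[]; // option 2: IDs of attending users
}

// Option 1: select the events whose IDs are stored on the user document.
function eventsForUserOption1(user: UserDoc, events: EventDoc[]): EventDoc[] {
  return events.filter((e) => user.eventIds.indexOf(e.id) !== -1);
}

// Option 2: select the events that name the user as host or attendee.
function eventsForUserOption2(userId: string, events: EventDoc[]): EventDoc[] {
  return events.filter(
    (e) => e.hostId === userId || e.attendeeIds.indexOf(userId) !== -1
  );
}
```

As long as the two representations are kept in sync, both lookups return the same events; the difference lies in where the relationship is stored.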
I am trying to figure out which is best from a performance and a cost perspective, taking into account that there could be thousands of events to iterate through. Is it better to search the events collection by eventId, as it will stop iterating when all events are found, or is that slow since it will be searching for one eventId at a time? Maybe there is a better way to do this that I haven’t mentioned above. Would really appreciate the feedback.
2 Answers
I would recommend going with option 2 because it might save you some reads:
With option 1, you first have to read the user's document and then run another query such as where(documentId(), "in", [...userEvents]), or fetch each of the events individually if you have many.
With option 2, security rules can check the host directly against the event document, for example with resource.data.hostId == request.auth.uid.
When using the first option, you’ll have to read the user’s document in security rules to check whether the event ID is present in that user's events array (that may cost you another read). Check out the documentation for more information on billing.
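If you do go with option 1 and a user has many events, the ID list has to be split up, because Firestore caps how many values a single "in" filter accepts. A minimal sketch of that batching step follows; chunkIds is a hypothetical helper, and the default batch size of 10 is an assumption on the conservative side, so check the current limit in the documentation.

```typescript
// Split a list of event IDs into batches small enough for one "in" filter.
// The default size of 10 is an assumption; verify the current Firestore limit.
function chunkIds(ids: string[], size: number = 10): string[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const batches: string[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then be passed to its own where(documentId(), "in", batch) query, and the results merged on the client.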
In addition to @Dharmaraj's answer, please note that neither solution is better than the other in terms of performance. In Firestore, query performance depends on the number of documents you request (read), not on the number of documents you are searching through. It doesn’t really matter whether you request 10 documents from a collection of 100 documents or from a collection that contains 100 million documents; the response time will be the same.
From a billing perspective, yes, the first solution requires one additional document read, since you first need to read the user document. However, reading the array and getting all the corresponding events will also be very fast.
Please bear in mind that in the NoSQL world, we always structure a database according to the queries we intend to perform. So if a query returns the documents you’re interested in and produces the fewest reads, that’s the solution you should go with. Also remember that you’ll always pay a number of reads equal to the number of documents the query returns.
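As a rough illustration of the read counts discussed above, here is a small sketch that compares the two options. The function names are hypothetical, and the model is deliberately simplified: it counts one read per document returned plus the user document for option 1, and it ignores details such as minimum charges for empty queries or an event being returned by both the host query and the attendee query.

```typescript
// Option 1: one read for the user document, plus one per event fetched.
function estimateReadsOption1(eventCount: number): number {
  return 1 + eventCount;
}

// Option 2: one read per event returned by the host and attendee queries.
// Simplification: assumes no event appears in both result sets.
function estimateReadsOption2(hostedCount: number, attendingCount: number): number {
  return hostedCount + attendingCount;
}
```

For a user who hosts 2 events and attends 5, this model gives 8 reads for option 1 and 7 for option 2, which matches the single extra read for the user document.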
Regarding security, both solutions can be secured relatively easily. Now it’s up to you to decide which one works better for your use case.