In one collection, approximately 1,000 records are written per day per field user (roughly 10 field users in total). I have a query that fetches the records created during the current day, in order to track the points covered by the mobile users in the field:
const lastActividadGPS = await firestore
  .collection("Actividad_GPS")
  .where("instancia", "==", instancia)
  .where("user_id", "==", parseInt(ejecutivo.user_id))
  .where("fecha_hora", ">=", dayStart.toDate())
  .where("fecha_hora", "<=", dayEnd.toDate())
  .orderBy("fecha_hora", "desc")
  .get();
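The `dayStart`/`dayEnd` values above presumably come from a date library such as moment or dayjs (`.toDate()` suggests so). As a sanity check that the window covers exactly one day, here is a plain-JS equivalent; the function name is an assumption, not the asker's code. A window wider than intended would multiply the number of matched documents:

```javascript
// Hedged sketch (assumed helper, plain JS): compute the start and end of
// the current local day, equivalent to moment().startOf("day") / endOf("day").
function dayBounds(now = new Date()) {
  const dayStart = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 0, 0, 0, 0);
  const dayEnd = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 23, 59, 59, 999);
  return { dayStart, dayEnd };
}
```

If the real bounds span more than 24 hours (or use the wrong timezone), every extra day of data in the range is read and billed on each execution.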
The collection: (screenshot not included)
When the collection accumulates 100K+ documents and 1-3 users run this query (it executes internally from the web platform), up to 30M document reads are billed per day. If I delete all the data, it normalizes and I see at most 50K reads per day. I have not been able to identify the cause of the excessive reads when the collection holds a lot of data, and I do not know whether Firestore reads all the documents in order to sort them and only then applies the filters.
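A quick back-of-the-envelope check of the numbers in the question (the helper name is an assumption, just for illustration):

```javascript
// If the query matches only one user's single day (~1,000 docs),
// how many query executions per day would it take to reach 30M billed reads?
function executionsForReads(totalReads, docsPerQuery) {
  return Math.ceil(totalReads / docsPerQuery);
}

executionsForReads(30_000_000, 1000); // → 30000 executions per day
```

30,000 executions per day across 1-3 users would point at the query running far more often than expected (a polling timer, a re-render loop, or a listener being re-created), or at the filters matching far more than one day's ~1,000 documents; both are worth ruling out before blaming Firestore itself.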
We also tried adding a limit() to the query, but it made no difference.
2 Answers
Firestore queries execute against indexes that you define; it only allows queries that it can actually execute on those indexes, and you are only charged document reads (and bandwidth) for the documents that match the conditions in your query.
So if you are charged for more reads than you expect, it's because those documents are actually matched by the queries you execute.
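One way to verify how many documents the filters actually match, without paying to read them all, is a count aggregation query. This is a hedged sketch, assuming a recent firebase-admin / @google-cloud/firestore version that supports `count()`; the function name and parameters are mine, not from the question:

```javascript
// Hedged sketch: count how many documents today's filters match.
// An aggregation is billed at roughly 1 read per 1,000 matched index entries,
// instead of 1 read per matched document.
async function countMatchedToday(firestore, instancia, userId, dayStart, dayEnd) {
  const snap = await firestore
    .collection("Actividad_GPS")
    .where("instancia", "==", instancia)
    .where("user_id", "==", userId)
    .where("fecha_hora", ">=", dayStart)
    .where("fecha_hora", "<=", dayEnd)
    .count()
    .get();
  return snap.data().count;
}
```

If the count is much larger than the expected ~1,000 per user per day, the filters (not the billing) are the problem.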
I can add that time-series data deserves special handling; it comes down to hardware utilization and how the data is stored and accessed at the physical layer (MongoDB's documentation discusses this). To be clear, I'm not saying Firebase/Firestore isn't appropriate for your needs, just that time-series records call for special attention to data modeling.
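One common modeling pattern for this kind of workload (an assumption on my part, not the asker's schema) is bucketing: store all of a user's GPS points for one day inside a single document, so rendering the daily trace costs 1 read instead of ~1,000. A sketch, with hypothetical collection and field names:

```javascript
// Hedged sketch: derive a per-user, per-day document ID for a bucketed
// time-series collection (names are illustrative assumptions).
function dailyBucketId(userId, date) {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `gps_${userId}_${yyyy}-${mm}-${dd}`;
}

// Each GPS write then appends to that day's document, e.g. (firebase-admin):
// firestore.collection("Actividad_GPS_diaria")
//   .doc(dailyBucketId(uid, new Date()))
//   .set({ puntos: FieldValue.arrayUnion({ fecha_hora, lat, lng }) }, { merge: true });
```

At ~1,000 small points per user per day, one day's bucket stays comfortably under Firestore's 1 MiB document size limit, and the dashboard query becomes a single document get per user per day.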