
We have a collection that may contain hundreds of billions of documents, and we want to get a count of them.

When I use count() to get the number of documents:

ref = db.collection('my_collection').count()
print(ref.get())

it always returns an error like this:

raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.DeadlineExceeded: 504 Aggregation query timed out. Please try either limiting the entities scanned, or running with an updated index configuration.

It times out every time. Am I missing something? What’s the proper way to count a large collection?

Thanks!

2 Answers


  1. When dealing with a very large collection in Firestore, counting documents with count() can lead to timeouts or performance issues because the aggregation tries to count every document in the collection in one go, and Firestore’s query and data-retrieval operations are optimized for smaller result sets.

    First, make sure you have the relevant indexes configured for your collection: Firestore needs proper indexes for efficient queries. You can check the Firestore console to confirm that indexes are in place for the fields you query or filter on; without them, queries will be slower.

    Instead of counting all documents at once, paginate through the collection and count documents in smaller chunks. Here’s a general outline of how you might approach it:

    import firebase_admin
    from firebase_admin import credentials, firestore

    # Initialize the Firebase Admin SDK with a service account
    cred = credentials.Certificate('path-to-serviceAccountKey.json')
    firebase_admin.initialize_app(cred)

    # Firestore client from the Admin SDK
    db = firestore.client()

    # Your collection reference
    collection_ref = db.collection('my_collection')

    batch_size = 1000  # Adjust as needed
    total_count = 0
    last_doc = None  # Pagination cursor: last document of the previous batch

    while True:
        # Order by document ID so the cursor is stable across pages
        query = collection_ref.order_by('__name__').limit(batch_size)
        if last_doc is not None:
            query = query.start_after(last_doc)

        docs = list(query.stream())
        batch_count = len(docs)
        total_count += batch_count

        # Fewer documents than the batch size means we reached the end
        if batch_count < batch_size:
            break

        # Remember where this batch ended so the next one starts after it
        last_doc = docs[-1]

    print("Total Count:", total_count)
    
  2. If a single collection contains hundreds of billions of documents, then count() doesn’t seem to be the right solution: when the count() aggregation cannot return a result within 60 seconds, a DEADLINE_EXCEEDED error is thrown, and its performance always depends on the size of the collection. A possible workaround for such large collections is to use counters; for example, you can create and maintain your own counter, as explained in this resource (a minimal sketch is shown below).
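
    As a rough illustration only (the counters collection, its document name, and the helper functions below are hypothetical, not from the question), such a counter can be maintained with the Admin SDK by bumping a dedicated counter document atomically in the same batched write that creates or deletes a document, using the Increment field transform:

    from firebase_admin import firestore
    from google.cloud.firestore_v1 import Increment

    # Assumes firebase_admin.initialize_app(...) has already been called
    db = firestore.client()

    # Hypothetical counter document tracking the size of my_collection
    counter_ref = db.collection('counters').document('my_collection')

    def create_with_count(data):
        # Create a document and bump the counter in one atomic batch
        batch = db.batch()
        batch.set(db.collection('my_collection').document(), data)
        batch.set(counter_ref, {'count': Increment(1)}, merge=True)
        batch.commit()

    def delete_with_count(doc_ref):
        # Delete a document and decrement the counter in one atomic batch
        batch = db.batch()
        batch.delete(doc_ref)
        batch.set(counter_ref, {'count': Increment(-1)}, merge=True)
        batch.commit()

    # Reading the total is then a single document read, regardless of size
    snapshot = counter_ref.get()
    print('Total Count:', snapshot.get('count') if snapshot.exists else 0)

    Note that every write to the collection has to go through such helpers (or a trigger such as a Cloud Function) for the counter to stay accurate, and a single counter document is limited by Firestore’s per-document write rate, so heavily written collections usually shard the counter across several documents.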
