
I am thinking about how to model a collection where each document is a building with a geolocation. I know I should use Geohashes, but what worries me is that each query would read dozens (if not hundreds) of documents.
Would using a single document as a cluster of dozens of buildings be a bad idea? Is there a better solution to this problem?
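For concreteness, here is a rough TypeScript sketch of the two models I have in mind (all field and collection names are only illustrative):

```ts
// Option A: one document per building, found via a geohash field.
interface BuildingDoc {
  name: string;
  lat: number;
  lng: number;
  geohash: string; // computed from lat/lng, e.g. with a geohash library
}

// Option B: one document per area, clustering dozens of buildings.
// The document ID would be a shared geohash prefix.
interface BuildingClusterDoc {
  buildings: {
    [buildingId: string]: { name: string; lat: number; lng: number };
  };
}
```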

2 Answers


  1. I know I should use Geohashes, but what worries me is that each query would read dozens (if not hundreds) of documents.

    According to the official documentation:

    Firestore is optimized for storing large collections of small documents.

    If you think that you’ll have a large number of document reads, then you should consider the Realtime Database, which has a different billing mechanism: it charges for bandwidth and storage rather than per document read.

    Would using a single document as a cluster of dozens of buildings be a bad idea?

    No, as long as you stay below the 1 MiB maximum document size.

    Is there a better solution for this problem?

    There is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. In your case, you have to do some maths in order to see which solution out of the above three fits best your needs.

  2. At some point I experimented with storing multiple geohash values in a single document, with the document ID being the common prefix of those hashes. For each point, that document keeps the exact lat/lon and the ID of the document that contains the point.

    Doing this for all documents, you end up with a number of additional documents (I called them geo-index documents) that hold this extra metadata.
    Here’s an example of such a document:
    [screenshot of a geo-index document whose ID is the common prefix cu]

    So here, the common prefix is cu, meaning that the document contains the metadata for all geohashes starting with cu.
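    A minimal sketch of what such a geo-index document could look like, assuming it lives in a geoIndex collection and stores its points in an entries map (all names and values are placeholders, not real geohash encodings):

    ```ts
    // Hypothetical shape of a geo-index document stored at geoIndex/cu,
    // where "cu" is the shared geohash prefix.
    interface GeoIndexDoc {
      entries: {
        [buildingId: string]: { geohash: string; lat: number; lng: number };
      };
    }

    const exampleCu: GeoIndexDoc = {
      entries: {
        'building-1': { geohash: 'cu2example1', lat: 61.2, lng: -120.5 },
        'building-2': { geohash: 'cu9example2', lat: 63.8, lng: -118.1 },
      },
    };
    ```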


    To execute a geoquery, I’d then calculate the geohash ranges that could contain matching documents, read the geo-index documents for those ranges, and qualify/disqualify the actual documents based on the lat/lon stored in the geo-index documents.
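    Roughly, that flow could look like the sketch below, assuming the geo-index documents above live in a geoIndex collection keyed by a 2-character prefix, and using the geofire-common helpers for the range maths:

    ```ts
    import { getFirestore, doc, getDoc } from 'firebase/firestore';
    import { geohashQueryBounds, distanceBetween } from 'geofire-common';

    const PREFIX_LENGTH = 2; // length of the shared prefix used as document ID

    async function nearbyBuildingIds(center: [number, number], radiusInM: number) {
      const db = getFirestore();

      // 1. Calculate the geohash ranges that could contain matches, then reduce
      //    them to the distinct geo-index prefixes they touch.
      //    (Simplified: assumes each range bound falls inside a single prefix cell.)
      const prefixes = new Set<string>();
      for (const [start, end] of geohashQueryBounds(center, radiusInM)) {
        prefixes.add(start.substring(0, PREFIX_LENGTH));
        prefixes.add(end.substring(0, PREFIX_LENGTH));
      }

      // 2. Read only the geo-index documents for those prefixes.
      const matches: string[] = [];
      for (const prefix of prefixes) {
        const snap = await getDoc(doc(db, 'geoIndex', prefix));
        if (!snap.exists()) continue;

        // 3. Qualify/disqualify each point using its exact lat/lng.
        const entries = snap.data().entries as Record<string, { lat: number; lng: number }>;
        for (const [buildingId, point] of Object.entries(entries)) {
          const distanceKm = distanceBetween([point.lat, point.lng], center);
          if (distanceKm * 1000 <= radiusInM) {
            matches.push(buildingId);
          }
        }
      }
      return matches;
    }
    ```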

    With this approach I was able to drastically reduce the number of documents that had to be read to determine the matches. On the other hand, it required me to create new data structures, essentially building my own index type for the database.

    You’ll have to determine whether that trade-off of complexity vs cost is worth it for your use-case and other requirements.
