
I’m creating a small service where I poll around 100 accounts (in a Twitter-like service) frequently (every 5 seconds or so) to check for new messages, as the service doesn’t yet provide a streaming API (like Twitter actually does).

In my head, I have the architecture planned as a Ticker firing every 5 seconds for every user. Once the tick fires, I make an API call to the service to check their messages, run a SELECT against my Postgres database to get that user's details and the date of their most recent message, and if there are newer messages, UPDATE the entry and notify the user. Repeat ad nauseam.
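
Concretely, the per-user loop I have in mind is something like this sketch (fetchLatestMessages, lastSeenDate, saveLastSeen and notify are just placeholders for the API call, the SELECT, the UPDATE and the notification):

    package main

    import (
        "context"
        "log"
        "time"
    )

    // Message is a minimal stand-in for whatever the service's API returns.
    type Message struct {
        ID        string
        CreatedAt time.Time
    }

    // Placeholders for the real API call and the Postgres SELECT/UPDATE.
    func fetchLatestMessages(userID string) ([]Message, error) { return nil, nil }
    func lastSeenDate(userID string) (time.Time, error)        { return time.Time{}, nil }
    func saveLastSeen(userID string, t time.Time) error        { return nil }
    func notify(userID string, m Message)                      {}

    // pollUser runs the 5-second polling loop for a single user.
    func pollUser(ctx context.Context, userID string) {
        ticker := time.NewTicker(5 * time.Second)
        defer ticker.Stop()

        for {
            select {
            case <-ctx.Done():
                return
            case <-ticker.C:
                msgs, err := fetchLatestMessages(userID) // API call to the service
                if err != nil {
                    log.Printf("poll %s: %v", userID, err)
                    continue
                }
                last, err := lastSeenDate(userID) // SELECT the recorded date
                if err != nil {
                    log.Printf("select %s: %v", userID, err)
                    continue
                }
                for _, m := range msgs {
                    if m.CreatedAt.After(last) {
                        notify(userID, m)                     // tell the user
                        _ = saveLastSeen(userID, m.CreatedAt) // UPDATE the entry
                    }
                }
            }
        }
    }

    func main() {
        ctx := context.Background()
        for _, id := range []string{"alice", "bob"} { // ~100 user IDs in practice
            go pollUser(ctx, id)
        }
        select {} // block forever
    }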

I’m not very experienced in backend things and architecture, so I want to make sure this isn’t an absolutely absurd setup. Is the amount of calls to the database sensible? Am I abusing goroutines?

3 Answers

  1. Let me answer given what you describe.

    I want to make sure this isn’t an absolutely absurd setup.

    I understand the following. For each user, you create a tick every 5 seconds in one goroutine. Another goroutine consumes those ticks, performing the polling and comparing the date of the last message with the date you have recorded in your PostgreSQL database.

    The answer is: it depends. How many users do you have and how many can your application support? In my experience, the best way to answer this question is to measure your application's performance.

    Is the amount of calls to the database sensible?

    It depends. To give you some reassurance, I have seen a single PostgreSQL database handle hundreds of SELECTs per second. I don't see a design mistake, so benchmarking your application is the way to go.

    Am I abusing goroutines?

    Do you mean like executing too many of them? I think it is unlikely that you are abusing goroutines that way. If there is a particular reason you think this could be the case, posting the corresponding code snippet could make your question more precise.
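
    If you want a concrete starting point for that benchmarking, a standard Go benchmark around the SELECT is one option. This is only a sketch: the connection string, table and column names below are placeholders for your own schema and setup.

        package myservice

        import (
            "database/sql"
            "testing"
            "time"

            _ "github.com/lib/pq" // or whichever Postgres driver you already use
        )

        // BenchmarkLastSeenSelect times the SELECT that would run on every tick.
        func BenchmarkLastSeenSelect(b *testing.B) {
            db, err := sql.Open("postgres", "postgres://localhost/mydb?sslmode=disable")
            if err != nil {
                b.Fatal(err)
            }
            defer db.Close()

            b.ResetTimer()
            for i := 0; i < b.N; i++ {
                var last time.Time
                err := db.QueryRow(
                    `SELECT last_message_at FROM users WHERE id = $1`, "some-user-id",
                ).Scan(&last)
                if err != nil {
                    b.Fatal(err)
                }
            }
        }

    Put it in a _test.go file, run go test -bench LastSeenSelect, and compare the reported time per query against your 5-second budget.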

  2. • Is your architecture the most efficient way to go? No.
     • Should you do something about it now? No, you should test your solution.

    You can always go deeper with optimisations. In your case you need client throughput, so you can reach for a number of well-known optimisations: switching to a reactive model, adding a cache server, spreading the load across multiple DB replicas, and so on.

    You should test your solution at scale: if it fits your needs in terms of user throughput and server cost, then your solution is the right one.

  3. Your proposed solution: 1 query every 5 seconds for every user. With 100 users this is:

    1 * 100 / 5 seconds = 20 queries / second
    

    This is not considered a big load if the queries are fast.

    But why do you need to do this for every user separately? If you need to pick up updates at a granularity of 5 seconds, you could just execute 1 query every 5 seconds which does not filter by user but checks for updates from all users.

    If the above query gives results, you can iterate over them and do what's necessary for each user that had updates in the last 5 seconds. This results in:

    1 query / 5 seconds = 0.2 query / second
    

    That is a hundred times fewer queries, while still getting you all the updates at the same time granularity.
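
    As a rough sketch of that single combined query (assuming new messages end up in some messages table; the table and column names are only illustrative, adjust them to however you actually record updates):

        package myservice

        import (
            "context"
            "database/sql"
            "time"
        )

        // handleUpdate is a placeholder for whatever has to happen per updated user
        // (notify them, update your in-memory state, ...).
        func handleUpdate(userID string, latest time.Time) {}

        // checkAllUsers runs one query covering all users instead of one per user.
        func checkAllUsers(ctx context.Context, db *sql.DB) error {
            rows, err := db.QueryContext(ctx, `
                SELECT user_id, max(created_at)
                FROM   messages
                WHERE  created_at > now() - interval '5 seconds'
                GROUP  BY user_id`)
            if err != nil {
                return err
            }
            defer rows.Close()

            for rows.Next() {
                var userID string
                var latest time.Time
                if err := rows.Scan(&userID, &latest); err != nil {
                    return err
                }
                handleUpdate(userID, latest)
            }
            return rows.Err()
        }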

    If the task to be performed for the updates is long or depends on external systems (e.g. a call to another server), you may perform those tasks in separate goroutines. You may either launch a new goroutine for each task, or keep a pool of worker goroutines that consume queued tasks and simply queue each task onto a channel.
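
    A minimal sketch of the worker-pool variant, where a fixed number of workers consume tasks from a channel (Task and doTask are placeholders for whatever one update actually requires):

        package myservice

        // Task describes one unit of work triggered by an update for a user.
        type Task struct {
            UserID string
            // ... whatever else the handler needs
        }

        // doTask is a placeholder for the per-update work (API calls, notifications, ...).
        func doTask(t Task) {}

        // startWorkers launches n worker goroutines that all consume from the same channel.
        func startWorkers(n int, tasks <-chan Task) {
            for i := 0; i < n; i++ {
                go func() {
                    for t := range tasks {
                        doTask(t)
                    }
                }()
            }
        }

    Queuing work is then just tasks <- Task{UserID: id}; closing the channel lets the workers exit once they have drained it.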
