A common way to keep clients in sync with a server in real time is to open a WebSocket or SSE connection and push all updates through it. This is very efficient, but it also requires us to set up a server to handle all those persistent connections and to communicate with the rest of our infrastructure.
While looking into video streaming solutions, I learned that the current approach there is to publish your data as static files, let clients request whatever they need whenever they need it, and let highly optimized servers like nginx do the rest.
So I started wondering whether this could also be the way to go for message communication. Just put all the data you want your clients to keep fresh and in sync into static files and set up nginx to serve them. Taking advantage of things like HTTP/2, memcached, Last-Modified headers and request limiting would reduce the overhead from clients polling the same files over and over again to an absolute minimum. Not only could we get away without maintaining an additional communication protocol, we could also avoid invoking our backend code at all.
Am I missing something here?
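To make the idea concrete, here is a minimal sketch of what one client poll could look like. It assumes the server attaches an `ETag` header to the static file and answers `304 Not Modified` to a matching `If-None-Match` request; the function name `poll_once` is hypothetical and not part of any existing setup.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def poll_once(url: str, etag: Optional[str] = None) -> Tuple[Optional[bytes], Optional[str]]:
    """Fetch `url` only if it changed since the ETag we last saw.

    Returns (body, etag). `body` is None when the server replies
    304 Not Modified, meaning our cached copy is still current.
    """
    request = urllib.request.Request(url)
    if etag is not None:
        # Ask the server to skip the body if the file is unchanged.
        request.add_header("If-None-Match", etag)
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        # urllib reports 304 as an HTTPError; treat it as "no change".
        if error.code == 304:
            error.close()
            return None, etag
        raise
```

A client would call this in a loop with a sleep between iterations, passing the last returned ETag back in. Note that even the "nothing changed" case still costs a full request and response exchange.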
2 Answers
IMHO this would be a step backwards rather than forwards. You can have a look at the many discussions on SO, such as this discussion, this one and this one.
In this SO thread there’s a good discussion about this question and you will find some of the additional costs related to your approach.
In short, polling (even with optimizations such as your suggested static file service, HTTP/2, memcached, etc.) will always consume more resources than push techniques such as WebSockets.
For example, header parsing, cache validation, authentication (where required), etc. are repeated for every poll request and can easily be avoided by pushing the data.
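A rough back-of-envelope comparison can illustrate the point. Every number below is an assumption chosen for illustration, not a measurement: the per-request header sizes, the 2-byte frame overhead for a small server-to-client WebSocket message, and the client count. TCP/TLS overhead and keepalive traffic are ignored on both sides, and HTTP/2 header compression would shrink the polling figure.

```python
def idle_polling_bytes_per_second(
    clients: int,
    poll_interval_s: float,
    request_overhead: int,
    not_modified_overhead: int,
) -> float:
    """Bytes/s spent on polls that only learn 'nothing changed' (304)."""
    return clients * (request_overhead + not_modified_overhead) / poll_interval_s


def push_bytes_per_second(
    clients: int,
    update_interval_s: float,
    frame_overhead: int,
    payload: int,
) -> float:
    """Bytes/s spent pushing each real update once to every client."""
    return clients * (frame_overhead + payload) / update_interval_s


# Assumed scenario: 10,000 clients, one real update every 10 seconds,
# clients polling once per second to keep latency near one second.
polling = idle_polling_bytes_per_second(
    clients=10_000, poll_interval_s=1.0,
    request_overhead=200, not_modified_overhead=150,
)
push = push_bytes_per_second(
    clients=10_000, update_interval_s=10.0,
    frame_overhead=2, payload=100,
)
print(f"polling (idle overhead only): {polling:,.0f} B/s")
print(f"push (overhead + payload):    {push:,.0f} B/s")
```

Under these assumptions the idle polling traffic alone is about 3.5 MB/s, while push delivers the actual data for roughly 0.1 MB/s. The gap narrows as payloads grow or as polling becomes less frequent.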
There’s actually quite a bit of overhead with this method. It isn’t ideal. The only reason people do this is to re-use existing HTTP file/blob-based CDNs for video streaming.
The latency is high, as segments have to be written out and uploaded. Even if you stream the segments coming in to clients, you have the overhead of having the manifest. Even if you do away with the manifest, you have the overhead of having a client requesting segments. Even if you use push with HTTP/2, all of this complexity still exists.
Simply put, DASH and HLS are hacks that are designed to solve a specific need. The only reason they’re viable at all is that the payloads are relatively large.
I’m assuming your messages are much smaller than video data. It’s probably not worth the overhead.
There’s still significant overhead. Ideally, you would use HTTP/2 and push the resource, but this again requires a specialized server.
Correct.
At the end of this, you need to consider the trade-offs you’re making. Some things to consider:
If you’re polling infrequently, or for larger data updates, the overhead is probably fine.