
I want to know how many people visited each blog page. For that, I have a column in the Blogs table (MS SQL database) that keeps the total visit count. But I also want the visits to be as unique as possible.
So I keep the user's unique ID and the blog ID in the Redis cache, and every time a user visits a page, I check whether she has visited it before; if not, I increase the total visit count.

My question is, what is the best way of storing such data?
Currently, I create a key like "project-visit-{blogId}-{userId}" and use StringSetAsync and StringGetAsync. But I don't know whether this method is efficient.

Any ideas?

3 Answers


  1. Your solution is not atomic unless you wrap the get and set operations in a transaction or a Lua script.

    A better solution is to save the visitors in a Redis set, keyed per blog, e.g. project-visit-{blogId}. When you get a visit, call SADD to add the user to the set. Redis adds the item only if the user has not visited this page before, and SADD's return value tells you whether it was a first visit. If you want the total count, just call SCARD to get the size of the set.
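    To make the mechanics concrete, here is a minimal Python sketch of the set-based approach. The FakeSetStore class is an in-memory stand-in that mimics Redis's SADD/SCARD semantics so the sketch runs without a server (with redis-py or StackExchange.Redis you would issue the same commands against a real instance); the key format blog-visitors:{blogId} is just an assumption for illustration.

```python
class FakeSetStore:
    """In-memory stand-in mimicking Redis SADD/SCARD semantics."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        # Like Redis SADD: returns 1 if the member was newly added,
        # 0 if it was already in the set.
        s = self._sets.setdefault(key, set())
        if member in s:
            return 0
        s.add(member)
        return 1

    def scard(self, key):
        # Like Redis SCARD: size of the set (0 if the key is missing).
        return len(self._sets.get(key, set()))


def record_visit(client, blog_id, user_id):
    """SADD is a single atomic command; returns True only on a first visit."""
    return client.sadd(f"blog-visitors:{blog_id}", user_id) == 1


def unique_visits(client, blog_id):
    return client.scard(f"blog-visitors:{blog_id}")


store = FakeSetStore()
first_time = record_visit(store, 7, "alice")   # True: first visit
repeat = record_visit(store, 7, "alice")       # False: already counted
record_visit(store, 7, "bob")
```

    Because the check and the insert happen in one SADD command, there is no get-then-set race to guard against.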

  2. Regardless of the back-end technology (programming language, etc.), you can use a Redis stream. Streams are a feature introduced in Redis 5 that lets you define publishers and subscribers on a topic (stream) created in Redis. Then, on each user visit, you commit a new record (asynchronously, of course) to this stream. You can hold whatever info you want in that record (user IP, ID, etc.).

    Defining a key for each unique visit is not a good idea, because:

    • It makes life harder for Redis's garbage collection
    • For this use case, its performance is not comparable to a stream's, especially if you use that Redis instance for other purposes as well
    • Constantly collecting and processing these unique visits is inefficient: you always have to scan through all the keys

    Conclusion:
    If you want to use Redis, go with a Redis stream. If Redis can be swapped out, go with Kafka for sure (or a similar technology).
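    A rough sketch of the stream idea, under stated assumptions: FakeStream below is an in-memory stand-in for a real Redis stream (with redis-py, the corresponding commands would be XADD to append and XRANGE to read back), and the field names user_id / blog_id are made up for illustration.

```python
class FakeStream:
    """Append-only log mimicking a Redis stream's XADD/XRANGE behavior."""

    def __init__(self):
        self._entries = []

    def xadd(self, fields):
        # Like XADD: append an entry and return its generated ID.
        entry_id = f"{len(self._entries) + 1}-0"
        self._entries.append((entry_id, dict(fields)))
        return entry_id

    def xrange(self):
        # Like XRANGE - +: return all entries in insertion order.
        return list(self._entries)


def unique_visitors(stream):
    """Consumer side: fold raw visit events into a unique-visitor count."""
    return len({fields["user_id"] for _, fields in stream.xrange()})


visits = FakeStream()
for user in ["alice", "bob", "alice"]:
    visits.xadd({"user_id": user, "blog_id": "7"})
```

    The producer just appends cheap, fire-and-forget records; deduplication is deferred to a consumer that processes the stream in batches.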

  3. If you can sacrifice some precision, the HyperLogLog (HLL) probabilistic data structure is a great solution for counting unique visits because:

    • It uses only 12 KB of memory, and that footprint is fixed – it doesn't grow with the number of unique visits
    • You don’t need to store user data, which makes your service more privacy-oriented

    The HyperLogLog algorithm is really smart, but you don't need to understand its inner workings to use it; Redis added it as a data structure some years ago. So all you, as a user, need to know is that with HyperLogLogs you can count unique elements (visits) in a fixed memory space of 12 KB, with a 0.81% standard error.

    Let's say you want to keep a count of unique visits per day; you would need one HyperLogLog per day, named something like cnt:page-name:20200917, and every time a user visits a page you would add them to the HLL:

    > PFADD cnt:page-name:20200917 {userID}
    

    If you add the same user multiple times, they will still only be counted once.
    To get the count you run:

    > PFCOUNT cnt:page-name:20200917
    

    You can change the granularity of unique users by having different HLLs for different time intervals, for example cnt:page-name:202009 for the month of September, 2020.
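    To see why a fixed block of registers can estimate a count, here is a toy HyperLogLog in Python. This is a simplification for illustration only – Redis's real PFADD/PFCOUNT use sparse/dense encodings and extra bias corrections – but it shows the core mechanics: 2^14 registers (matching the ~12 KB figure above), each tracking the longest run of leading zero bits seen, plus the standard linear-counting correction for small cardinalities.

```python
import hashlib
import math


class ToyHyperLogLog:
    """Simplified HyperLogLog; illustrative, not Redis's implementation."""

    P = 14            # 2**14 = 16384 registers, ~12 KB at 6 bits each
    M = 1 << P

    def __init__(self):
        self.registers = [0] * self.M

    def add(self, item):
        # Analogous to PFADD: hash the item to 64 bits.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h & (self.M - 1)         # low P bits pick a register
        w = h >> self.P                # remaining 50 bits
        # rank = 1-based position of the first 1-bit in w
        rank = (50 - w.bit_length()) + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Analogous to PFCOUNT: harmonic-mean estimate over registers.
        alpha = 0.7213 / (1 + 1.079 / self.M)
        raw = alpha * self.M * self.M / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.M and zeros:
            # Small-range (linear counting) correction.
            return round(self.M * math.log(self.M / zeros))
        return round(raw)


hll = ToyHyperLogLog()
for user in ["alice", "bob", "alice"]:
    hll.add(user)      # duplicates leave the registers unchanged
```

    Adding an element can only raise a register's maximum, so re-adding a seen element never changes the state – which is exactly why duplicates don't inflate the count.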

    This quick explainer lays it out pretty well: https://www.youtube.com/watch?v=UAL2dxl1fsE

    This blog post might help too: https://redislabs.com/redis-best-practices/counting/hyperloglog/

    And if you’re curious about the internal implementation Antirez’s release post is a great read: http://antirez.com/news/75

    Note that with this solution you lose the information of which users visited the page; you only have the count.
