Redis Secondary Indexes and Performance Question

Westy
August 20, 2020
152 views
0 votes
2 Answers

I know that Redis doesn’t really have the concept of secondary indexes, but that you can use the Z* commands to simulate one. I have a question about the best way to handle the following scenario.

We are using Redis to keep track of orders. But we also want to be able to find those orders by phone number or email ID. So here is our data:

> set 123 7245551212:[email protected]
> set 456 7245551212:[email protected]
> set 789 7245559999:[email protected]
> zadd phone-index 0 7245551212:123:[email protected]
> zadd phone-index 0 7245551212:456:[email protected]
> zadd phone-index 0 7245559999:789:[email protected]

I can see all the orders for a phone number via the following (is there a better way to get the range other than adding a ‘Z’ to the end?):

> zrangebylex phone-index [7245551212 (7245551212Z
1) "7245551212:123:[email protected]"
2) "7245551212:456:[email protected]"

My question is, is this going to perform well? Or should we just create a list that is keyed by phone number, and add an order ID to that list instead?

> rpush phone:7245551212 123
> rpush phone:7245551212 456
> rpush phone:7245559999 789
> lrange phone:7245551212 0 -1
1) "123"
2) "456"

Which would be the preferred method, especially related to performance?

Tags: redis secondary-indexes

Answers

- LeoMurillo
- August 20, 2020 at 10:46 pm
- 0 votes
0
RE: is there a better way to get the range other than adding a ‘Z’ to the end?
Yes, use the next immediate character instead of adding Z:
```
zrangebylex phone-index [7245551212 (7245551213
```
But certainly the second approach offers better performance.

Using a sorted set for lexicographical indexing, you need to consider that:
- The addition of elements, ZADD, is O(log(N))
- The query, ZRANGEBYLEX, is O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements being returned
In contrast, using lists:
- The addition, RPUSH, is O(1)
- The query, LRANGE, is O(N) as you are starting in zero.
You can also use sets (SADD and SMEMBERS), the difference is lists allows duplicates and preserves order, sets ensure uniqueness and doesn’t respect insertion order.
Login or Signup to reply.

- Gawain
- August 22, 2020 at 2:07 pm
- 0 votes
0
ZSet use skiplist for score and dict for hashset. And if you add all elements with same score, skiplist will be turned to B-TREE like structure, which have a O(logN) time complexity for lexicographical order search.

So if you don’t always perform range query for phone number, you should use list for orders which phone number as key for precise query. Also this will work for email(you can use hash to combine these 2 list). In this way performance for query will be much better than ZSET.

Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.