The KEYS command returns some results:
> keys Types/*/*BackgroundJob.json
1) "Types/Xyz.Data/Xyz.Data.BackgroundJobEngine.BackgroundJob.json"
2) "Types/Xyz.Web.SystemAdmin/Xyz.Web.SystemAdmin.Models.Encryption.EncryptionMethodByBackgroundJob.json"
3) "Types/BackgroundJobs/SharpTop.Engine.BackgroundJobs.AutofillBackgroundJob.json"
4) "Types/Quartz.Server/BJE.UDT.BackgroundJob.json"
5) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitCompensationPublishBackgroundJob.json"
6) "Types/SpecFlowTest.Architecture.Base/SpecFlowTest.Architecture.Base.Model.IntStudioConfigBackgroundJob.json"
7) "Types/SpecFlowTest.Benefits.UI/SpecFlowTest.Benefits.UI.Base.Services.BackgroundJobsService+BackgroundJob.json"
8) "Types/Xyz.WFM.ExpressionService.Client/Xyz.WFM.ExpressionService.Client.BackgroundJob.ExpressionManagerBackgroundJob.json"
9) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitGenerateBudgetWorksheetBackgroundJob.json"
10) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitCompensationUnPublishBackgroundJob.json"
11) "Types/IntegrationStudio/IntegrationStudio.DAL.Entities.IntStudioConfigBackgroundJob.json"
12) "Types/IntegrationStudio/IntegrationStudio.DAL.Entities.BackgroundJob.json"
But the SCAN using the same pattern returns none:
> scan 0 match Types/*/*BackgroundJob.json
1) "1966080"
2) (empty list or set)
I tried following the returned cursor value for several iterations, but without scripting the loop to run to completion, it looks like an endless series of empty results.
What is going on?
Edit 1
I finally decided to code it:
private async IAsyncEnumerable<string> QueryRedisAsync(string pattern, [EnumeratorCancellation] CancellationToken ct = default)
{
    var db = connection.GetDatabase();
    var cursor = "0";
    int count = 0;
    do
    {
        ++count;
        ct.ThrowIfCancellationRequested();
        var tmp = await db.ExecuteAsync("SCAN", cursor, "MATCH", pattern, "COUNT", "1000");
        var scanResult = (RedisResult[])tmp;
        cursor = scanResult[0].ToString();
        var keys = (RedisKey[])scanResult[1];
        foreach (var key in keys)
        {
            yield return key.ToString();
        }
    }
    while (cursor != "0");
    Console.WriteLine(count);
}
The code performed 1058 (!) iterations, and exactly one match was found on each of the following iterations:
- 173
- 189
- 242
- 351
- 416
- 473
- 590
- 912
- 975
- 983
- 998
- 1027
So, I used SCAN in order to be "nice" and it caused 1058 round trips to the server.
Am I doing something wrong?
Possible duplicate
I don’t think this is a duplicate of "redis scan returns empty results but nonzero cursor". It does not seem reasonable to make 1K+ round trips to the server to get just a few results.
2 Answers
The KEYS command behaves totally differently from the SCAN command.
The KEYS command iterates over all keys in Redis and filters the keys matching your given pattern. That’s why a single round trip gives you the answer. However, while the KEYS command runs, Redis blocks and cannot process other commands. So it’s a bad idea to use the KEYS command in a production environment, especially when you have a large dataset.
The SCAN command also iterates over the keys in Redis. However, each scan checks only a few keys (you can use the count parameter to control the number of keys), filters the keys matching your pattern, and returns. So you need multiple round trips to iterate over all keys in Redis. Since each scan operation checks only a few keys, it won’t block Redis for a long time. That’s the recommended way to scan the key space.
You have a large dataset, and only a small proportion of the keys match your pattern. That is why most of your 1058 scans return no matching key.
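As a sketch of how the count parameter can be raised from StackExchange.Redis, the snippet below uses IServer.KeysAsync, which pages through the keyspace and sends pageSize as the COUNT hint. The class name KeyScanner, the method name ScanKeysAsync, and the default page size of 10,000 are illustrative assumptions, not values taken from the question.

```csharp
using System.Collections.Generic;
using System.Net;
using StackExchange.Redis;

public static class KeyScanner
{
    // Enumerates keys matching `pattern` on every connected primary endpoint.
    // pageSize is passed to the server as the COUNT hint; 10_000 is an assumed
    // value chosen only for illustration.
    public static async IAsyncEnumerable<string> ScanKeysAsync(
        IConnectionMultiplexer connection, string pattern, int pageSize = 10_000)
    {
        foreach (EndPoint endpoint in connection.GetEndPoints())
        {
            IServer server = connection.GetServer(endpoint);

            // Skip replicas so the same key is not reported twice.
            if (!server.IsConnected || server.IsReplica) continue;

            await foreach (RedisKey key in server.KeysAsync(pattern: pattern, pageSize: pageSize))
            {
                yield return key.ToString();
            }
        }
    }
}
```

A larger page size means fewer round trips, but each call keeps the server busy a little longer, so the value is a trade-off rather than a free win.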
Yes, SCAN is nicer than KEYS, especially when you need to scan all keys in Redis (no pattern specified, or a large portion of keys match the pattern).
However, in your case, a better solution is to create a secondary index for the keys matching the pattern. For example, you can save these keys in a Redis SET, and scan the SET to get the keys.
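A minimal sketch of such a secondary index with StackExchange.Redis might look like the following. The class name BackgroundJobIndex and the SET name index:BackgroundJobTypes are hypothetical, and the sketch assumes the application controls every write to the indexed keys.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using StackExchange.Redis;

public sealed class BackgroundJobIndex
{
    // Hypothetical name of the SET that holds the names of all matching keys.
    private const string IndexKey = "index:BackgroundJobTypes";
    private readonly IDatabase _db;

    public BackgroundJobIndex(IConnectionMultiplexer connection) => _db = connection.GetDatabase();

    // Writes the value and records its key in the index SET in one MULTI/EXEC transaction.
    public async Task<bool> SaveAsync(string key, string json)
    {
        ITransaction tran = _db.CreateTransaction();
        // Queued commands complete only after ExecuteAsync, so they are not awaited here.
        _ = tran.StringSetAsync(key, json);
        _ = tran.SetAddAsync(IndexKey, key);
        return await tran.ExecuteAsync();
    }

    // Iterates the index with SSCAN instead of walking the whole keyspace.
    public async IAsyncEnumerable<string> ListKeysAsync()
    {
        await foreach (RedisValue member in _db.SetScanAsync(IndexKey))
        {
            yield return member.ToString();
        }
    }
}
```

With only a dozen members, reading the index takes a single round trip. The cost is that the index has to be maintained: a key that is deleted or expires must also be removed from the SET.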
Compared to the number of keys you have in your db, the performance of your SCAN is as expected. COUNT doesn’t give a precise number of keys; it gives the server a hint of how many “steps” to take on each iteration. So 1058 iterations × 1000 steps ≈ 1058186 keys, which makes sense.
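The same arithmetic can be turned around to estimate how many round trips a given COUNT would need. This is only a rough sketch: it takes the 1058186 figure above as the keyspace size and treats COUNT as exact, although the server treats it as a hint. The names ScanEstimate and EstimateRoundTrips are made up for illustration.

```csharp
using System;

public static class ScanEstimate
{
    // Rough number of SCAN calls needed to walk `totalKeys` keys when each
    // call is hinted to examine about `count` keys (ceiling division).
    public static long EstimateRoundTrips(long totalKeys, int count)
    {
        if (totalKeys < 0) throw new ArgumentOutOfRangeException(nameof(totalKeys));
        if (count <= 0) throw new ArgumentOutOfRangeException(nameof(count));
        return (totalKeys + count - 1) / count;
    }
}

// ScanEstimate.EstimateRoundTrips(1_058_186, 1_000)   returns 1059
// ScanEstimate.EstimateRoundTrips(1_058_186, 10_000)  returns 106
// ScanEstimate.EstimateRoundTrips(1_058_186, 100_000) returns 11
```

The estimate for COUNT 1000 lands within one call of the 1058 iterations observed in the question. Raising COUNT cuts the number of round trips proportionally, at the price of a longer pause on the server for each call.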
Your problem is that you use one huge server, so finding the keys takes a lot of iterations. But if you use KEYS, it will do all those iterations in one call and will completely block the server until it finishes, possibly causing crashes and errors for many reasons.
In your case, depending on whether the client you use implements it correctly, if you use a cluster with a few small shards instead of one big standalone server, the client can direct the request to only the shards holding slots that can fit your pattern, saving a large amount of effort.
If the client you use doesn’t implement that, or you don’t want to have a cluster, the answer from for_stack is accurate.