The KEYS command returns some results:

> keys Types/*/*BackgroundJob.json
1) "Types/Xyz.Data/Xyz.Data.BackgroundJobEngine.BackgroundJob.json"
2) "Types/Xyz.Web.SystemAdmin/Xyz.Web.SystemAdmin.Models.Encryption.EncryptionMethodByBackgroundJob.json"
3) "Types/BackgroundJobs/SharpTop.Engine.BackgroundJobs.AutofillBackgroundJob.json"
4) "Types/Quartz.Server/BJE.UDT.BackgroundJob.json"
5) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitCompensationPublishBackgroundJob.json"
6) "Types/SpecFlowTest.Architecture.Base/SpecFlowTest.Architecture.Base.Model.IntStudioConfigBackgroundJob.json"
7) "Types/SpecFlowTest.Benefits.UI/SpecFlowTest.Benefits.UI.Base.Services.BackgroundJobsService+BackgroundJob.json"
8) "Types/Xyz.WFM.ExpressionService.Client/Xyz.WFM.ExpressionService.Client.BackgroundJob.ExpressionManagerBackgroundJob.json"
9) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitGenerateBudgetWorksheetBackgroundJob.json"
10) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitCompensationUnPublishBackgroundJob.json"
11) "Types/IntegrationStudio/IntegrationStudio.DAL.Entities.IntStudioConfigBackgroundJob.json"
12) "Types/IntegrationStudio/IntegrationStudio.DAL.Entities.BackgroundJob.json"

But SCAN with the same pattern returns nothing:

> scan 0 match Types/*/*BackgroundJob.json
1) "1966080"
2) (empty list or set)

I followed the returned cursor by hand for several iterations, but without scripting the loop to run it to the end, it looks like an endless series of empty results.

What is going on?

Edit 1

I finally decided to code it:

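using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using StackExchange.Redis;

// `connection` is assumed to be an already-connected ConnectionMultiplexer field.
private async IAsyncEnumerable<string> QueryRedisAsync(string pattern, [EnumeratorCancellation] CancellationToken ct = default)
{
    var db = connection.GetDatabase();
    var cursor = "0";
    int count = 0; // number of SCAN round trips
    do
    {
        ++count;
        ct.ThrowIfCancellationRequested();

        // One SCAN call: the reply is [next cursor, array of matching keys].
        var tmp = await db.ExecuteAsync("SCAN", cursor, "MATCH", pattern, "COUNT", "1000");
        var scanResult = (RedisResult[])tmp;
        cursor = scanResult[0].ToString();
        var keys = (RedisKey[])scanResult[1];

        foreach (var key in keys)
        {
            yield return key.ToString();
        }
    }
    while (cursor != "0"); // a returned cursor of "0" marks the end of the iteration
    Console.WriteLine(count);
}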

The code performed 1058 (!) iterations, and exactly one match was found on each of the following iterations:

  1. 173
  2. 189
  3. 242
  4. 351
  5. 416
  6. 473
  7. 590
  8. 912
  9. 975
  10. 983
  11. 998
  12. 1027

So, I used SCAN in order to be "nice" and it caused 1058 round trips to the server.

Am I doing something wrong?

Possible duplicate

I don’t think this is a duplicate of "redis scan returns empty results but nonzero cursor". It does not seem reasonable to make 1K+ round trips to the server to get just a few results.

2 Answers


  1. The KEYS command behaves totally differently from the SCAN command.

    The KEYS command iterates over all keys in Redis and returns those that match your pattern. That’s why a single round trip gives you the answer. However, while KEYS is running, Redis blocks and cannot process any other command. So it’s a bad idea to use KEYS in a production environment, especially when you have a large dataset.

    The SCAN command also iterates over the keys in Redis. However, each SCAN call checks only a few keys (you can use the COUNT parameter to control roughly how many), keeps the ones matching your pattern, and returns. So you need multiple round trips to iterate over all keys in Redis. Since each call checks only a few keys, it won’t block Redis for long. That’s the recommended way to scan the keyspace.

    Regarding "The code performed 1058 (!) iterations where exactly one match was found on some iteration":

    That is because you have a large dataset and only a small proportion of the keys match your pattern. Only 12 of the 1058 scans found a matching key; all the others returned an empty result.

    Regarding "So, I used SCAN in order to be 'nice' and it caused 1058 round trips to the server. Am I doing something wrong?":

    Yes, SCAN is nicer than KEYS, especially when you need to scan all keys in Redis (no pattern specified, or a large portion of the keys match the pattern).

    However, in your case, a better solution is to create a secondary index for the keys matching the pattern. For example, you can store the names of these keys in a Redis SET and scan that SET to get them.
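
    The secondary-index idea could look roughly like the sketch below, assuming the StackExchange.Redis client that the question's code appears to use. The class name `BackgroundJobIndex` and the index key `Index:BackgroundJobTypes` are hypothetical and do not come from the question; keys that already exist would also need a one-time backfill into the SET.

    ```csharp
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using StackExchange.Redis;

    public sealed class BackgroundJobIndex
    {
        // Hypothetical name for the index SET; not part of the original question.
        private const string IndexKey = "Index:BackgroundJobTypes";

        private readonly IDatabase db;

        public BackgroundJobIndex(IConnectionMultiplexer connection) => db = connection.GetDatabase();

        // Store the value and record its key name in the index SET.
        // Both commands are queued in one MULTI/EXEC transaction so they are applied together.
        public async Task SaveAsync(string key, string json)
        {
            var tx = db.CreateTransaction();
            _ = tx.StringSetAsync(key, json);
            _ = tx.SetAddAsync(IndexKey, key);
            await tx.ExecuteAsync();
        }

        // Delete the value and drop its key name from the index SET.
        public async Task DeleteAsync(string key)
        {
            var tx = db.CreateTransaction();
            _ = tx.KeyDeleteAsync(key);
            _ = tx.SetRemoveAsync(IndexKey, key);
            await tx.ExecuteAsync();
        }

        // SSCAN over the index visits only the indexed members, not the whole keyspace.
        public async IAsyncEnumerable<string> ListAsync()
        {
            await foreach (var member in db.SetScanAsync(IndexKey, pageSize: 1000))
            {
                yield return member.ToString();
            }
        }
    }
    ```

    With only a dozen members, reading the index takes a single round trip; for a SET that small, `SetMembersAsync` (SMEMBERS) would probably do just as well as scanning it.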

  2. Compared to the number of keys you have in your database, the performance of your SCAN is as expected. COUNT does not set an exact number of keys to return; it gives the server a hint of how many “steps” to take in each iteration. So 1058 × 1000 ≈ 1058186, which makes sense.

    Your problem is that you use one huge server, so finding the keys takes a lot of iterations. If you use KEYS instead, it performs all of those iterations in a single call and blocks the server completely until it finishes, possibly causing crashes and errors for many reasons.

    In your case, depending on whether the client you use implements this correctly, if you use a cluster with a few small shards instead of one big standalone server, the client can direct the request only to the shards holding slots that can match your pattern, saving a large amount of effort.

    If the client you use doesn’t implement that, or you don’t want to run a cluster, the answer from for_stack is accurate.
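
    The COUNT arithmetic earlier in this answer can be sketched as a small calculation. It treats 1058186 as the total key count and assumes each SCAN call examines about COUNT keys; since COUNT is only a hint, the real number of calls will differ somewhat. The helper name `ScanEstimate.RoundTrips` is made up for illustration.

    ```csharp
    using System;

    public static class ScanEstimate
    {
        // Approximate number of SCAN calls for a full pass:
        // keyCount / count, rounded up. COUNT is a hint, so this is only an estimate.
        public static long RoundTrips(long keyCount, int count) =>
            (keyCount + count - 1) / count;

        public static void Main()
        {
            const long keyCount = 1_058_186; // figure quoted in this answer, taken as the key count

            // 10 is the Redis default for COUNT; 1000 is the value used in the question.
            foreach (var count in new[] { 10, 1_000, 10_000 })
            {
                Console.WriteLine($"COUNT {count}: ~{RoundTrips(keyCount, count)} round trips");
            }
        }
    }
    ```

    For COUNT 1000 this gives about 1059 calls, in line with the 1058 iterations observed. A larger COUNT means fewer round trips, but each call keeps the server busy for longer, so it is a trade-off rather than a free win.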
