
In my colleagues' code I've noticed something like this:

private Task FetchAllKeysFromRedis(List<string> keys, ConcurrentBag<RedisResult> resultsBag, CancellationToken cancellationToken)
{
    var parallelism = Environment.ProcessorCount;
    // The semaphore caps how many fetches can be in flight at once.
    var semaphore = new SemaphoreSlim(initialCount: parallelism, maxCount: parallelism);
    var tasks = new List<Task>();
    foreach (var key in keys)
    {
        cancellationToken.ThrowIfCancellationRequested();
        tasks.Add(FetchFromRedis(key, semaphore, resultsBag, cancellationToken));
    }

    return Task.WhenAll(tasks);
}

The keys list contains approximately 1,000 keys, and the FetchFromRedis method asynchronously executes an I/O operation (a fetch from Redis), so in total around 1,000 I/O operations are executed.

The critical section looks like this:

private async Task FetchFromRedis(string key, SemaphoreSlim semaphore, ConcurrentBag<RedisResult> resultsBag, CancellationToken cancellationToken)
{
    await semaphore.WaitAsync(cancellationToken);
    try
    {
        var redisResult = await _getRedisResultFromRedis.ExecuteAsync(key, cancellationToken);
        if (redisResult != null)
            resultsBag.Add(redisResult);
    }
    finally
    {
        // Release the slot even if the fetch throws.
        semaphore.Release();
    }
}

My question: is Environment.ProcessorCount a sensible maxCount for SemaphoreSlim? It limits the number of threads that can step into the critical section to a very low value, like 8, so won't the execution time be much longer?

If it's not sensible, then what is a sensible maxCount value?

My opinion: since this involves I/O, most of the time will just be spent waiting, with no processor time needed, so limiting concurrency to the number of cores makes little sense.

2 Answers


  1. I would agree with your reasoning: since the work looks to be network requests limited by latency or bandwidth, restricting the concurrency to the number of CPU cores makes little sense.

    You probably need to do some testing to find a suitable maxCount value, but ideally you should have a system that self-adapts to the actual circumstances. I'm honestly not sure what the best design would be, but I would take a look at TPL Dataflow to see if it can provide a better way to limit concurrency. Or at least expose the value as configuration so it can be adjusted at some later time. A minimal sketch of the dataflow idea is below.
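
    As a rough illustration of the dataflow suggestion above, here is a minimal sketch using ActionBlock from the System.Threading.Tasks.Dataflow NuGet package, reusing the question's _getRedisResultFromRedis and RedisResult names; the maxConcurrency value of 64 is an arbitrary placeholder meant to be tuned or read from configuration:

    // requires: using System.Threading.Tasks.Dataflow; (NuGet: System.Threading.Tasks.Dataflow)
    private async Task FetchAllKeysFromRedis(List<string> keys, ConcurrentBag<RedisResult> resultsBag, CancellationToken cancellationToken)
    {
        const int maxConcurrency = 64; // placeholder; tune or read from configuration

        var block = new ActionBlock<string>(
            async key =>
            {
                var redisResult = await _getRedisResultFromRedis.ExecuteAsync(key, cancellationToken);
                if (redisResult != null)
                    resultsBag.Add(redisResult);
            },
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxConcurrency, // caps concurrent fetches
                CancellationToken = cancellationToken
            });

        foreach (var key in keys)
            block.Post(key);

        block.Complete();       // signal that no more keys will be posted
        await block.Completion; // wait for all fetches to finish
    }

    With this shape the block owns the concurrency limit, so the manual semaphore disappears entirely.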

  2. Configuring the degree of parallelism with the value Environment.ProcessorCount is sensible in only one case: when you have absolutely no clue how to configure this setting. In that case any value will be equally random and arbitrary, so why not choose Environment.ProcessorCount, which at least somehow reflects the capabilities of the current machine?

    A similar dilemma surfaced while designing the Parallel.ForEachAsync API, which debuted in .NET 6. After evaluating the available options, the Microsoft API designers chose Environment.ProcessorCount as the value for MaxDegreeOfParallelism when this configuration is not explicitly provided. Ironically, the synchronous Parallel.ForEach API, for which Environment.ProcessorCount would be an even more sensible default, actually has -1 as the default, which means "unlimited" parallelism, or practically parallelism limited by ThreadPool availability. A sketch of Parallel.ForEachAsync applied to the question's scenario follows.
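
    For comparison, a minimal sketch of how the question's loop might look with Parallel.ForEachAsync on .NET 6+, again reusing the question's names; the MaxDegreeOfParallelism value of 64 is an arbitrary placeholder, and omitting it falls back to the Environment.ProcessorCount default described above:

    private Task FetchAllKeysFromRedis(List<string> keys, ConcurrentBag<RedisResult> resultsBag, CancellationToken cancellationToken)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = 64, // placeholder; the implicit default would be Environment.ProcessorCount
            CancellationToken = cancellationToken
        };

        // ForEachAsync throttles the async lambdas itself, so no semaphore is needed.
        return Parallel.ForEachAsync(keys, options, async (key, ct) =>
        {
            var redisResult = await _getRedisResultFromRedis.ExecuteAsync(key, ct);
            if (redisResult != null)
                resultsBag.Add(redisResult);
        });
    }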
