
We are experiencing issues with high load on our dotnet-core (3.1) application.

Beyond a certain number of connections (virtual users), we hit a bottleneck: the server is starved and requests time out, but the process doesn’t crash (there are no Kestrel logs). We are using K6 to benchmark our app. For now, the load test only performs GET requests on the login page, each of which triggers one basic SQL query on a small dataset (no joins, etc.).

We used the Visual Studio 2019 Performance Profiler and PerfView to investigate the issue, but neither tool helped us identify the portion of code causing this bottleneck.

I found this article about ThreadPool starvation: https://learn.microsoft.com/fr-fr/archive/blogs/vancem/diagnosing-net-core-threadpool-starvation-with-perfview-why-my-service-is-not-saturating-all-cores-or-seems-to-stall
When we raise the ThreadPool minimum to arbitrary values, as in the example below, we get a huge improvement in performance (not shown on the graph). This seems like a stopgap; how bad is it to use it?

System.Threading.ThreadPool.SetMinThreads(200, 200);

[Figure: benchmarks that show the starvation]
Legend: 2C_2G/100.csv => 2 cores, 2 GB RAM, 100 virtual users

Environment:

  • nginx as reverse proxy
  • K6 as benchmark tool
  • dotnet-core 3.1 (with EntityFramework)
  • operating system : Ubuntu 20.04
  • mariadb as database

2 Answers


  1. You’re executing long-running code while on the thread pool.

    Here’s a way to do that with Task.Run:

    public async Task<byte> CalculateChecksumAsync(Stream stream) => await Task.Run(() =>
    {
        int i;
        byte checksum = 0;
        while ((i = stream.ReadByte()) >= 0)
        {
            checksum += (byte)i;
        }
        return checksum;
    });
    

    To the casual observer that looks like completely async code because there’s
    async/await and Task everywhere.

    But in fact that will tie up a thread pool thread for as long as it takes to
    read the stream (which depends not just on how much data comes through, but the
    bandwidth of the stream as well).

    When the thread pool is starved, it adds new threads only slowly: once all
    existing threads are busy, there is a delay (on the order of a second) before
    it spawns another one. That means that subsequent calls to Task.Run will have
    their work delayed for that long, even if your CPU is sitting idle.
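
    The delay described above can be observed with a rough sketch like the
    following. The class name StarvationDemo, the method name and the numbers are
    hypothetical, and Thread.Sleep merely stands in for any blocking call; the
    measured delay depends on the machine and runtime version.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical demo class; names and numbers are illustrative assumptions.
    public static class StarvationDemo
    {
        // Queues `blockers` work items that each block a pool thread, then
        // measures how long one trivial work item waits before it starts.
        public static TimeSpan MeasureQueueDelay(int blockers, TimeSpan blockFor)
        {
            for (int i = 0; i < blockers; i++)
            {
                // Thread.Sleep stands in for any synchronous, blocking call.
                _ = Task.Run(() => Thread.Sleep(blockFor));
            }

            var stopwatch = Stopwatch.StartNew();
            // The probe does no work; its result is the time it spent queued.
            return Task.Run(() => stopwatch.Elapsed).GetAwaiter().GetResult();
        }
    }
    ```

    Called from the main thread (not from a pool thread) with, say,
    Environment.ProcessorCount * 2 blockers of two seconds each, the probe should
    report a noticeable wait even though the CPU is idle the whole time.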

    Alternatives:

    • Use async methods instead of synchronous methods where possible (e.g. Stream.ReadAsync), especially when you’re on the thread pool
    • Spawn long-running tasks for long-running code:
      public async Task<byte> CalculateChecksumAsync(Stream stream) => await Task.Factory.StartNew(() =>
      {
          int i;
          byte checksum = 0;
          while ((i = stream.ReadByte()) >= 0)
          {
              checksum += (byte)i;
          }
          return checksum;
      },
      TaskCreationOptions.LongRunning);
      

    The TaskCreationOptions.LongRunning flag hints to the task scheduler that the
    work is long-running; the default scheduler responds by running it on a
    dedicated new thread instead of a thread pool thread.
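
    For the first alternative (asynchronous reads), a minimal sketch could look
    like this. The class name ChecksumExample and the 4096-byte buffer size are
    assumptions made for illustration; the checksum logic mirrors the example
    above.

    ```csharp
    using System;
    using System.IO;
    using System.Threading.Tasks;

    // Hypothetical container class for the sketch.
    public static class ChecksumExample
    {
        public static async Task<byte> CalculateChecksumAsync(Stream stream)
        {
            var buffer = new byte[4096]; // buffer size is an arbitrary choice
            byte checksum = 0;
            int read;

            // ReadAsync frees the thread while waiting for data, so no thread
            // pool thread is blocked on I/O.
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    // Wraps around modulo 256, like the byte addition above.
                    checksum = unchecked((byte)(checksum + buffer[i]));
                }
            }

            return checksum;
        }
    }
    ```

    Unlike the Task.Run version, this only uses a thread while bytes are actually
    being summed, which matters most when the stream is slow.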

  2. Yes, increasing the minimum worker thread count is a stopgap, not a solution.

    It seems that you are able to reproduce the issue. In that case, I suggest using dotnet-dump to figure out where the blocking code is. Follow the steps in this YouTube Video on diagnosing thread pool starvation, it is pretty effective.
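
    Alongside a dump, it may help to log the thread pool's own counters while
    the load test runs. The sketch below is only an assumption about how one
    might do that; the class name ThreadPoolSnapshot and the output format are
    made up. It relies on ThreadPool.ThreadCount and
    ThreadPool.PendingWorkItemCount, which are available from .NET Core 3.0.

    ```csharp
    using System;
    using System.Threading;

    // Hypothetical helper; call it periodically (e.g. from a timer) under load.
    public static class ThreadPoolSnapshot
    {
        public static string Describe()
        {
            ThreadPool.GetMinThreads(out int minWorker, out int minIoc);
            ThreadPool.GetAvailableThreads(out int freeWorker, out int freeIoc);

            // ThreadCount: threads currently in the pool.
            // PendingWorkItemCount: work items queued but not yet started.
            return $"threads={ThreadPool.ThreadCount} " +
                   $"pending={ThreadPool.PendingWorkItemCount} " +
                   $"minWorker={minWorker} minIOC={minIoc} " +
                   $"freeWorker={freeWorker} freeIOC={freeIoc}";
        }
    }
    ```

    A thread count that keeps creeping up while the pending count stays above
    zero and the CPU is mostly idle is consistent with starvation caused by
    blocking code.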

    As for the stopgap code, I would read the current minimum for the asynchronous I/O completion threads and pass it back unchanged as the second argument (unless that causes trouble), and also check the return value of the call:

    int minWorker, minIOC;
    // Get the current settings.
    ThreadPool.GetMinThreads(out minWorker, out minIOC);
    // Change the minimum number of worker threads to 200, but
    // keep the old setting for minimum asynchronous I/O
    // completion threads.
    if (ThreadPool.SetMinThreads(200, minIOC))
    {
        // The minimum number of threads was set successfully.
    }
    else
    {
        // The minimum number of threads was not changed.
    }
    