
I have written a pretty basic Rust server which makes an HTTP call to a URL defined in an environment variable, parses the JSON response, and returns it.

Source Code | Flamegraph – IO Call | Dockerfile | Flamegraph – Static Json

Hitting Nginx directly – throughput ~30,000 requests/second.

curl -v 172.31.50.91/
*   Trying 172.31.50.91...
* TCP_NODELAY set
* Connected to 172.31.50.91 (172.31.50.91) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.31.50.91
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.20.0
< Date: Sun, 05 Mar 2023 07:58:39 GMT
< Content-Type: application/octet-stream
< Content-Length: 36
< Connection: keep-alive
< Content-Type: application/json
<
* Connection #0 to host 172.31.50.91 left intact
{"status": 200, "msg": "Good Luck!"}

Nginx and the Rust server are running on separate EC2 instances (c6a.large) in the same network.

In the Rust server I have 2 APIs:

  1. Returns a static response => throughput 47,000 requests/second.
  2. Makes an HTTP request to the Nginx server -> parses the JSON -> returns the parsed data => throughput 2,462 requests/second. [Issue]

A similar benchmark was done in Go, using Fiber as the HTTP server in prefork mode and json-iterator for JSON parsing. It is able to call Nginx and return the response at ~20,000 requests/second, which means there are no issues with the infra/Docker/client used to test the Rust server.

There must be something missing in my Rust code which causes such a regression when the HTTP call is introduced.

I need help understanding why this happens and how to improve the Rust code.

Benchmark result

[ec2-user@ip-172-31-50-91 ~]$ hey -z 10s  http://172.31.50.22:80/io

Summary:
  Total:        10.0168 secs
  Slowest:      0.0692 secs
  Fastest:      0.0006 secs
  Average:      0.0203 secs
  Requests/sec: 2462.4534

  Total data:   813978 bytes
  Size/request: 33 bytes

Response time histogram:
  0.001 [1]     |
  0.007 [12766] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.014 [1185]  |■■■■
  0.021 [227]   |■
  0.028 [494]   |■■
  0.035 [1849]  |■■■■■■
  0.042 [3840]  |■■■■■■■■■■■■
  0.049 [3127]  |■■■■■■■■■■
  0.055 [992]   |■■■
  0.062 [174]   |■
  0.069 [11]    |

My attempts to improve performance:

  • Both the Go and Rust servers run in Docker containers on the same instance, one at a time.
  • System ulimit / somaxconn have been raised so they don't cause a bottleneck; since the static response reaches 47K rps, they shouldn't be the limitation (typical commands are sketched after this list).
  • Moved the external URL to lazy_static, but it didn't improve performance:
lazy_static! {
    static ref EXTERNAL_URL: String = env::var("EXTERNAL_URL").unwrap();
}
  • Tried changing the Tokio runtime configuration (worker_threads = 2, 10, 16) – it didn't improve performance:
#[tokio::main(flavor = "multi_thread", worker_threads = 10)]
  • Looked into making sure the hyper network call is done in a Tokio-async-compatible way -> earlier it handled 247 requests/second; moving to stream-based response processing improved the IO call 10x, reaching ~2,400 requests/second, but there is still scope to improve.
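
The limits mentioned in the second bullet were raised along these lines; the exact values used aren't stated in the question, so the numbers below are only illustrative.

# Illustrative values only – the actual limits used are not given above.
ulimit -n 65535                          # raise the open file descriptor limit
sudo sysctl -w net.core.somaxconn=65535  # allow a longer TCP accept backlog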

IO Call API – GitHub Link

pub async fn io_call(State(state): State<AppState>) -> Json<IOCall> {
    let external_url = state.external_url.parse().unwrap();
    // A brand-new hyper Client is constructed on every request.
    let client = Client::new();
    let resp = client.get(external_url).await.unwrap();
    // Aggregate the body chunks as they stream in, then deserialize from the buffer.
    let body = hyper::body::aggregate(resp).await.unwrap();

    Json(serde_json::from_reader(body.reader()).unwrap())
}

This use case is very similar to a reverse proxy server.

2 Answers


  1. Chosen as BEST ANSWER

    Thanks to @kmdreko

    Moving the hyper client initialization into AppState resolved the problem; a sketch of how the shared client is wired into the state follows the diff below.

    Git diff

    pub async fn io_call(State(state): State<AppState>) -> Json<IOCall> {
        let external_url = state.external_url.parse().unwrap();
        // Changed: reuse the shared, connection-pooling client stored in AppState
        // instead of building a new Client on every request.
        let resp = state.client.get(external_url).await.unwrap();
        let body = hyper::body::aggregate(resp).await.unwrap();

        Json(serde_json::from_reader(body.reader()).unwrap())
    }
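
    For completeness, a minimal sketch of how the shared client might be constructed and stored in AppState (axum 0.6 / hyper 0.14 assumed; the route path and bind address are illustrative, and io_call / IOCall are as defined above):

    use axum::{routing::get, Router};
    use hyper::{client::HttpConnector, Body, Client};

    #[derive(Clone)]
    struct AppState {
        external_url: String,
        // hyper's Client keeps an internal connection pool and is cheap to clone,
        // so a single instance can be shared across all requests.
        client: Client<HttpConnector, Body>,
    }

    #[tokio::main]
    async fn main() {
        let state = AppState {
            external_url: std::env::var("EXTERNAL_URL").unwrap(),
            client: Client::new(),
        };

        let app = Router::new()
            .route("/io", get(io_call))
            .with_state(state);

        axum::Server::bind(&"0.0.0.0:80".parse().unwrap())
            .serve(app.into_make_service())
            .await
            .unwrap();
    }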
    
    [ec2-user@ip-172-31-50-91 ~]$ hey -z 10s http://172.31.50.22:80/io
    
    
    Summary:
      Total:        10.0026 secs
      Slowest:      0.0235 secs
      Fastest:      0.0002 secs
      Average:      0.0019 secs
      Requests/sec: 26876.1036
    
      Total data:   8871456 bytes
      Size/request: 33 bytes
    
    Response time histogram:
      0.000 [1]     |
      0.003 [212705]        |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
      0.005 [39980] |■■■■■■■■
      0.007 [10976] |■■
      0.010 [4259]  |■
      0.012 [794]   |
      0.014 [94]    |
      0.016 [17]    |
      0.019 [0]     |
      0.021 [4]     |
      0.023 [2]     |
    
    
    Latency distribution:
      10% in 0.0006 secs
      25% in 0.0009 secs
      50% in 0.0013 secs
      75% in 0.0022 secs
      90% in 0.0038 secs
      95% in 0.0052 secs
      99% in 0.0083 secs
    
    Details (average, fastest, slowest):
      DNS+dialup:   0.0000 secs, 0.0002 secs, 0.0235 secs
      DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
      req write:    0.0000 secs, 0.0000 secs, 0.0086 secs
      resp wait:    0.0018 secs, 0.0002 secs, 0.0234 secs
      resp read:    0.0001 secs, 0.0000 secs, 0.0109 secs
    
    Status code distribution:
      [200] 268832 responses
    
    

  2. You need to narrow down the possible cause. Here are a few pointers that should lead you in the right direction.

    Server

    • Did you check the CPU load on the server? That’s probably the first thing you should take a look at.
    • Did you make sure you’ve compiled in release mode? (A minimal check is sketched after this list.)
    • Are those JSONs small? (If they are large, they might increase CPU and network cost.)
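
    A quick way to rule out the release-mode point is to check the build stage of the Dockerfile; a minimal sketch (base image, paths, and binary name are assumptions, the real Dockerfile is linked in the question):

    # Build stage – the important part is the --release flag.
    FROM rust:1.67 AS builder
    WORKDIR /app
    COPY . .
    RUN cargo build --release

    # Runtime stage – copy only the optimized binary (name assumed).
    FROM debian:bullseye-slim
    COPY --from=builder /app/target/release/server /usr/local/bin/server
    CMD ["server"]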

    Test client

    Is it possible that your test client is the bottleneck?

    • Does it manage to send enough requests in parallel?
    • You might want to check its CPU, RAM, etc. if there are signs the client can’t send enough requests.
    • If it is too far away, that would increase the latency. (But if you managed to achieve a lot more static requests from the same client/server machines, that shouldn’t be the problem.)

    Network

    The following shouldn’t be the case if you’ve run the static/Go tests with the same setup, but just in case:

    • As previously mentioned, make sure you’re not testing with too large JSONs.
    • Make sure you are not hitting some limit in the number of concurrent requests. Sometimes AWS can limit/throttle your requests.

    If the above doesn’t help, try testing the same setup locally and see what you can achieve. If there is no difference, try smaller JSONs and see if that helps.
