I have a FastAPI application that uses Server-Sent Events (SSE) to stream the response of a generative AI model, similar to the OpenAI API. The application is deployed using the following architecture:

  • FastAPI application served by Gunicorn with Uvicorn workers
  • EKS cluster that runs the dockerized FastAPI application
  • ALB managed by the ingress controller installed on the EKS cluster
  • API Gateway that adds an authentication layer in front of all services hosted on the EKS cluster

[Diagram: Cloud setup]

When I run the FastAPI application with the SSE endpoint locally, everything works perfectly. However, when I deploy the application with the stack described above, the SSE response is not streamed back; instead, all the chunks are returned at once when the stream completes.
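
For reference, here is a minimal sketch of what the streaming side of such an endpoint might look like, using only the standard library. The names `format_sse` and `token_stream` are hypothetical, and the closing `[DONE]` marker is an assumption borrowed from the OpenAI-style convention mentioned above, not something taken from the actual application.

```python
import asyncio
from typing import AsyncIterator, Iterable, Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Encode one message as an SSE frame (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload is sent as one "data:" field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


async def token_stream(tokens: Iterable[str], delay: float = 0.0) -> AsyncIterator[str]:
    """Yield one SSE frame per token; the sleep stands in for model latency."""
    for token in tokens:
        await asyncio.sleep(delay)
        yield format_sse(token)
    # Assumed end-of-stream marker, following the OpenAI-style convention.
    yield format_sse("[DONE]")
```

In FastAPI, a generator like this would typically be returned as `StreamingResponse(token_stream(...), media_type="text/event-stream")`. Each frame reaches the client as soon as it is yielded, provided nothing between the server and the client buffers the response.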

After investigating, I discovered that the issue occurs when I add the API Gateway layer, which I need for authentication. Once the response passes through the API Gateway, it is no longer streamed and a content-length header is added. This suggests that the API Gateway waits for the response to complete fully before adding the header and sending it back to the client.
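
The header observation above can be turned into a rough check. The sketch below is only a heuristic under the assumption that a streamed response carries no `Content-Length`, while a buffered one does. The function name `looks_buffered` is hypothetical.

```python
from typing import Mapping


def looks_buffered(headers: Mapping[str, str]) -> bool:
    """Heuristic: guess whether an intermediary buffered the whole response.

    A streamed response cannot know its total size up front, so it normally
    has no Content-Length (and, over HTTP/1.1, uses chunked transfer encoding).
    """
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    has_length = "content-length" in normalized
    is_chunked = "chunked" in normalized.get("transfer-encoding", "")
    return has_length and not is_chunked
```

Comparing the headers of a response fetched directly from the ALB with one fetched through the API Gateway (for example with `curl -i -N`) should show the difference described in the question.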

Another problem I encountered is that the API Gateway times out the request after 30 seconds, while the SSE response can take longer than that.

I am looking for a solution to support SSE while keeping the authentication layer outside of the application code. Any suggestions or guidance on how to achieve this would be greatly appreciated!

2 Answers


  1. This is how API Gateway works, full stop: there are no streamed responses, and the response timeout is 30 seconds. You’ve already concluded this. The mistake lies in your expectation:

    which I need for authentication

    You don’t need APIGW for authentication; you can verify users’ authentication and authorize them directly in your application. Using APIGW for authentication is really an abuse of what it can do, as it is not meant for that purpose, which is why you are running into all sorts of issues. Given the cost of APIGW, adding it does not make sense either.

    We should investigate why you believe you can’t add authentication into your application. What’s stopping you from doing that?

  2. Original question:

    After investigating, I discovered that the issue occurs when I add the
    API Gateway layer, which I need for authentication.

    Based on your diagram, you can try authenticating users with Cognito. In that case, you do not need API Gateway, which is not designed for streaming.
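
To illustrate the in-application route suggested in the first answer while still keeping the check out of the endpoint code, here is a minimal sketch of a pure ASGI middleware. The class name `BearerAuthMiddleware` and the static shared token are assumptions made for illustration only. A real deployment would more likely validate a signed token, such as one issued by Cognito.

```python
import hmac


class BearerAuthMiddleware:
    """Pure ASGI middleware that rejects requests lacking the expected bearer token.

    It forwards `send` untouched, so downstream streaming is not buffered.
    """

    def __init__(self, app, expected_token: str):
        self.app = app
        # Hypothetical static token; stands in for real token validation.
        self.expected = f"Bearer {expected_token}".encode()

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket scopes straight through.
            await self.app(scope, receive, send)
            return
        # ASGI delivers headers as (name, value) byte pairs with lowercase names.
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"authorization", b"")
        # Constant-time comparison avoids leaking the token via timing.
        if not hmac.compare_digest(supplied, self.expected):
            await send({
                "type": "http.response.start",
                "status": 401,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"Unauthorized"})
            return
        await self.app(scope, receive, send)
```

With FastAPI this could be registered as `app.add_middleware(BearerAuthMiddleware, expected_token=...)`. Because the middleware never wraps or collects the response body, SSE chunks pass through as they are produced.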
