I have AWS Lambda function that connects over a VPC into a privately hosted SQL Server database.
I intermittently get this exception:
Microsoft.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 40 – Could not open a connection to SQL Server: Could not open a connection to SQL Server)
Since the DNS is not setup I use an IP address to connect to the SQL Server. My understanding is that the environment is stateless which means I can’t make use of SQL Connection Pooling. My connection string I am currently using is:
Data Source=xx.xx.x.xx;Initial Catalog=DbName;Persist Security Info=True;User ID=user;Pooling=false;Password=password;TrustServerCertificate=true
I have turned off connection pooling in the SQL Connection string.
I also updated my code to explicitly close the connection
await using var connection = new SqlConnection(AppSettings.ConnectionString);
await connection.OpenAsync();
var payrollOrderCategories = await connection.QueryAsync<Dto.Payroll.PayrollOrderCategory>(
"SELECT Id, Name FROM PayrollOrderCategory"
);
await connection.CloseAsync();
The logs also indicates the following:
at Microsoft.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection owningConnection, DbConnectionPoolGroup poolGroup, DbConnectionOptions userOptions)
which proves it is not using pooling but the problem persists intermittently. The API also calls another internal service which it is always able to reach, which leads me to think it is not a network related issue on the VPC
2
Answers
This intermittent issue with the service can be attributed to the misconfigured route tables in the VPC. Since only two out of the three subnets were correctly configured to connect to the SQL server, the service's availability depended on which subnet was being utilized at any given time. If the traffic was routed through one of the two properly configured subnets, the service functioned normally. However, whenever the traffic happened to go through the third, misconfigured subnet, the service became unreachable. This inconsistency in routing led to the intermittent accessibility of the service.