Athena’s documentation states that Port 444 must be open to support streaming query results.
I do get an error when querying Athena via JDBC, and the error goes away as soon as I disable query result streaming and use pagination instead.
I am confused by the “keep port 444 open” part. What does that mean for a fully managed, serverless offering like Athena? The docs say nothing more about how to do that, and all my googling has not turned up a satisfactory answer.
What VPC does Athena use? And what security group? Can I alter the rules to allow outbound traffic on port 444?
What is the missing piece?
2 Answers
To get your JDBC driver working with Athena, check the following two points:
1. IAM permission: Add the athena:GetQueryResultsStream action to the policy of the principal whose access key is used to configure the JDBC driver. You may need additional permissions; athena:GetQueryResultsStream only allows you to stream the query results.
2. Port 444 is not blocked anywhere along the way: think about the complete network journey.
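To narrow down where along that journey port 444 might be blocked, a plain TCP connection test from the machine running the JDBC driver can help. The following is a minimal sketch, not an official diagnostic: the class name PortCheck and the default host athena.us-east-1.amazonaws.com are assumptions for illustration, so substitute the Athena endpoint for your own region. It only shows whether a TCP connection can be opened; it says nothing about IAM permissions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    public static boolean isPortReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers timeouts, refused connections, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed example endpoint; pass your own region's endpoint as the first argument.
        String host = args.length > 0 ? args[0] : "athena.us-east-1.amazonaws.com";
        for (int port : new int[] {443, 444}) {
            System.out.println(host + ":" + port + " reachable = "
                    + isPortReachable(host, port, 3000));
        }
    }
}
```

If 443 is reachable but 444 is not, something between the client and the endpoint (a firewall, proxy, or security group) is likely filtering port 444.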
Caveat: I haven’t used the China Regions you’re linking to, and I think they may be subtly different from the rest of the AWS global infrastructure, so take this with a grain of salt.
The docs outline the following point, which helps to explain when this affects you:
If you’re calling the Athena service from a resource running inside a VPC via an interface VPC endpoint, that endpoint needs a security group attached that also opens port 444 for inbound traffic, not only the usual suspects (80, 443).
If you’re not using an interface VPC-endpoint and instead make a call to the public Athena endpoint (the default), this won’t matter to you as AWS will ensure that this can receive traffic on port 444.
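If port 444 cannot be opened in your environment, the workaround from the question (disabling result streaming so the driver falls back to pagination) is set on the connection string. The sketch below assumes the Simba-based Athena JDBC driver, its jdbc:awsathena:// URL format, and its UseResultsetStreaming property; check these against the documentation for your driver version. The class name AthenaJdbcUrl and the bucket name are hypothetical.

```java
public class AthenaJdbcUrl {
    /**
     * Builds a connection URL in the style of the Simba-based Athena JDBC driver.
     * The property names used here are assumptions to verify against your driver's docs.
     */
    public static String build(String region, String s3OutputLocation, boolean streaming) {
        return "jdbc:awsathena://AwsRegion=" + region
                + ";S3OutputLocation=" + s3OutputLocation
                + ";UseResultsetStreaming=" + (streaming ? 1 : 0) + ";";
    }

    public static void main(String[] args) {
        // Hypothetical bucket; streaming disabled, so port 444 should not be needed.
        System.out.println(build("us-east-1", "s3://my-bucket/athena-results/", false));
    }
}
```

Toggling the flag is also a quick way to confirm that an error is tied to the streaming path rather than to the query itself.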