I need to configure Athena to connect to a couple of data sources – MySQL RDS, Postgres RDS, and a custom federation Lambda.
Ideally, I’d like to configure these via Cloudformation so I can commit these configurations. I cannot seem seem find any resources to configure these data sources – https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/AWS_Athena.html
The AWS CLI doesn’t seem to support the creation of data sources as well. https://docs.aws.amazon.com/cli/latest/reference/athena/
Are there APIs or other ways to configure Athena data sources?
2
Answers
you should use this Constuct CfnApplication
The magic you’re looking for is the
AWS::Athena::DataCatalog
resource. You can use that to create a new data catalog that points to your federated data source by providing aType
value ofLAMBDA
. Here’s a sample JSON template for creating a new federated data source:Or the CLI
The provided function is the ARN of the Lambda function for your data source, which also needs to get deployed.
For the officially supported data sources in aws-athena-query-federation repo, they are available in the Serverless Application Repository (SAR) and can also be deployed using the AWS SAM CLI as they all typically have various additional roles and configuration required to run.
If you want to use CloudFormation, you can grab the SAR Application ARN and parameters. These can be found either when you’re creating the data source or by searching the SAR. For MySql the ARN is
arn:aws:serverlessrepo:us-east-1:292517598671:applications/AthenaMySQLConnector
so the CloudFormation template would look something like this (plus all the additional parameters):So to sum up, create the Lambda then create the
DataCatalog
resource to point to the Lambda.