I need to connect to Athena using Python.
The code used is as follows:
import pyathena
import pandas as pd
athena_conn = pyathena.connect(access_key,
                               secret_key,
                               s3_staging_dir,
                               region_name)
df = pd.read_sql("SELECT * FROM db.table LIMIT 10", athena_conn)
df.head(5)
I personally don’t have access to Athena with my AWS account, so I’m borrowing the access_key and secret_access_key from my colleague, who does have access to Athena.
I get the following error while running the code:
An error occurred (AccessDeniedException) when calling the StartQueryExecution operation:
User: arn:aws:iam::xxxxx:user/xxxx is not authorized to perform: athena:StartQueryExecution on resource:
arn:aws:athena:us-east-1:xxxx:workgroup/primary because no identity-based policy allows the
athena:StartQueryExecution action
unable to rollback
Is it because my account doesn’t have access to Athena?
2 Answers
From the pyathena · PyPI documentation:
Note that each parameter is specified with a name, such as aws_access_key_id="YOUR_ACCESS_KEY_ID", rather than merely being positional. It might be that you need to specify the name of each parameter when passing them to the function.
Also, make sure the provided credentials are associated with an IAM user that has permission to use Amazon Athena AND has permission to access the underlying data in Amazon S3.
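As a rough sketch of the keyword style described above: the parameter names aws_access_key_id, aws_secret_access_key, s3_staging_dir and region_name are the ones shown in the pyathena README, while the helper names connect_kwargs and read_sample are made up for this illustration. Passing values by name avoids any of them being bound to a different parameter than intended.

```python
def connect_kwargs(access_key, secret_key, s3_staging_dir, region_name):
    """Map each value to the keyword that pyathena.connect() expects."""
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "s3_staging_dir": s3_staging_dir,
        "region_name": region_name,
    }


def read_sample(kwargs, sql="SELECT * FROM db.table LIMIT 10"):
    """Open a connection using keyword arguments and run one query."""
    # Imported lazily so connect_kwargs() works without pyathena installed.
    import pandas as pd
    from pyathena import connect

    conn = connect(**kwargs)
    return pd.read_sql(sql, conn)
```

Calling read_sample(connect_kwargs(access_key, secret_key, s3_staging_dir, region_name)) would then replace the positional call from the question. It will still fail with AccessDeniedException if the IAM user behind the keys lacks the Athena permission.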
Assuming the user in the redacted error message is your colleague’s user, then he doesn’t have the required permissions, at least not for the workgroup you are using.
It’s hard to say exactly which permissions your colleague has without seeing the policy document attached to his user.
However, the error message is quite clear that he is lacking the permission athena:StartQueryExecution. This permission must be granted to his user in an identity-based policy. Note that the permission must also be granted for the resource in question (or for * to cover all resources).
An example of a valid policy document for this permission would be:
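A minimal sketch of what such a policy document might look like, built as a Python dict and printed as JSON. The placeholder account ID 12345, the wildcard region and the helper name start_query_policy are assumptions for illustration, not values taken from the colleague's account.

```python
import json


def start_query_policy(account_id="12345", region="*", workgroup="*"):
    """Build an identity-based policy that allows athena:StartQueryExecution."""
    # Workgroup ARN format as it appears in the error message above.
    resource = f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["athena:StartQueryExecution"],
                "Resource": [resource],
            }
        ],
    }


# Print the document in the JSON form that IAM accepts.
print(json.dumps(start_query_policy(), indent=2))
```

In practice, running a query and reading its results usually needs further actions as well, such as athena:GetQueryExecution and athena:GetQueryResults, plus access to the S3 data and the staging location.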
This policy allows StartQueryExecution on all workgroups in account 12345. It’s possible he is lacking this permission entirely, or has it only for a specific workgroup.
If he has the permission for a specific workgroup, you should configure your client to run the query in that workgroup. Currently you are trying to run it in the default primary workgroup.
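One way to point the client at a different workgroup, sketched under the assumption that pyathena's connect() accepts a work_group keyword (it is listed among the connection parameters in the pyathena documentation). The workgroup name "analytics" and both helper names are hypothetical.

```python
def workgroup_connect_kwargs(base_kwargs, work_group):
    """Return a copy of the connection kwargs that targets `work_group`."""
    kwargs = dict(base_kwargs)  # copy so the caller's dict is not modified
    kwargs["work_group"] = work_group
    return kwargs


def run_in_workgroup(base_kwargs, work_group, sql):
    """Run `sql` through pyathena in the given workgroup."""
    # Imported lazily so the helper above works without pyathena installed.
    import pandas as pd
    from pyathena import connect

    conn = connect(**workgroup_connect_kwargs(base_kwargs, work_group))
    return pd.read_sql(sql, conn)
```

For example, run_in_workgroup(kwargs, "analytics", "SELECT * FROM db.table LIMIT 10") would submit the query to the analytics workgroup instead of primary, provided the colleague's user is allowed to use that workgroup.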