
I need to connect to Athena using Python.

The code used is as follows:

import pyathena
import pandas as pd

athena_conn = pyathena.connect(access_key, 
                 secret_key,
                 s3_staging_dir,
                 region_name)

df = pd.read_sql("SELECT * FROM db.table LIMIT 10", athena_conn)
df.head(5)

I personally don’t have access to Athena with my AWS account, so I’m borrowing the access_key and secret_access_key from my colleague, who has access to Athena.

I get the following error when running the code:

An error occurred (AccessDeniedException) when calling the StartQueryExecution operation: 

User: arn:aws:iam::xxxxx:user/xxxx is not authorized to perform: athena:StartQueryExecution on resource:
arn:aws:athena:us-east-1:xxxx:workgroup/primary because no identity-based policy allows the 
athena:StartQueryExecution action
unable to rollback

Is it because my account doesn’t have access to Athena?

2 Answers


  1. From pyathena · PyPI documentation:

    from pyathena import connect
    
    cursor = connect(aws_access_key_id="YOUR_ACCESS_KEY_ID",
                     aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
                     s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
                     region_name="us-west-2").cursor()
    

    Note that each parameter is passed by name (for example, aws_access_key_id="YOUR_ACCESS_KEY_ID") rather than by position. You may need to name each parameter when passing it to the function.

    Also, make sure the provided credentials are associated with an IAM User that has permission to use Amazon Athena AND has permission to access the underlying data in Amazon S3.

  2. Assuming the user in the error message you redacted is your colleague’s user, he doesn’t have the required permissions, at least not for the workgroup you are using.

    It’s hard to say exactly which permissions your colleague has without seeing the policy document attached to his user.

    However, the error message is quite clear that he lacks the athena:StartQueryExecution permission. This permission must be granted in an identity-based policy attached to his user.

    Note that the permission must also be given for the resource in question (or for * for all resources).

    An example of a valid policy document for this permission would be:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "athena:StartQueryExecution",
                "Resource": "arn:aws:athena:*:12345:workgroup/*"
            }
        ]
    }
    

    This policy allows StartQueryExecution on all workgroups in account 12345. It’s possible he lacks this permission entirely or has it only for a specific workgroup.

    If he has the permission for a specific workgroup, you should configure your client to run the query in that workgroup. Currently you are trying to run it in the default primary workgroup.
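
As a rough sketch of the workgroup suggestion above: pyathena's connect() accepts a work_group keyword argument, so the client can be pointed at a workgroup other than primary. The helper names build_connect_kwargs and fetch_rows, and the workgroup name "analytics" in the usage note, are hypothetical and only for illustration. Which workgroup the credentials are actually allowed to use has to come from the policy attached to the IAM user.

```python
def build_connect_kwargs(s3_staging_dir, region_name, work_group,
                         access_key=None, secret_key=None):
    """Collect keyword arguments for pyathena.connect().

    Every value is passed by name, so nothing depends on argument order.
    """
    kwargs = {
        "s3_staging_dir": s3_staging_dir,
        "region_name": region_name,
        # Workgroup to run queries in, instead of the default "primary".
        "work_group": work_group,
    }
    # Only include explicit credentials when both are supplied; otherwise
    # the usual AWS credential lookup (environment, profile) applies.
    if access_key and secret_key:
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
    return kwargs


def fetch_rows(sql, connect_kwargs):
    """Run a query through pyathena and return all rows (hypothetical helper)."""
    # Imported here so the kwargs helper can be used without pyathena installed.
    from pyathena import connect

    cursor = connect(**connect_kwargs).cursor()
    cursor.execute(sql)
    return cursor.fetchall()
```

It could be used along the lines of fetch_rows("SELECT * FROM db.table LIMIT 10", build_connect_kwargs("s3://YOUR_S3_BUCKET/path/to/", "us-east-1", "analytics")). If the same AccessDeniedException comes back with the new workgroup in the resource ARN, the credentials lack athena:StartQueryExecution for that workgroup as well.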
