I am trying to read a schema from a text file that lives in the same package as my code, but I cannot read that file from the AWS Glue job. I will use that schema to create a DataFrame with PySpark. I can load the file locally without issue. I zip the code files into a .zip, place it in an S3 bucket, and reference it in the Glue job. Everything else works fine; only the code below fails.
file_path = os.path.join(Path(os.path.dirname(os.path.relpath(__file__))), "verifications.txt")
multiline_data = None
with open(file_path, 'r') as data_file:
    multiline_data = data_file.read()
self.logger.info(f"Schema is {multiline_data}")
This code throws the below error:
Error Category: UNCLASSIFIED_ERROR; NotADirectoryError: [Errno 20] Not a directory: 'src.zip/src/ingestion/jobs/verifications.txt'
I also tried abs_path, but it didn’t help either. The same block of code works fine locally. I also tried passing "./verifications.txt" directly, with no luck.
So how do I read this file?
2 Answers
AWS Glue scripts run in a managed environment. The zip you supply is added to the Python path so that imports resolve, but the built-in open() cannot read a path inside a zip archive, which is why you get NotADirectoryError. The same code works on your local machine because there the file sits on the filesystem next to the code, not inside a zip. For jobs like this, store such files in S3 instead.
As @Bogdan mentioned, the way to do this is to store the verifications.txt file in S3. Here’s some example code using boto3.
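A minimal sketch of that approach. The bucket name, key, and helper-function name below are placeholders, not from the original post; it assumes the Glue job’s IAM role has s3:GetObject on the bucket, and that boto3 is available (it is in the Glue runtime).

```python
def read_schema_text(bucket, key, s3_client=None):
    """Return the contents of s3://<bucket>/<key> as a string.

    `s3_client` is injectable for testing; in a Glue job you can
    omit it and let the function create one.
    """
    if s3_client is None:
        import boto3  # available by default in the AWS Glue runtime
        s3_client = boto3.client("s3")
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    return obj["Body"].read().decode("utf-8")


# In the Glue job, roughly (names are placeholders):
# schema_text = read_schema_text("my-etl-bucket", "schemas/verifications.txt")
# self.logger.info(f"Schema is {schema_text}")
# df = spark.read.schema(schema_text).csv(input_path)
```

Keeping the schema in S3 also means you can change it without rebuilding and re-uploading the code zip.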