aws-glue Questions

Read JSON in PySpark

December 20, 2022
Douglas Oliveira
2 Answers

I want to read a JSON file in PySpark, but the file is in this format (no commas between objects and no square brackets): {"id": 1, "name": "jhon"} {"id": 2, "name": "bryan"} {"id": 3, "name": "jane"} Is there an easy way to…
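
For reference, the format described above (JSON objects placed back to back, with no commas or enclosing brackets) can be split using only the Python standard library. This is a minimal sketch, not the accepted answer: the helper name `iter_concatenated_json` is hypothetical, and it assumes the whole file fits in memory as one string. If each object sits on its own line, PySpark's `spark.read.json` reads that layout by default and no helper is needed.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents.

    Hypothetical helper for illustration; assumes `text` fits in memory.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index where it stopped.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = '{"id": 1, "name": "jhon"} {"id": 2, "name": "bryan"} {"id": 3, "name": "jane"}'
records = list(iter_concatenated_json(sample))
```

The resulting list of dictionaries could then be handed to `spark.createDataFrame(records)`, assuming an active `SparkSession` named `spark`.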

aws_glue_trigger in Terraform creates invalid expression schedule in AWS

December 8, 2022
Display name
2 Answers

I am trying to create an AWS Glue job scheduler in Terraform that fires on the condition that a crawler triggered by cron succeeded: resource "aws_glue_trigger" "trigger" { name = "trigger" type = "CONDITIONAL" actions { job_name = aws_glue_job.job.name } predicate { conditions…

How to use CloudFormation to update AWS Glue Jobs

November 19, 2022
wz366
2 Answers

We have many AWS Glue jobs and we are only updating the job code, which consists of scripts stored in S3. The problem is that CloudFormation can't tell when to update our Glue jobs and when not to, because all CloudFormation template parameters…

Automating CSV analysis?

October 20, 2022
Gautam
2 Answers

My e-commerce company generates lots of CSV data. To track order status, the team must download a number of trackers. Relating them to one another and then analysing the result is a time-consuming process. Which AWS low-code solution can be used to automate the workflow?

ParamValidationError: Parameter validation failed: Bucket name must match the regex

September 29, 2022
averma
2 Answers

I'm trying to run a Glue job by calling it from a Lambda function. The Glue job itself runs perfectly fine, but when I trigger it from the Lambda function, I get the error below: [ERROR] ParamValidationError: Parameter validation failed:…

How to set Spark Config in an AWS Glue job, using Scala Spark?

September 19, 2022
jamesbascle
2 Answers

When running my job, I am getting the following exception: Exception in User Class: org.apache.spark.SparkException : Job aborted due to stage failure: Task 32 in stage 2.0 failed 4 times, most recent failure: Lost task 32.3 in stage 2.0 (TID…

AWS Glue Job: An error occurred while calling getCatalogSource. None.get

September 19, 2022
Brahim BEN ADDI
2 Answers

I was using a username and password in my AWS Glue connections and have now switched to Secrets Manager. Now I get this error when I run my ETL job: An error occurred while calling o89.getCatalogSource. None.get Even though the connections and…

set_glue_version exception after upgrading aws-glue-sessions

September 19, 2022
the0ther0ne
5 Answers

Interactive Glue Sessions in a Jupyter Notebook were working correctly with the aws-glue-sessions package version 0.32 installed. After upgrading to version 0.35 with pip3 install --upgrade jupyter boto3 aws-glue-sessions, the kernel would not start. It gave an error message in…

Execute only one Glue job at a time / sequential Glue job execution

September 12, 2022
Sapnesh Naik
2 Answers

Currently, we have the following AWS setup for executing Glue jobs. An S3 event triggers a Lambda function whose Python logic triggers 10 AWS Glue jobs. S3 -> Trigger -> Lambda -> 1 or more Glue Jobs. With this…

PySpark unable to overwrite CSV in S3

September 8, 2022
tarun
2 Answers

I am facing an issue when I try to write a file to S3 as CSV. I am basically trying to overwrite an existing single CSV file in an S3 folder. Below is the piece of code I'm running. I am getting…