Amazon Web Services – Why does AWS EMR PySpark get stuck when I try to aggregate a dataframe?

I'm running a Spark application in AWS EMR. The code is like this:

    with SparkSession.builder.appName(f"Spark App").getOrCreate() as spark:
        dataframe = spark.read.format('jdbc').options( ... ).load()
        print("Log A")
        max_date_result = dataframe.agg(max_(date_format('date', 'yyyy-MM-dd')).alias('max_date')).collect()[0]
        print("Log B")

This application always gets stuck for a long time…


Azure – Save a PySpark dataframe in a SQL database in Synapse gives the error "IllegalArgumentException: KrbException: Cannot locate default realm"

I tried to save a PySpark dataframe in a SQL database in Synapse:

    test = spark.createDataFrame([Row("Sarah", 28), Row("Anne", 5)], ["Name", "Age"])
    (test.write
        .format("jdbc")
        .option("url", "jdbc:sqlserver://XXXX.sql.azuresynapse.net:1433;database=azlsynddap001;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30;Authentication=ActiveDirectoryIntegrated")
        .option("forwardSparkAzureStorageCredentials", "true")
        .option("dbTable", "test_CP")
        .save())

I got the following error: IllegalArgumentException: KrbException: Cannot locate default…
