Dataframes saved to S3/S3A from Spark are unencrypted despite settings "fs.s3a.encryption.algorithm" and "fs.s3a.encryption.key" – Ubuntu

Description: Within PySpark, a DataFrame can be saved to S3/S3A (not AWS, but an S3-compliant storage), yet its data are saved unencrypted even though fs.s3a.encryption.algorithm (SSE-C) and fs.s3a.encryption.key are set. Reproducibility: Generate the key as follows: encKey=$(openssl rand…

Visual Studio Code – Error when creating SparkSession in PySpark

When I try to create a SparkSession, I get this error:

    spark = SparkSession.builder.appName("Practice").getOrCreate()
    py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM

This is my code:

    import pyspark
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("Practice").getOrCreate()

What am I doing…
