
Azure – Querying and Inserting records from SQL Server using Python

We are porting some code from SSIS to Python. As part of this project, I'm recreating some packages but I'm having issues with the database access. I've managed to query the DB like this: employees_table = (spark.read .format("jdbc") .option("url", "jdbc:sqlserver://dev.database.windows.net:1433;database=Employees;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;")…
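As a rough illustration of the kind of access this question describes, the sketch below assembles the same JDBC URL from its parts and wraps Spark's generic JDBC reader and writer. The helper names (`build_sqlserver_jdbc_url`, `read_table`, `append_rows`) and the `table`, `user`, and `password` parameters are hypothetical and not taken from the question; only the URL components come from the excerpt above.

```python
def build_sqlserver_jdbc_url(host, database, port=1433, **properties):
    """Assemble a SQL Server JDBC URL from its parts (hypothetical helper)."""
    parts = [f"jdbc:sqlserver://{host}:{port}", f"database={database}"]
    # Keyword arguments keep their order, so properties appear as written.
    parts += [f"{key}={value}" for key, value in properties.items()]
    return ";".join(parts) + ";"


def read_table(spark, url, table, user, password):
    """Read one table through Spark's generic JDBC data source."""
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)      # assumed table name, e.g. "dbo.Employees"
        .option("user", user)          # assumed credentials
        .option("password", password)
        .load()
    )


def append_rows(df, url, table, user, password):
    """Insert the rows of a DataFrame into an existing table (append mode)."""
    (
        df.write.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .mode("append")
        .save()
    )


url = build_sqlserver_jdbc_url(
    "dev.database.windows.net",
    "Employees",
    encrypt="true",
    trustServerCertificate="false",
    hostNameInCertificate="*.database.windows.net",
    loginTimeout=30,
)
```

Both Spark helpers take an existing `SparkSession` or `DataFrame` as an argument, so the URL builder can be checked on its own without a cluster. The SQL Server JDBC driver still has to be on the Spark classpath for the read and write calls to work.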


DataFrames saved to S3/S3A from Spark are unencrypted despite setting "fs.s3a.encryption.algorithm" and "fs.s3a.encryption.key" – Ubuntu

Description: Within PySpark, a DataFrame can be saved to S3/S3A (not AWS, but an S3-compliant storage), but its data is saved unencrypted even though fs.s3a.encryption.algorithm (SSE-C) and fs.s3a.encryption.key are set. Reproducibility: Generate the key as follows: encKey=$(openssl rand…


Visual Studio Code – Error when creating SparkSession in PySpark

When I try to create a SparkSession, I get this error: spark = SparkSession.builder.appName("Practice").getOrCreate() py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM This is my code: import pyspark from pyspark.sql import SparkSession spark = SparkSession.builder.appName("Practice").getOrCreate() What am I doing…
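One commonly reported cause of a "does not exist in the JVM" Py4J error is a mismatch between the pip-installed `pyspark` package and the Spark installation that `SPARK_HOME` points to. The excerpt does not confirm that this is the cause here. The sketch below only gathers the two pieces of information needed to check that assumption, and the helper names are hypothetical.

```python
import os
from importlib import metadata


def installed_pyspark_version():
    """Return the pip-installed pyspark version, or None if it is not installed."""
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        return None


def same_major_minor(version_a, version_b):
    """Compare two version strings on their major.minor components only."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


if __name__ == "__main__":
    # Compare these two by hand: the package version should match the
    # Spark release installed under SPARK_HOME (if that variable is set).
    print("pyspark package:", installed_pyspark_version())
    print("SPARK_HOME:", os.environ.get("SPARK_HOME"))
```

If `SPARK_HOME` is unset, PySpark normally uses the Spark jars bundled with the pip package, so a mismatch of this kind would not apply.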
