
How can I change the location of the default database for the warehouse? (Spark) – Ubuntu

...
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7</value>
  <description>location of default database for the warehouse</description>
</property>
...

This snippet is part of /user/spark3/conf/hive-site.xml. At first the value was hdfs://spark-master-01:9000/kikang/skybluelee_warehouse_mysql_5.7, and I changed it to hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7. Below is the code and its result: println(spark.conf.get("spark.sql.warehouse.dir"))…
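A hedged note, not part of the question itself: since Spark 2.0, `spark.sql.warehouse.dir` takes precedence over `hive.metastore.warehouse.dir`, and for databases that already exist the location recorded in the metastore wins, so editing hive-site.xml alone may not change what `spark.conf.get` reports. A minimal spark-defaults.conf sketch of setting the Spark-side property directly, reusing the questioner's path:

```
# spark-defaults.conf — sketch only; the HDFS path is the one from the question
spark.sql.warehouse.dir  hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7
```

Note that this property must be set before the SparkSession is created; it cannot be changed on a running session.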


PySpark – Flatten nested JSON

I have a JSON that looks like this:

[
  {
    "event_date": "20221207",
    "user_properties": [
      { "key": "user_id", "value": { "set_timestamp_micros": "1670450329209558" } },
      { "key": "doc_id", "value": { "set_timestamp_micros": "1670450329209558" } }
    ]
  },
  {
    "event_date": "20221208",
    "user_properties": [
      {…
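In PySpark this kind of structure is typically flattened with `explode` on the `user_properties` array; as a language-agnostic sketch of the same transformation, here is the flattening in plain Python over data shaped like the question's excerpt (the sample values are the ones shown above):

```python
import json

# Sample events shaped like the question's JSON (values copied from the excerpt).
events = [
    {
        "event_date": "20221207",
        "user_properties": [
            {"key": "user_id",
             "value": {"set_timestamp_micros": "1670450329209558"}},
            {"key": "doc_id",
             "value": {"set_timestamp_micros": "1670450329209558"}},
        ],
    },
]

def flatten(events):
    """Emit one flat row per (event, user_property) pair."""
    rows = []
    for ev in events:
        for prop in ev.get("user_properties", []):
            rows.append({
                "event_date": ev["event_date"],
                "key": prop["key"],
                "set_timestamp_micros": prop["value"]["set_timestamp_micros"],
            })
    return rows

for row in flatten(events):
    print(json.dumps(row))
```

The PySpark equivalent would `explode("user_properties")` and then select `col("user_properties.key")` and the nested value fields, producing the same one-row-per-property layout.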


Visual Studio Code – Error when creating SparkSession in PySpark

When I try to create a SparkSession I get this error:

spark = SparkSession.builder.appName("Practice").getOrCreate()
py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM

This is my code:

import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Practice").getOrCreate()

What am I doing…
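A hedged note, not from the question itself: this particular Py4JError is commonly reported when the installed `pyspark` package version does not match the Spark installation the JVM side is running (the Python side calls a JVM method that the older/newer Spark does not expose). A first diagnostic step is to compare the two versions; the helper below only reads the installed package version and is an illustration, not a fix:

```python
from importlib import metadata

def installed_version(pkg):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return None

# Compare this against the cluster/local Spark version
# (e.g. the output of `spark-submit --version`); a mismatch
# between the two is a common cause of Py4JError at startup.
print(installed_version("pyspark"))
```

If the versions differ, reinstalling `pyspark` pinned to the Spark installation's version (or pointing `SPARK_HOME` at a matching Spark) is the usual remedy.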
