...
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7</value>
<description>location of default database for the warehouse</description>
</property>
...
This snippet is part of /user/spark3/conf/hive-site.xml.
At first the value was
hdfs://spark-master-01:9000/kikang/skybluelee_warehouse_mysql_5.7
and I changed it to
hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7
Below are the code and the result:
println(spark.conf.get("spark.sql.warehouse.dir"))  //--Default : spark-warehouse

spark
  .sql("""
    SELECT
      website,
      avg(age) avg_age,
      max(id)  max_id
    FROM
      people a
    JOIN
      projects b
        ON a.name = b.manager
    WHERE
      a.age > 11
    GROUP BY
      b.website
  """)
  .write
  .mode("overwrite")            //--Overwrite mode
  .saveAsTable("JoinedPeople")  //--saveAsTable(<warehouse_table_name>)

spark.sql("SELECT * FROM JoinedPeople").show(1000)
hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7
+--------------------+-------+------+
| website|avg_age|max_id|
+--------------------+-------+------+
|http://hive.apach...| 30.0| 2|
|http://kafka.apac...| 19.0| 3|
|http://storm.apac...| 30.0| 9|
+--------------------+-------+------+
The value of 'spark.sql.warehouse.dir' was changed from kikang to skybluelee, as I wanted.
But the location of the table "JoinedPeople" did not change. It is still 'hdfs://spark-master-01:9000/kikang/skybluelee_warehouse_mysql_5.7', the first value from hive-site.xml.
I want to change the location of the default database. How can I change the default location?
I also changed 'spark-defaults.conf' and of course rebooted Ubuntu, but it was not effective.
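For reference, a minimal sketch of how the recorded table location can be checked from the same spark-shell session (DESCRIBE FORMATTED is standard Spark SQL; the table name comes from the code above):

// Show the physical location the metastore has recorded for the table.
spark.sql("DESCRIBE FORMATTED JoinedPeople")
  .filter("col_name = 'Location'")
  .show(truncate = false)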
2 Answers
I found what I missed!
The warehouse location is part of the metastore kept in MySQL 5.7. When I tried this the first time, the metastore database (metastore_db) was created with the original location, so even though I changed the location afterwards, it did not change in the metastore.
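A minimal sketch of what this means in practice, assuming the same spark-shell session (the database name skybluelee_db is made up for illustration): a database created after the configuration change is placed under the new warehouse path, while the pre-existing default database keeps the location that was recorded in the MySQL metastore when it was first created.

// The pre-existing 'default' database keeps the location stored in the metastore.
spark.sql("DESCRIBE DATABASE EXTENDED default").show(truncate = false)

// A database created now is placed under the current warehouse directory.
spark.sql("CREATE DATABASE IF NOT EXISTS skybluelee_db")
spark.sql("DESCRIBE DATABASE EXTENDED skybluelee_db").show(truncate = false)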
Could you check which Spark version you are using in this scenario?
According to Hive Tables in the official Spark documentation, the hive.metastore.warehouse.dir property in hive-site.xml has been deprecated since Spark 2.0.0; use spark.sql.warehouse.dir instead to specify the default location of databases in the warehouse.
Does changing the property in hive-site.xml work for you (assuming the Spark version is above 2.0.0)? Does setting the property before the initialization of the Spark session work for you?
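For example, a sketch of the pattern from the Spark documentation (the application name is arbitrary): the warehouse location is set on the builder before the session, and therefore before any SQL is executed:

import org.apache.spark.sql.SparkSession

// spark.sql.warehouse.dir must be set before the SparkSession is created;
// once the session exists, the warehouse location is fixed for that session.
val spark = SparkSession.builder()
  .appName("warehouse-location-example")
  .config("spark.sql.warehouse.dir",
    "hdfs://spark-master-01:9000/skybluelee/skybluelee_warehouse_mysql_5.7")
  .enableHiveSupport()
  .getOrCreate()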