I have followed the official documentation to set up Apache Spark on my local Windows 11 machine.
This setup includes:
- Proper installation of Apache Spark, setting up the env variables etc.
- Creation of a virtual env specifically for Python 3.9 to ensure compatibility with PySpark.
Despite these steps, I’m encountering a `ShowString` error in VS Code:
While I can initiate a Spark session successfully and it starts without errors, I run into problems when trying to use `df.show()` to display DataFrame contents. The method fails and returns a `ShowString` error.
I'm not sure whether the versions I'm using (Java 17 and Spark 3.5) support `ShowString` on Windows 11.
But any suggestions are highly appreciated 🙂
[enter image description here](https://i.sstatic.net/2fBDWU3M.png)
[enter image description here](https://i.sstatic.net/3mjFNGlD.png)
[enter image description here](https://i.sstatic.net/gTm3ecIz.png)
I’ve tried multiple debugging steps: verifying that the env variables point to the correct locations and making sure that the Spark session starts.
2 Answers
Error Message:
I was looking at the Spark UI to debug this and realized that the SparkEnv was looking for a python3 executable file (`python3.exe`). Using `python.exe` instead and explicitly specifying it in the Python path helped.
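The fix described in this answer can be sketched roughly as follows. This is a minimal sketch, not necessarily what the answerer did: it assumes that "explicitly specifying in the python path" means pointing PySpark's `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` environment variables at the interpreter of the active virtual env, so Spark no longer searches for `python3.exe`. The helper names `point_spark_at_current_python` and `build_session` and the app name are hypothetical.

```python
import os
import sys


def point_spark_at_current_python(env=os.environ):
    """Tell PySpark to use the interpreter that is running this script.

    Assumption: setting these two variables is what resolves the missing
    python3.exe lookup; sys.executable is python.exe of the active venv.
    """
    env["PYSPARK_PYTHON"] = sys.executable          # interpreter for workers
    env["PYSPARK_DRIVER_PYTHON"] = sys.executable   # interpreter for the driver
    return env


def build_session():
    """Create a local SparkSession after the interpreter path is set."""
    # Set the variables first so they are in place before the session starts.
    point_spark_at_current_python()
    # Imported here so the helper above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")
        .appName("show-string-check")  # hypothetical app name
        .getOrCreate()
    )
```

With this in place, something like `build_session().createDataFrame([(1, "a")], ["id", "value"]).show()` exercises the same `df.show()` path that was failing. Setting the variables before the session is created matters, since Spark reads them when it launches the Python worker processes.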