I am new to Zeppelin and PySpark.
I have tried in vain to get Zeppelin running with PySpark.
My setup:
- 4 x Raspberry Pi 4 (8 GB)
- Ubuntu Server 20.04 (64-bit)
- Hadoop 3.2.2
- YARN
- Spark 3.1.1 (Hadoop integrated)
- Zeppelin 0.9
Pi01 is the master, Pi02-Pi04 are workers; Spark is installed on all Pis.
Hadoop and YARN are running without any problems.
The PySpark shell runs and I can execute commands, but the same command in Zeppelin fails:
java.io.IOException: Cannot run program "python": error=2, No such file or directory
Neither %pyspark nor %python works in Zeppelin. I have now searched for many hours but have not found a solution (I even switched from Debian to Ubuntu and back again).
I also tried to access the Spark master from my Windows 10 PC with a Jupyter notebook, but I don't know how to do that and eventually gave up.
Any ideas?
I appreciate your help.
2 Answers
In the end I got it working: I switched to Spark 2.4.7 and now it works. I don't know whether that was a coincidence or really necessary.
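If it helps, the Spark installation Zeppelin uses is normally set through SPARK_HOME in conf/zeppelin-env.sh. A minimal sketch (the install paths here are only examples, adjust them to your own layout):

    # conf/zeppelin-env.sh -- point Zeppelin at a specific Spark install and the Hadoop config
    export SPARK_HOME=/opt/spark-2.4.7             # example path, use your actual Spark directory
    export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop  # example path, lets Spark find your YARN cluster

After changing this, restart Zeppelin so the Spark interpreter picks up the new environment.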
Try adding the Python installation path to the Zeppelin config.
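For example (a minimal sketch, assuming Python 3 is installed at /usr/bin/python3 on every node; on Ubuntu 20.04 there is no plain python executable by default, which is what the error is complaining about):

    # conf/zeppelin-env.sh -- make the PySpark interpreter use python3
    export PYSPARK_PYTHON=/usr/bin/python3         # Python used by the Spark executors
    export PYSPARK_DRIVER_PYTHON=/usr/bin/python3  # Python used on the driver side

Alternatively, in the Spark interpreter settings the properties zeppelin.pyspark.python and spark.pyspark.python default to python and can be changed to python3. Either way, the path has to exist on every node that runs an executor under YARN.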