Problem Description:
We tried to start Apache Zeppelin 0.11.1 on the following server.
We followed the installation instructions on the official Apache Zeppelin website.
Apache Zeppelin can be started and the notebooks open.
When executing a paragraph, the error shown below under Stacktrace appears.
We first set the whole thing up with Docker and then also tested the binary distribution.
However, the same error occurs in both variants.
System Description:
Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-1063-aws x86_64)
Java: 1.8.0_412 (OpenJDK Runtime Environment – build 1.8.0_412-8u412-ga-1~20.04.1-b08)
Scala: 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_412)
Spark: 3.5.1 (pre-built for Hadoop 3)
Zeppelin: 0.11.1, installed following https://zeppelin.apache.org/docs/latest/quickstart/install.html
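The Docker image in use; the listing below can be reproduced with the standard Docker CLI:

docker images apache/zeppelin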
REPOSITORY | TAG | IMAGE ID | CREATED | SIZE |
---|---|---|---|---|
apache/zeppelin | 0.11.1 | f6b9613cfb44 | 2 months ago | 8.44GB |
docker run -p 8080:8080 --rm \
  -v /projects/zeppelin/lib:/opt/zeppelin/.m2/repository \
  -v /projects/zeppelin/logs:/logs \
  -v /projects/zeppelin/notebook:/notebook \
  -v /projects/zeppelin/db:/db \
  -v /projects/zeppelin/lib/spark-current/spark-3.5.1-bin-hadoop3:/opt/spark \
  -e ZEPPELIN_LOG_DIR='/logs' \
  -e ZEPPELIN_NOTEBOOK_DIR='/notebook' \
  -e ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=16g -Dspark.cores.max=8 -Dspark.io.compression.codec=snappy" \
  -e ZEPPELIN_INTP_MEM="-Xmx16g" \
  -e SPARK_HOME=/opt/spark \
  --name zeppelin apache/zeppelin
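For the binary variant, the equivalent configuration goes into conf/zeppelin-env.sh, Zeppelin's standard environment file. A minimal sketch using the host paths from the Docker command above:

# conf/zeppelin-env.sh -- binary-install equivalent of the Docker environment above
export SPARK_HOME=/projects/zeppelin/lib/spark-current/spark-3.5.1-bin-hadoop3
export ZEPPELIN_LOG_DIR=/projects/zeppelin/logs
export ZEPPELIN_NOTEBOOK_DIR=/projects/zeppelin/notebook
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=16g -Dspark.cores.max=8 -Dspark.io.compression.codec=snappy"
export ZEPPELIN_INTP_MEM="-Xmx16g"

Zeppelin is then started with bin/zeppelin-daemon.sh start.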
Stacktrace:
org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: Fail to open SparkInterpreter
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:861)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:769)
at org.apache.zeppelin.scheduler.Job.run(Job.java:186)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:135)
at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.zeppelin.interpreter.InterpreterException: Fail to open SparkInterpreter
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:140)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
... 8 more
Caused by: scala.reflect.internal.FatalError: Error accessing /projects/zeppelin-v2/zeppelin-0.11.1-bin-all/interpreter/spark/._spark-interpreter-0.11.1.jar
at scala.tools.nsc.classpath.AggregateClassPath.$anonfun$list$3(AggregateClassPath.scala:113)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at scala.tools.nsc.classpath.AggregateClassPath.list(AggregateClassPath.scala:101)
at scala.tools.nsc.util.ClassPath.list(ClassPath.scala:36)
at scala.tools.nsc.util.ClassPath.list$(ClassPath.scala:36)
at scala.tools.nsc.classpath.AggregateClassPath.list(AggregateClassPath.scala:30)
at scala.tools.nsc.symtab.SymbolLoaders$PackageLoader.doComplete(SymbolLoaders.scala:298)
at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250)
at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1542)
at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514)
at scala.reflect.internal.Mirrors$RootsBase.init(Mirrors.scala:258)
at scala.tools.nsc.Global.rootMirror$lzycompute(Global.scala:74)
at scala.tools.nsc.Global.rootMirror(Global.scala:72)
at scala.tools.nsc.Global.rootMirror(Global.scala:44)
at scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass$lzycompute(Definitions.scala:301)
at scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass(Definitions.scala:301)
at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1511)
at scala.tools.nsc.Global$Run.<init>(Global.scala:1213)
at scala.tools.nsc.interpreter.IMain._initialize(IMain.scala:124)
at scala.tools.nsc.interpreter.IMain.initializeSynchronous(IMain.scala:146)
at org.apache.zeppelin.spark.SparkScala212Interpreter.createSparkILoop(SparkScala212Interpreter.scala:195)
at org.apache.zeppelin.spark.AbstractSparkScalaInterpreter.open(AbstractSparkScalaInterpreter.java:116)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:124)
... 9 more
Caused by: java.io.IOException: Error accessing /projects/zeppelin-v2/zeppelin-0.11.1-bin-all/interpreter/spark/._spark-interpreter-0.11.1.jar
at scala.reflect.io.FileZipArchive.scala$reflect$io$FileZipArchive$$openZipFile(ZipArchive.scala:190)
at scala.reflect.io.FileZipArchive.root$lzycompute(ZipArchive.scala:238)
at scala.reflect.io.FileZipArchive.root(ZipArchive.scala:235)
at scala.reflect.io.FileZipArchive.allDirs$lzycompute(ZipArchive.scala:272)
at scala.reflect.io.FileZipArchive.allDirs(ZipArchive.scala:272)
at scala.tools.nsc.classpath.ZipArchiveFileLookup.findDirEntry(ZipArchiveFileLookup.scala:76)
at scala.tools.nsc.classpath.ZipArchiveFileLookup.list(ZipArchiveFileLookup.scala:63)
at scala.tools.nsc.classpath.ZipArchiveFileLookup.list$(ZipArchiveFileLookup.scala:62)
at scala.tools.nsc.classpath.ZipAndJarClassPathFactory$ZipArchiveClassPath.list(ZipAndJarFileLookupFactory.scala:58)
at scala.tools.nsc.classpath.AggregateClassPath.$anonfun$list$3(AggregateClassPath.scala:105)
... 36 more
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:231)
at java.util.zip.ZipFile.<init>(ZipFile.java:157)
at java.util.zip.ZipFile.<init>(ZipFile.java:171)
at scala.reflect.io.FileZipArchive.scala$reflect$io$FileZipArchive$$openZipFile(ZipArchive.scala:187)
... 45 more
2 Answers
**Spark Interpreter Settings**
(Screenshots of the Spark interpreter settings: Binary_Interpreter_Settings_Spark_01, Binary_Interpreter_Settings_Spark_02, Binary_Interpreter_Settings_Spark-Submit_01, Docker_Interpreter_Settings_Spark_01, Docker_Interpreter_Settings_Spark_02, Docker_Interpreter_Settings_Spark_03)
I hope my answer is not too late.
I encountered the same problem and spent three days debugging the YARN logs and the source code.
I believe there is a bug in version 0.11.1. It lives in $ZEPPELIN_HOME/interpreter (I haven't checked any other directories): the release ships macOS metadata files that get picked up and referenced at runtime. In my opinion, the person who built this version used a Mac and accidentally included these metadata files in the release archive.
If you go to that directory and list it including hidden files, you'll see files prefixed with '._' that RemoteInterpreterServer tries to load.
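For example, on the binary install from the question the stray files can be spotted like this (the path is taken from the stacktrace above; substitute your own installation directory):

# List the Spark interpreter directory, including hidden files
ls -la /projects/zeppelin-v2/zeppelin-0.11.1-bin-all/interpreter/spark/

# Search the whole interpreter tree for AppleDouble metadata files ('._*')
find /projects/zeppelin-v2/zeppelin-0.11.1-bin-all/interpreter -name '._*'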
Simple Solution:
Use a different version.
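Alternatively, if you need to stay on 0.11.1, deleting the stray metadata files should also clear the error, since only the '._*' AppleDouble files are broken zip archives while the real jars are fine. This is a workaround I am suggesting based on the diagnosis above, not an officially verified fix:

# Remove only the AppleDouble metadata files; the actual interpreter jars are untouched
find /projects/zeppelin-v2/zeppelin-0.11.1-bin-all/interpreter -name '._*' -type f -delete

Restart the Spark interpreter (or Zeppelin itself) afterwards.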
I've written an article with more information on this issue at the link below. (It's in Korean, so you might need to use a browser translator.)
https://velog.io/@on5949/%ED%99%95%EC%8B%A4%ED%95%98%EC%A7%80-%EC%95%8A%EC%9D%8C-zeppelin-0.11.1-%EB%B2%84%EA%B7%B8-%EB%A6%AC%ED%8F%AC%ED%8A%B8
I will report this bug to the Zeppelin developers soon.