[ZEPPELIN-6416] Fix zeppelin-interpreter-shaded leak via zeppelin-jupyter-interpreter scope #5246
…yter-interpreter scope
zeppelin-jupyter-interpreter/pom.xml redeclared its dependency on zeppelin-interpreter-shaded without specifying scope, defaulting it to compile and overriding the parent (zeppelin-interpreter-parent)'s provided scope. This made the shaded jar transitive to anything that depends on zeppelin-jupyter-interpreter (notably spark-interpreter), and in turn pulled it into zeppelin-integration's test classpath. Mixing the shaded and unshaded org.eclipse.aether.* in the same JVM produced ClassCastException at InterpreterSettingManager:186 during MiniZeppelinServer startup, breaking all 7 selenium IT cases since the ZEPPELIN-6355 zengine merge changed the dependency-resolution order that previously masked the issue.
Production runtime is not affected: the Zeppelin distribution does not ship zeppelin-interpreter-shaded on the server JVM classpath, so the two-JVM isolation introduced by ZEPPELIN-3689 still holds for deployed installations.
Changes:
- zeppelin-jupyter-interpreter/pom.xml: drop the local redeclaration of zeppelin-interpreter-shaded so the parent's <scope>provided</scope> applies (matching every other interpreter module). A comment documents why the redeclaration must not return.
- zeppelin-server/pom.xml: add a maven-enforcer-plugin bannedDependencies rule that fails the build if zeppelin-interpreter-shaded ever appears on the server classpath (compile or test, transitive).
- zeppelin-integration/pom.xml: same enforcer rule guarding the in-process MiniZeppelinServer test classpath.
A follow-up (ZEPPELIN-6417) tracks the structural decoupling of JupyterKernelInterpreter into a separate kernel-client library so interpreter modules never depend on the %jupyter magic interpreter artifact directly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
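The scope override described in the commit message can be sketched as the pair of POM fragments below. This is a minimal illustration, not the actual file contents: the `${project.version}` value and the placement of the parent's declaration under `<dependencies>` (rather than `<dependencyManagement>`) are assumptions that this PR does not state.

```xml
<!-- Parent (zeppelin-interpreter-parent), illustrative: every interpreter
     module inherits this dependency with provided scope. -->
<dependency>
  <groupId>org.apache.zeppelin</groupId>
  <artifactId>zeppelin-interpreter-shaded</artifactId>
  <version>${project.version}</version> <!-- assumed version expression -->
  <scope>provided</scope>
</dependency>

<!-- Child (zeppelin-jupyter-interpreter) before the fix, illustrative:
     redeclaring the same artifact without <scope> replaces the inherited
     declaration, and Maven's default scope is compile. A compile-scope
     dependency is transitive, so consumers such as spark-interpreter
     pick up the shaded jar as well. -->
<dependency>
  <groupId>org.apache.zeppelin</groupId>
  <artifactId>zeppelin-interpreter-shaded</artifactId>
  <version>${project.version}</version> <!-- assumed version expression -->
  <!-- no <scope> element here -->
</dependency>
```

Removing the child's block, as this PR does, leaves only the inherited provided-scope declaration, and provided dependencies are not propagated to downstream modules.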
Force-pushed from 6de5674 to 4abec5c.
Pull request overview
Fixes a Maven dependency-scope regression where zeppelin-interpreter-shaded leaks onto the in-process server JVM test classpath (notably zeppelin-integration Selenium ITs), causing ClassCastException due to shaded vs unshaded org.eclipse.aether.* type mismatches.
Changes:
- Removes the redundant zeppelin-interpreter-shaded dependency re-declaration from zeppelin-jupyter-interpreter so it correctly inherits provided scope from zeppelin-interpreter-parent.
- Adds maven-enforcer-plugin bannedDependencies checks in zeppelin-server and zeppelin-integration to prevent future shaded-jar leakage (including transitive paths).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| zeppelin-server/pom.xml | Adds an enforcer rule to ban org.apache.zeppelin:zeppelin-interpreter-shaded from the server module dependency graph (including transitives). |
| zeppelin-jupyter-interpreter/pom.xml | Drops the dependency re-declaration that was overriding inherited provided scope and leaking the shaded jar transitively. |
| zeppelin-integration/pom.xml | Adds an enforcer rule to ban org.apache.zeppelin:zeppelin-interpreter-shaded from the integration-test classpath (including transitives). |
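
An enforcer rule of the kind listed in the table might look roughly like the fragment below. It is a sketch assuming the standard `bannedDependencies` rule of `maven-enforcer-plugin`. The execution id is hypothetical and the plugin version is omitted (assumed to be managed elsewhere), so the configuration actually merged in this PR may differ.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <!-- hypothetical execution id, chosen for illustration -->
      <id>ban-zeppelin-interpreter-shaded</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <bannedDependencies>
            <!-- also inspect transitive dependencies, not only direct ones -->
            <searchTransitive>true</searchTransitive>
            <excludes>
              <!-- groupId:artifactId pattern of the artifact to ban -->
              <exclude>org.apache.zeppelin:zeppelin-interpreter-shaded</exclude>
            </excludes>
          </bannedDependencies>
        </rules>
        <!-- fail the build when the rule is violated -->
        <fail>true</fail>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a rule like this in place, a reintroduced scope override would fail the build of the guarded module instead of surfacing later as a runtime ClassCastException.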
@pan3793 @Reamer @tbonelee @ParkGyeongTae Could you please review this PR? It only fixes CI.
ParkGyeongTae left a comment
LGTM. Clear root cause and the enforcer rule is a nice safety net. CI is green — good to merge. Thanks!
Merged into master (5d1ede4).
What is this PR for?
Fixes a regression that breaks all Selenium integration tests in zeppelin-integration (InterpreterIT, AuthenticationIT, ZeppelinIT, InterpreterModeActionsIT, SparkParagraphIT, PersonalizeActionsIT, ParagraphActionsIT) on master since the [ZEPPELIN-6355] zengine→server merge. They abort during MiniZeppelinServer startup with a ClassCastException at InterpreterSettingManager:186.
Scope of impact
- The frontend.yml selenium IT job has been red on every master push since 2026-05-05, blocking PR merges.
- Production runtime is not affected: the Zeppelin distribution does not ship zeppelin-interpreter-shaded.jar on the server JVM classpath; the two-JVM isolation introduced by [ZEPPELIN-3689] still holds for deployed installations. The leak is confined to zeppelin-integration's test classpath.
Root cause
zeppelin-jupyter-interpreter/pom.xml re-declared its dependency on zeppelin-interpreter-shaded without scope, silently overriding the parent's <scope>provided</scope> and defaulting it to compile. That made the shaded jar transitive to anyone depending on zeppelin-jupyter-interpreter, in particular spark-interpreter (because IPySparkInterpreter extends IPythonInterpreter, which extends JupyterKernelInterpreter), and onward into zeppelin-integration's test classpath via <dependency>spark-interpreter</dependency>.
Both the unshaded zeppelin-interpreter.jar and zeppelin-interpreter-shaded.jar end up in the same test JVM. Because the shade plugin keeps org.apache.zeppelin.dep.* class names un-relocated (per <exclude>org/apache/zeppelin/**</exclude>) but rewrites their internal org.eclipse.aether.* references to shaded.org.apache.zeppelin.org.eclipse.aether.*, both jars contain identically named Booter/Repository/DependencyResolver classes that disagree on the RemoteRepository type. Whichever the classloader picks first wins; post-merge the shaded variant wins, so dependencyResolver.getRepos() returns shaded RemoteRepository instances, which fail to cast to the unshaded type expected by InterpreterSettingManager.
The scope omission has been latent since 2019-12 ([ZEPPELIN-4497]). The [ZEPPELIN-6355] merge changed the dependency-resolution order in zeppelin-integration's test classpath and exposed it.
What type of PR is it?
Bug Fix
Todos
- Drop the zeppelin-interpreter-shaded redeclaration in zeppelin-jupyter-interpreter/pom.xml so the parent's provided applies
- Add a maven-enforcer-plugin bannedDependencies rule on zeppelin-server and zeppelin-integration to catch any future leak
- mvn dependency:tree is clean for zeppelin-server and zeppelin-integration
- frontend.yml selenium IT goes green on this PR
What is the Jira issue?
- [ZEPPELIN-6416]
- Follow-up [ZEPPELIN-6417]: extract JupyterKernelInterpreter into a separate zeppelin-jupyter-kernel-client library so interpreter modules never depend on the %jupyter magic interpreter artifact directly (structural decoupling)
How should this be tested?
Screenshots (if appropriate)
N/A
Questions