ZEPPELIN-4176. Remove old spark interpreter #3375
// zeppelin.spark.useHiveContext & zeppelin.spark.concurrentSQL are legacy zeppelin
// properties, convert them to spark properties here.
if (entry.getKey().toString().equals("zeppelin.spark.useHiveContext")) {
  conf.set("spark.useHiveContext", entry.getValue().toString());
I don't recall spark.useHiveContext being a property name in Spark. In any case, isn't this unsupported in Spark today?
Yes, it is not supported today; it is kept only for legacy support. We will remove it in the future.
if (entry.getKey().toString().equals("zeppelin.spark.useHiveContext")) {
  conf.set("spark.useHiveContext", entry.getValue().toString());
}
if (entry.getKey().toString().equals("zeppelin.spark.concurrentSQL")
Should we set this only for the SQL interpreter? It might have unintended effects on the non-SQL ones.
I'm afraid not, because it sets spark.scheduler.mode, which needs to be set when the driver is starting, and the driver is started in SparkInterpreter.
Then maybe we should rename and deprecate this one in the next release. IIRC, people have complained about paragraph execution order changing and breaking things, so if this affects all Spark interpreters and not just SQL, the risk of that is higher.
This would not affect the execution order of Spark Scala code, because the Spark Scala interpreter uses FIFOScheduler. Only SparkSqlInterpreter is affected, as it uses ParallelScheduler: https://github.com/apache/zeppelin/blob/master/spark/interpreter/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L128
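To make the property conversion discussed in this thread concrete, here is a minimal sketch, not the code in this PR. It uses a plain Map in place of SparkConf so it runs without Spark on the classpath. The class name LegacyPropertyMapper is hypothetical, and mapping zeppelin.spark.concurrentSQL=true to spark.scheduler.mode=FAIR is an assumption based on the fair scheduler mode mentioned in this review.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, for illustration only; not part of Zeppelin.
class LegacyPropertyMapper {

    // Converts interpreter properties into Spark-style configuration entries.
    static Map<String, String> toSparkConf(Properties props) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String key : props.stringPropertyNames()) {
            String value = props.getProperty(key);
            if (key.equals("zeppelin.spark.useHiveContext")) {
                // Legacy property, carried over under a spark.* name.
                conf.put("spark.useHiveContext", value);
            } else if (key.equals("zeppelin.spark.concurrentSQL")
                    && "true".equalsIgnoreCase(value)) {
                // Assumption: concurrent SQL implies the FAIR scheduler mode.
                // The mode must be known before the driver starts.
                conf.put("spark.scheduler.mode", "FAIR");
            } else if (key.startsWith("spark.")) {
                // Regular Spark properties pass through unchanged.
                conf.put(key, value);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("zeppelin.spark.concurrentSQL", "true");
        props.setProperty("spark.app.name", "zeppelin");
        System.out.println(toSparkConf(props));
    }
}
```

Because the scheduler mode ends up in the driver's configuration, it applies to the whole Spark application, which is why it cannot be scoped to the SQL interpreter alone.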
  this.innerInterpreter.bind("z", z.getClass().getCanonicalName(), z,
      Lists.newArrayList("@transient"));
} catch (Exception e) {
  LOGGER.error("Fail to open SparkInterpreter", ExceptionUtils.getStackTrace(e));
log e instead of ExceptionUtils.getStackTrace(e)?
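A minimal sketch of the suggestion, using java.util.logging so that it runs without extra dependencies. Assuming LOGGER in the diff is an SLF4J logger, the analogous call would be LOGGER.error("Fail to open SparkInterpreter", e). With SLF4J, a String passed as the second argument is treated as a format argument, and since the message has no {} placeholder the stack trace would likely not be logged at all. The class name OpenFailureLogging is hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical class, for illustration only.
class OpenFailureLogging {
    private static final Logger LOGGER =
        Logger.getLogger(OpenFailureLogging.class.getName());

    // Pass the exception itself so the logging framework renders the
    // stack trace, instead of pre-formatting it into a String.
    static void logOpenFailure(Exception e) {
        LOGGER.log(Level.SEVERE, "Fail to open SparkInterpreter", e);
    }

    public static void main(String[] args) {
        logOpenFailure(new IllegalStateException("interpreter failed to start"));
    }
}
```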
if (scalaVersionString.contains("version 2.10")) {
  return "2.10";
} else {
  return "2.11";
this could break with scala 2.12
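One way to avoid the hard-coded fallback is to parse the major and minor version out of the string rather than testing for a single known value. The sketch below is only an illustration under assumptions: it expects the input to look like "version 2.12.8", and the class name ScalaVersionParser is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Zeppelin.
class ScalaVersionParser {
    // Captures the major and minor numbers that follow "version ".
    private static final Pattern VERSION =
        Pattern.compile("version (\\d+)\\.(\\d+)");

    // Returns the binary version ("2.10", "2.11", "2.12", ...) and fails
    // loudly on unrecognized input instead of silently assuming 2.11.
    static String binaryVersion(String scalaVersionString) {
        Matcher m = VERSION.matcher(scalaVersionString);
        if (!m.find()) {
            throw new IllegalArgumentException(
                "Unrecognized Scala version string: " + scalaVersionString);
        }
        return m.group(1) + "." + m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(binaryVersion("version 2.12.8"));
    }
}
```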
I don't see any blockers other than the comments on 2.12 and fair scheduler mode.
try {
  String keytab = getProperties().getProperty("spark.yarn.keytab");
  String principal = getProperties().getProperty("spark.yarn.principal");
  UserGroupInformation.loginUserFromKeytab(principal, keytab);
This code sets the Kerberos authentication information according to the configuration.
Can it be added to the new Spark interpreter?
It is very useful.
This is legacy code for OldSparkInterpreter. At that time, we didn't pass the Spark conf via --conf in spark-submit. We have since corrected that in SparkInterpreterLauncher, so we don't need to do it again here.
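As a rough illustration of passing Spark properties through spark-submit, the sketch below turns every spark.* property into a --conf key=value pair. This is not the actual SparkInterpreterLauncher code; the class name SparkSubmitArgs and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only; not part of Zeppelin.
class SparkSubmitArgs {

    // Builds "--conf key=value" argument pairs for every spark.* property.
    // Keys are sorted so the resulting command line is deterministic.
    static List<String> confArgs(Map<String, String> props) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(props).entrySet()) {
            if (e.getKey().startsWith("spark.")) {
                args.add("--conf");
                args.add(e.getKey() + "=" + e.getValue());
            }
        }
        return args;
    }

    public static void main(String[] args) {
        Map<String, String> props = new TreeMap<>();
        // Sample values, chosen only for this example.
        props.put("spark.yarn.keytab", "/path/to/zeppelin.keytab");
        props.put("spark.yarn.principal", "zeppelin@EXAMPLE.COM");
        System.out.println(String.join(" ", confArgs(props)));
    }
}
```

With spark.yarn.keytab and spark.yarn.principal handed to spark-submit this way, Spark on YARN can handle the keytab login itself, so the interpreter does not need its own loginUserFromKeytab call.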
### What is this PR for?
When Zeppelin is running in Kubernetes, "View in Spark web UI" gives internal address, instead of address defined in SERVICE_DOMAIN. I think this problem is side effect of #3375 and this PR includes fix and updated unittest.

### What type of PR is it?
Bug Fix

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-4226

### How should this be tested?
Run Zeppelin on kubernetes, and run spark job, click "View in Spark web UI" button.

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: Lee moon soo <moon@apache.org>

Closes #3451 from Leemoonsoo/ZEPPELIN-4226 and squashes the following commits:

7e34542 [Lee moon soo] use StringUtils.isBlank
a33c3b2 [Lee moon soo] pickup SparkUI address from zeppelin.spark.uiWebUrl
What is this PR for?
This PR removes the old Spark interpreter. The old Spark interpreter has several issues, and we introduced a new Spark interpreter implementation in 0.8. This ticket removes the old one in 0.9. Here are the issues with the old Spark interpreter.
What type of PR is it?
[ Improvement ]
Todos
What is the Jira issue?
How should this be tested?
Screenshots (if appropriate)
Questions: