Skip to content

[ZEPPELIN-2194] precode for PySparkInterpreter#2096

Closed
tinkoff-dwh wants to merge 6 commits into
apache:masterfrom
tinkoff-dwh:ZEPPELIN-2194
Closed

[ZEPPELIN-2194] precode for PySparkInterpreter#2096
tinkoff-dwh wants to merge 6 commits into
apache:masterfrom
tinkoff-dwh:ZEPPELIN-2194

Conversation

@tinkoff-dwh
Copy link
Copy Markdown
Contributor

What is this PR for?

Added parameter zeppelin.pyspark.precode to PySparkInterpreter. This is snippet of code which executes when interpreter initialize.

What type of PR is it?

Feature

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-2194

How should this be tested?

  1. Set parameter zeppelin.pyspark.precode precodeVar='some text'
  2. Run
%pyspark
print(precodeVar)

Questions:

  • Does the licenses files need update? no
  • Is there breaking changes for older versions? no
  • Does this needs documentation? no

@Leemoonsoo
Copy link
Copy Markdown
Member

Leemoonsoo commented Mar 5, 2017

Thanks @tinkoff-dwh for the contribution.
I've seen #2078.

If precode is going to be added to all interpreter, how about create some facility that runs code when initialize interpreter and then make feature based on it?

How about extend Interpreter.Hook() which currently has PRE_EXEC and POST_EXEC as event type that make code run before and after the interpret() call respectively.

I think we can add event type such as AFTER_OPEN, BEFORE_CLOSE. Then with minimum modification of each interpreter, we can have the feature in all interpreters.

what do you think?

@masyan
Copy link
Copy Markdown
Contributor

masyan commented Mar 5, 2017

I think this is a good idea. I'm trying to implement this and test with task #2085

@masyan
Copy link
Copy Markdown
Contributor

masyan commented Mar 5, 2017

@Leemoonsoo
Although probably incorrectly implement this in hooks. This may confuse the user. If he is going to register these hooks in their paragraphs, and they will not work (because they will be executed only when the initialization of the interpreter).

I propose, then, is a global solution. Add two parameters
zeppelin.interpretator.after_open
zeppelin.interpretator.before_close

and add interpetator.interpret(interpetator.getProperty(propertyName)) to methods open and close (RemoteInterpreterServer.java)

@Leemoonsoo
Copy link
Copy Markdown
Member

@masyan Make sense!

@masyan
Copy link
Copy Markdown
Contributor

masyan commented Mar 5, 2017

@Leemoonsoo
so not working as I expected, method open not runs (in RemoteInterpreterServer). need to better understand the circuit how it works

@masyan
Copy link
Copy Markdown
Contributor

masyan commented Mar 5, 2017

close works, but open not. I don't understad why. Maybe run in InterpretJob.jobRun()
if (!lazy.isOpen()) {
lazy.open();
interpreter.interpret(interpetator.getProperty(propertyName), context);
}

@masyan
Copy link
Copy Markdown
Contributor

masyan commented Mar 5, 2017

and InterpreterContext is unavailable in method close.

  • problem in InterpreterGroup (one parameter zeppelin.interpretator.after_open for all interpreters in group)

@felixcheung
Copy link
Copy Markdown
Member

Also, is precode a concept at the interpreter instance level?
Would there be need to run different code per note - how is this going to interact with interpreter mode like per note, isolated and so on?

@tinkoff-dwh
Copy link
Copy Markdown
Contributor Author

@felixcheung

  1. Yes.
  2. Code will be executed after method open() (at the end of this method).
    accordingly:
  1. Per Note - executes once for note + Interpreter. For different code, percode is equals (the same).
  2. Per User - executes once for each user + Interpreter.

@Leemoonsoo
Copy link
Copy Markdown
Member

@masyan

Do you mind open PR or JIRA issue and discuss there for more generic way of run code before/after open/close ? I think it can be little complicated to discuss here.

@tinkoff-dwh
Copy link
Copy Markdown
Contributor Author

@Leemoonsoo
https://issues.apache.org/jira/browse/ZEPPELIN-2216
what to do with this PR?

# Conflicts:
#	docs/interpreter/spark.md
#	spark/src/main/java/org/apache/zeppelin/spark/PySparkInterpreter.java
#	spark/src/test/java/org/apache/zeppelin/spark/PySparkInterpreterTest.java
@tinkoff-dwh
Copy link
Copy Markdown
Contributor Author

Ready to review

@tinkoff-dwh
Copy link
Copy Markdown
Contributor Author

#2221 general solution

@tinkoff-dwh tinkoff-dwh deleted the ZEPPELIN-2194 branch May 1, 2017 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants