Skip to content

[Zeppelin-635] AngularJS utilities functions and Hide/Show ParagraphId feature#678

Closed
doanduyhai wants to merge 10 commits into
apache:masterfrom
doanduyhai:PushToServer_Angular_Function
Closed

[Zeppelin-635] AngularJS utilities functions and Hide/Show ParagraphId feature#678
doanduyhai wants to merge 10 commits into
apache:masterfrom
doanduyhai:PushToServer_Angular_Function

Conversation

@doanduyhai
Copy link
Copy Markdown
Contributor

What is this PR for?

This is a PR to add utilities functions for the AngularJS system and a new Hide/Show Paragraph Id feature.

See screenshots below to have a feeling of how it works

1) AngularJS Utility Functions

  • runParagraph('paragraphId') : run a paragraph given its id

  • runParagraphs(['paragraphId1', 'paragraphId2']) : run a list of paragraphs given their id

  • pushToServer('varName', value, parameters) : push a variable varName with its value to the server/interpreter angular object repository using current noteId and provided interpreter(s) (and optionally paragraph id(s)). parameters is a map with the following properties:

    Ex:

    parameters = {
         interpreter: 'spark',
         interpreters: ['md', 'sh'],
         paragraph: '20160126-153136_1060166247',
         paragraphs: ['20160126-171914_843040190', '20160126-181556_1915782845'],
         scope: 'note',
         runParagraphs: false    
    }
    • interpreter: name of the interpreter to be used for angular object repository lookup
    • interpreters: name of the interpreters to be used for angular object repository lookup
    • paragraph: paragraph id used to update the angular object repository
    • paragraphs: paragraph ids used to update the angular object repository
    • scope: scope of the angular object, can be note or paragraph. Default = paragraph
    • runParagraphs: if true, run the paragraphs listed by the properties paragraph or paragraphs. Default = true

Remarks:

  1. if both interpreter and interpreters are provided, the single interpreter will be added to the interpreters list. Same remark for paragraph and paragraphs
  2. if a list of interpreters and paragraphs are provided, for each interpreter the object will be pushed only to the paragraphs which matches this interpreter
  3. if one or several paragraph is provided, the object will be pushed to the registry using the (current note id, paragraph id) scope. This correspond to the default paragraph value for the scope variable. However, it is also possible to save the same object to the current note id scope. In this case set the scope variable to note. In this case the object will be saved using paragraph scope and note scope
  4. if scope = paragraph and no paragraph id is provided, the object will be saved using the current note id scope
  5. if runParagraphs = true and no paragraph id is provided, it will be ignored
  6. set runParagraphs to false if you want to push the variable to the back-end without refreshing paragraph
  • removeFromServer('varName', parameters): remove the variable varName from server/interpreter angular object repository using current noteId.

    Ex:

    parameters = {
         interpreter: 'spark',
         interpreters: ['md', 'sh'],
         paragraph: '20160126-153136_1060166247',
         paragraphs: ['20160126-171914_843040190', '20160126-181556_1915782845']
    }
    • interpreter: name of the interpreter to be used for angular object repository lookup
    • interpreters: name of the interpreters to be used for angular object repository lookup
    • paragraph: paragraph id used to remove object from the angular object repository
    • paragraphs: paragraph ids used to remove object from the angular object repository

Remarks:

  1. if both interpreter and interpreters are provided, the single interpreter will be added to the interpreters list. Same remark for paragraph and paragraphs
  2. if a list of interpreters and paragraphs are provided, for each interpreter the object will be pushed only to the paragraphs which matches this interpreter
  3. if one or several paragraph is provided, the object will be remove from the registry using the (current note id, paragraph id) scope
  4. if no paragraph is provided, the object will be remove from the registry using the current note id scope

2) Hide/Show Paragraph Id

This feature is necessary for the above AngularJS functions to work. One can always retrieve the current paragraph Id using the code

 z.getInterpreterContext().getParagraphId()

But it is far from being user-friendly. Therefore, now you can show or hide paragraph Id using a toggle. This is placed at the same place as Show/Hide line numbers.

The paragraph id is displayed at the top left corner, above the Title.

image

image

I'm completely open to any suggestion with regard to the location of the display

3) Technical Impl

  1. First I have added a new Thrift method angularRegistryPush(). This method is used to push the current angular object registry from the Zeppelin server to the remote interpreter. According to @Leemoonsoo, when a note is loaded from the JSON file, the interpreter may not be initialized yet so we need to store all the angular variables in the local registry. As soon as the remote interpreter is started, we can call the angularRegistryPush() method to push the local registry to the remote interpreter

  2. A new public constructor has been added to the AngularObject class. The reason is because when we push the whole Map<String, Map<String, AngularObject>> registry to the remote interpreter using angularRegistryPush(), the registry is serialized using Gson. At the remote interpreter side, the payload is deserialized back into a Map<String, Map<String, AngularObject>>. If a class do not have a public constructor then Gson will use some Unsafe tricks to create it (see here). Consequently, the private transient List<AngularObjectWatcher> watchers = new LinkedList<AngularObjectWatcher>(); property will NOT be initialized and will throw NullPointerException later when the code access the watchers

  3. The runParagraph() and runParagraphs() Angular functions are injected into the isolated scope created for each paragraph so that people can access them. Currently I'm using the following trick to retrieve the paragraph details for a given paragraphId:

    var paragraphDiv = angular.element('#' + paragraphId +
                                    '_paragraphColumn_main[ng-controller="ParagraphCtrl"]');
    
    var paragraph = paragraphDiv.scope().paragraph;

    It's not the cleanest way. Ideally, we should inject into the isolated scope the note.paragraphs array from the Note scope but I'm not sure if it is recommended to leak such information into an isolated scope. Any idea/suggestion is welcomed

  4. New message operations ANGULAR_OBJECT_CLIENT_UPDATE and ANGULAR_OBJECT_CLIENT_REMOVE have been added to support pushToServer() and removeFromServer() Angular functions. The core functionalities are implemented in NotebookServer.angularObjectClientUpdate() and NotebookServer.angularObjectClientDelete(). I also add 8 unit tests for those 2 methods. Similar to runParagraph(), the pushToServer() and removeFromServer() AngularJS functions are injected into the isolated scope to be available for end users.

    A remark on the unit tests. I had to mock a bunch of classes for the tests, which suggest that the code is somehow entangled. I try to make the method as modular as possible but there are a lot of dependencies on objects like Notebook, Note, InterpreterGroup etc ... Maybe a general refactoring later would be nice

  5. The ZeppelinContext.getAngularObject() private method has been updated. Right now it look for variable using only the note scope. I change the code so that now, we look for a variable using:

    • first (noteId, current paragraphId) as key
    • if no object found, then use noteId as key
    • if no object found, then look at the global scope
  6. Last but not least, I have changed the current behavior of dynamic form. In the method Paragraph.jobRun(), when we encounter a variable definition block ( ${var = ...} ) while parsing, we first look into the angular object registry using (noteId, paragraphId) scope, then (noteId) scope. If we find a matching variable we'll use it and we do not create any form input for the paragraph. At worst, we can just remove the last commit, it is independent from the other features.

@Leemoonsoo @jongyoul @bzz @felixcheung @corneadoug I really need your remarks/comments for the above 6 technical points. Thanks

What type of PR is it?

[Feature]

Todos

  • - Code review by committers/contributors/community
  • - Responses and suggestions for the technical questions raised in chapter 3) Technical Impl
  • - Features tested following the below scenarios (see screenshots)

Is there a relevant Jira issue?

ZEPPELIN-635

How should this be tested?

Clone this pull request locally with:

  • git fetch origin pull/678/head:AngularJSFunctions
  • git checkout AngularJSFunctions

See below 12 screenshots for all possible test scenarios

Screenshots (if appropriate)

Hide/Show ParagraphId
paragraphid demo

Run Single Paragraph
runparagraph - scenario 1

Run Multiple Paragraphs
runparagraph - scenario 2

Push To Server Scenario 1: single interpreter, note scope
pushtoserver - scenario 1

Push To Server Scenario 2: multiple interpreters, note scope
pushtoserver - scenario 2

Push To Server Scenario 3: single interpreter, single paragraph
pushtoserver - scenario 3

Push To Server Scenario 4: single interpreter, multiples paragraphs
pushtoserver - scenario 4

Push To Server Scenario 5: single interpreter, multiples paragraphs AND note scope
pushtoserver - scenario 5

Push To Server Scenario 6: single interpreter, multiples paragraphs AND run paragraphs
pushtoserver - scenario 6

Push To Server Scenario 7: multiple interpreters, multiples paragraphs, overriding dynamic forms
pushtoserver - scenario 7

Remove From Server Scenario 1: single interpreter, single paragraph
removefromserver - scenario 1

Remove From Server Scenario 2: single interpreter, note scope
removefromserver - scenario 2

Questions:

  • Does the licenses files need update? --> No
  • Is there breaking changes for older versions? --> No
  • Does this needs documentation? --> Yes

@doanduyhai doanduyhai changed the title [Zeppelin-] AngularJS utilities functions and Hide/Show ParagraphId feature [Zeppelin-635] AngularJS utilities functions and Hide/Show ParagraphId feature Jan 27, 2016
@doanduyhai
Copy link
Copy Markdown
Contributor Author

CI build is green

@doanduyhai doanduyhai force-pushed the PushToServer_Angular_Function branch from 701cbfa to bf3f894 Compare January 28, 2016 11:01
@doanduyhai doanduyhai force-pushed the PushToServer_Angular_Function branch from bf3f894 to 80e1c27 Compare January 29, 2016 05:55
@Leemoonsoo
Copy link
Copy Markdown
Member

@doanduyhai Thanks for great work.
I have look through this PR and while this PR covers wide range of improvement/new features, i'd suggest split it into multiple smaller PR, if possible.

1) AngularJS Utility Function,

This is the first time exposing Zeppelin provided javascript function to user. How about we make them little bit different from other functions defined in the scope. For example instead of

$scope.runParagraph(..)
$scope.pushToServer(...)

collect all exposed function into z

$scope.z.runParagraph(...)
$scope.z.pushToServer(...)

I this way, both user/developer can explicitly recognize which is exposed function to user. What do you think?


  • pushToServer('varName', value, parameters)

Once angularObject is created (binded), no matter who (front-end / back-end) creates, every change of binded scope variable automatically pushed to server.
So this function name may confuse a bit. How about different name, such as bindAngularObject() ?

And this function give user access to any scope of AngularObjectRegistry of any Interpreter process.
Is there any special reason? otherwise can we restrict parameters from

parameters = {
     interpreter: 'spark',
     interpreters: ['md', 'sh'],
     paragraph: '20160126-153136_1060166247',
     paragraphs: ['20160126-171914_843040190', '20160126-181556_1915782845'],
     scope: 'note',
     runParagraphs: false    
}

to

parameters = {
    scope : 'note'
}

So user's javascript code only able to access current paragraph or notebook scope of AngularObject in the current interpreter process.


  1. Technical Impl
    Regarding initialize AngularObjectRegistry on Interpreter creation using angularRegistryPush() looks good.

And do you mind making separate PR for change regarding dynamic form?
That i'm not quite sure the benefit of reading angularObject in dynamic form compare to the complexity it brings. I mean complexity of the concept and expected behavior that user need to understand.

@doanduyhai
Copy link
Copy Markdown
Contributor Author

Thanks @Leemoonsoo for you initial remarks.

That i'm not quite sure the benefit of reading angularObject in dynamic form compare to the complexity it brings. I mean complexity of the concept and expected behavior that user need to understand.

I think I did not define clearly the objectives of this PR so let me explain it again:

Functional requirements: As a Zeppelin user, I would like to design an HTML form which:

  1. can have custom styling and look n feel
  2. can have extended form inputs (email input, calendar, checkbox, items list ...)
  3. can have custom behaviors (changing a drop-down will hide/show extra form inputs)
  4. can control the execution of one or many other paragraph(s) in the same note
  5. can interact with the AngularObjectRegistry for one or many other paragraph(s) in the same note. For example, in my form, I want to set an Angular variable called "name" and in another pagragraph execute the SparkSQL query:
%sql
SELECT * FROM table WHERE name = ${name}

Current Zeppelin dynamic forms can be enhanced to support points 2. and 5. but I think it's very hard (unless we redesign it from scratch) to support all other points. And I want to have a very flexible form system so I need to design my HTML form in another paragraph with %angular

That's the main reason I want to be able to override dynamic forms system if a variable is already present in the AngularObjectRegistry (whether it was bound by a zeppelin context or by an AngularJS function client-side, it doesn't matter). Currently to achieve the same goal, you need to use the z.angularUnbind(), z.angularBind() and attach listener programmatically using Scala like in this video: https://youtu.be/J6Ei1RMG5Xo?t=944. It's very cumbersome, as a web-designer, I want to use pure HTML and AngularJS javascript to design my forms

i'd suggest split it into multiple smaller PR, if possible.

Totally agree, I would propose to split this PR in smaller ones like:

  1. Add Show/Hide paragraph ID feature
  2. Add AngularJS utilities functions (or we can split it in smaller PR, one for each function (bindToAngular(), unbindFromAngular() ...) but I think one PR for all the functions is enough
  3. Override the existing dynamic form with AngularJS

collect all exposed functions into `z``

+1 Good idea

So this function name may confuse a bit. How about different name, such as bindAngularObject() ?

+1. I will rename to bindAngularObject() and unbindAngularObject() to match naming convention from the ZeppelinContext object

And this function give user access to any scope of AngularObjectRegistry of any Interpreter process.
Is there any special reason?

There are serious reasons for the current design.

Right now, the current scope of the AngularObjectRegistry are, in descending order:

  1. Interpreter Group
  2. Global
  3. Note
  4. Paragraph

If we want to bind an angular object to a particular paragraph, the noteId is not necessary because we always choose the id from the current note (and it is not possible to change it to enforce note scope isolation). However, because of point 1. above, we need to provide also the interpreter group (so the interpreter name) of the paragraph otherwise we won't be able to retrieve the AngularObjectRegistry at the server-side. The below piece of code does this job:

final InterpreterGroup interpreterGroup = findGroupForInterpreter(noteId, note.getNoteReplLoader().getInterpreterSettings(), interpreterName);

final AngularObjectRegistry registry = interpreterGroup.getAngularObjectRegistry();

That explain why in the parameters map, we have the properties interpreter: 'sh' or interpreters: ['sh','md'].

@corneadoug
Copy link
Copy Markdown
Contributor

I just reviewed the code (didn't try the PR yet).
It looks pretty nice.

For the show paragraphID, since it is an utility, maybe we could only show that ID as the top line of the settings drop down. That way we wouldn't need to toggle it nor to hide it depending on the view (report view) and we could copy paste it (hopefully).

+1 for the z. and bind/unbind as names

@doanduyhai
Copy link
Copy Markdown
Contributor Author

For the show paragraphID, since it is an utility, maybe we could only show that ID as the top line of the settings drop down. That way we wouldn't need to toggle it nor to hide it depending on the view (report view) and we could copy paste it (hopefully).

+1

@doanduyhai
Copy link
Copy Markdown
Contributor Author

Just a few update. I don't forget this PR. I'll spit it into 3 smallers PR:

  1. Add paragraphId in the settings drown-down as suggested by @corneadoug
  2. Add the bindToAngular(), unbindFromAngular() JS functions to a new z object in the isolated angular scope. I'll also simplify the scoping so that users can only push angular values to a specific paragraph. After a second though, pushing data to the whole note scope does not really make sense because the angular registry is primarily scoped to interpreter group
  3. Override the existing dynamic form with AngularJS

@doanduyhai
Copy link
Copy Markdown
Contributor Author

Closing in favor of #739, #740, #741, #742, #744 and #745

@doanduyhai doanduyhai closed this Feb 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants