Skip to content

API: Add view interfaces#4925

Merged
rdblue merged 1 commit into
apache:masterfrom
jzhuge:view-api
Dec 5, 2022
Merged

API: Add view interfaces#4925
rdblue merged 1 commit into
apache:masterfrom
jzhuge:view-api

Conversation

@jzhuge

@jzhuge jzhuge commented Jun 1, 2022

Copy link
Copy Markdown
Member

Co-authored-by: anjalinorwood@gmail.com

@github-actions github-actions Bot added the API label Jun 1, 2022
@jzhuge

jzhuge commented Jun 1, 2022

Copy link
Copy Markdown
Member Author

Split from #4567

@nastra

nastra commented Jun 1, 2022

Copy link
Copy Markdown
Contributor

@jzhuge I think you wanted to link to #4657 here :)

Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewDefinition.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewUpdateProperties.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated

@amogh-jahagirdar amogh-jahagirdar left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @jzhuge for the contribution! @jackye1995 @nastra let us know if you have any further comments.

cc: @rdblue @danielcweeks

@jzhuge

jzhuge commented Jun 20, 2022

Copy link
Copy Markdown
Member Author

Thanks @amogh-jahagirdar.

BTW, I am also ok if we decide to remove UpdateViewProperties and just use the existing UpdateProperties.

@jackye1995 jackye1995 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@jackye1995

Copy link
Copy Markdown
Contributor

@rdblue do you have any additional comment for adding these view APIs?

Comment thread api/src/main/java/org/apache/iceberg/view/View.java
@jzhuge

jzhuge commented Jun 30, 2022

Copy link
Copy Markdown
Member Author

If no more comment, is it possible to get it merged so that I can use it in the Core PR?

@jzhuge

jzhuge commented Jul 19, 2022

Copy link
Copy Markdown
Member Author

Hi @danielcweeks, could you please take a look and merge if it looks ok?

@danielcweeks danielcweeks self-requested a review July 22, 2022 18:28
@jzhuge

jzhuge commented Aug 15, 2022

Copy link
Copy Markdown
Member Author

Hi @danielcweeks, have you got a chance to take a look?

Comment thread api/src/main/java/org/apache/iceberg/catalog/ViewCatalog.java Outdated
long timestampMillis();

/**
* Returns the version summary such as the name and genie-id of the operation that created that version of the view

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to remove references to genie.

@danielcweeks

Copy link
Copy Markdown
Contributor

@jzhuge or @anjalinorwood I feel like I'm missing something in terms of where we actually access the view content / text? I would think that would live in ViewRepresentation or ViewVersion?

@amogh-jahagirdar

Copy link
Copy Markdown
Contributor

@danielcweeks I'll let @jzhuge @anjalinorwood confirm but the model at least in my mind is that there is an implementation of ViewRepresentation; for the SQL view representation, this would be SqlViewRepresentation which would expose a sql() method for surfacing the literal text. Down the line there would be other view representation types such as for substrait which would expose the view representation in their own way (serialized plan node for example).

Comment thread api/src/main/java/org/apache/iceberg/view/ViewVersion.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/View.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewVersion.java
Comment thread api/src/main/java/org/apache/iceberg/view/ViewVersion.java Outdated
@jzhuge

jzhuge commented Aug 30, 2022

Copy link
Copy Markdown
Member Author

Merged Amogh's PR, rebased, and applied spotless.

Comment thread api/src/main/java/org/apache/iceberg/view/ViewBuilder.java Outdated
@jzhuge jzhuge force-pushed the view-api branch 2 times, most recently from 65b7b6e to e3de593 Compare November 6, 2022 21:52
String dialect();

/** The default catalog when the view is created. */
String defaultCatalog();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a getter. wondering why default...? should it just be catalog()?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a Spark/Presto SQL session, you can run USE to set default catalog and namespace, thus the "default".

@jzhuge

jzhuge commented Nov 7, 2022

Copy link
Copy Markdown
Member Author

Created #6134 to add the missing field query-column-names to SQL view representation in the view spec.

@wmoustafa

Copy link
Copy Markdown
Contributor
  • No need for something like View.updateRepresentions(), as buildView() + replace() should be enough

Can one replace the current version or an old version by adding a new dialect? Does it result in a new version?

Comment thread api/src/main/java/org/apache/iceberg/view/ViewVersion.java Outdated
* @param query view query
* @return this for method chaining
*/
ViewBuilder withQuery(String query);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original question was that this builder returns a View, and in this discussion, you mentioned that withQuery() is the method responsible for the fact that it should return SQLViewRepresentation. However, I think this method still does not enforce it in this iteration.

Comment on lines +31 to +33
/** The view query SQL text. */
String query();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would not this method be also common with other representations?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the SQL text of the view. Other view representations don't necessarily have a SQL query. I think this is reasonable.

Comment on lines +34 to +36
/** The view query SQL dialect. */
String dialect();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this needs to be an Enum and somehow be merged with Type. @rdblue, @danielcweeks, @amogh-jahagirdar, thoughts? Further, it is better for the spec to take a position on supported dialects, or not take a position at all, but in this case SQL should not be a special type. Right now the position sounds a bit fuzzy.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that we want to define supported dialects. We don't want engine writers to need to come to the Iceberg community and get a new symbol in an enum before storing a view. I don't see much benefit to doing that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to do either: not take a position at all or take a well defined position. There could be a few scenarios where the intended meaning is not achieved. For example:

  • One defines the view SQL using a non-SQL string (e.g., some other DSL), and still leverages SQLViewReperesentation for that. Even if this is permitted, it is not clear from the spec if it is okay.
  • The dialect could be the same but not standardized, e.g., dialects such as spark, Spark, SparkSQL, Spark SQL.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe what we need is an API to add supported dialects to the metadata? It is up to each catalog implementation to add their dialects, and view definitions can be associated with a dialect ID.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe what we need is an API to add supported dialects to the metadata? It is up to each catalog implementation to add their dialects, and view definitions can be associated with a dialect ID.

Already added

  /**
   * Add a SQL {@link View} for a different dialect.
   *
   * @param sqlViewRepresentation a SQL view representation
   * @return this for method chaining
   * @throws IllegalArgumentException if the dialect is the same as the SQL view being built
   */
  ViewBuilder withOtherSQLRepresentation(SQLViewRepresentation sqlViewRepresentation);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to making dialects (not just view representations) a construct in the metadata where people can define and add them, e.g., some catalogs define SparkSQL, id = 1 and TrinoSQL, id = 2, and each view representation references a standardized dialect ID. They are not part of the Iceberg spec, but defining and adding them dynamically is.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One defines the view SQL using a non-SQL string

I don't think this is allowed. If you create a SQL view then it should contain SQL. In any case, being strict about dialects doesn't help prevent this.

The dialect could be the same but not standardized

This is a risk, but I think it isn't very likely. We should document the known dialect strings, but we shouldn't switch it to be an enum.

Comment thread api/src/main/java/org/apache/iceberg/view/SQLViewRepresentation.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewBuilder.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewBuilder.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewVersion.java Outdated
Comment thread api/src/main/java/org/apache/iceberg/view/ViewVersion.java Outdated
@jzhuge

jzhuge commented Nov 8, 2022

Copy link
Copy Markdown
Member Author
  • No need for something like View.updateRepresentions(), as buildView() + replace() should be enough

Can one replace the current version or an old version by adding a new dialect? Does it result in a new version?

Not supported.

To add a new dialect, buildView() + withOtherSQLRepresentation() + replace()

@wmoustafa

Copy link
Copy Markdown
Contributor

buildView() + withOtherSQLRepresentation() + replace() automatically copies existing dialects from previous versions or if we want to preserve them, we need to add them again?

@jzhuge

jzhuge commented Nov 8, 2022

Copy link
Copy Markdown
Member Author

buildView() + withOtherSQLRepresentation() + replace() automatically copies existing dialects from previous versions or if we want to preserve them, we need to add them again?

buildView() (with no argument) will start from scratch, i.e., we need to add everything again.

@rdblue

rdblue commented Nov 27, 2022

Copy link
Copy Markdown
Contributor

@jzhuge, this looks good to go. Can you rebase and fix tests?

Co-authored-by: anjalinorwood@gmail.com
Co-authored-by: jahamogh@amazon.com
Co-authored-by: blue@apache.org
@jzhuge

jzhuge commented Dec 2, 2022

Copy link
Copy Markdown
Member Author

@rdblue Please merge.

@rdblue rdblue merged commit b2f4694 into apache:master Dec 5, 2022
@rdblue

rdblue commented Dec 5, 2022

Copy link
Copy Markdown
Contributor

Merge! Let me know where the implementation PR is and I'll start looking at that!

zhongyujiang pushed a commit to zhongyujiang/iceberg that referenced this pull request Apr 16, 2025
Co-authored-by: anjalinorwood@gmail.com
Co-authored-by: jahamogh@amazon.com
(cherry picked from commit b2f4694)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants