[SPARK-35159][SQL][DOCS] Extract hive format doc #32264

Closed
AngersZhuuuu wants to merge 8 commits into apache:master from AngersZhuuuu:SPARK-35159

Conversation

@AngersZhuuuu

@AngersZhuuuu AngersZhuuuu commented Apr 21, 2021

Contributor

What changes were proposed in this pull request?

Extract the common documentation about the Hive format into a shared page that `sql-ref-syntax-ddl-create-table-hiveformat.md` and `sql-ref-syntax-qry-select-transform.md` can both refer to.

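![image](https://user-images.githubusercontent.com/46485123/115802193-04641800-a411-11eb-827d-d92544881842.png)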

Why are the changes needed?

Improve doc

Does this PR introduce any user-facing change?

No

How was this patch tested?

Not needed.

@github-actions github-actions Bot added the DOCS label Apr 21, 2021
@AngersZhuuuu

Contributor Author

I am not sure which menu page, if any, this page should be listed in.

@SparkQA

SparkQA commented Apr 21, 2021


Test build #137710 has finished for PR 32264 at commit df246ea.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]

@SparkQA

SparkQA commented Apr 21, 2021


Test build #137711 has finished for PR 32264 at commit d98f825.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2021


Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42239/

@SparkQA

SparkQA commented Apr 21, 2021


Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42239/

@SparkQA

SparkQA commented Apr 21, 2021


Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42238/

@SparkQA

SparkQA commented Apr 21, 2021


Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42238/

Comment thread docs/sql-ref-syntax-hive-format.md Outdated

### Description

Spark support Hive format in `CREATE TABLE` clause and `TRANSFORM` clause, Hive format support

Contributor

Spark supports Hive format in `CREATE TABLE` clause and `TRANSFORM` clause,
to specify serde or text delimeter.

Contributor Author

Done

Comment thread docs/sql-ref-syntax-hive-format.md Outdated

* **row_format**

Use the `SERDE` clause to specify a custom SerDe for one table or processing inputs and outputs data. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.

Contributor

Shall we put this in the Description at the beginning?

Contributor Author

Done

Comment thread docs/sql-ref-syntax-hive-format.md Outdated

Use the `SERDE` clause to specify a custom SerDe for one table or processing inputs and outputs data. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.

* **SERDE**

@cloud-fan cloud-fan Apr 21, 2021

Contributor

We can merge this with the next one: `SERDE serde_class`.

Contributor Author

Done

@cloud-fan

Contributor

> not sure if we need to put this page in which menu page

We don't need to put it in the menu page.

@SparkQA

SparkQA commented Apr 21, 2021


Test build #137732 has finished for PR 32264 at commit 203f544.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2021


Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42259/

@SparkQA

SparkQA commented Apr 21, 2021


Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42259/

Comment thread docs/sql-ref-syntax-hive-format.md Outdated
---
layout: global
title: Data Retrieval
displayTitle: Data Retrieval

@dongjoon-hyun dongjoon-hyun Apr 22, 2021

Member

In this case, shall we add a reference to `sql-ref-syntax-hive-format.md` in `sql-ref-syntax-qry.md`?

Oh, got it. I saw @cloud-fan's comment: "We don't need to put it in the menu page." Never mind.

Contributor Author

Yeah, referring to this page from another menu doc would be strange, and it is already referenced from two syntax docs of different types.

Comment thread docs/sql-ref-syntax-hive-format.md Outdated
Comment thread docs/sql-ref-syntax-hive-format.md Outdated
Spark supports Hive format in `CREATE TABLE` clause and `TRANSFORM` clause,
to specify serde or text delimeter. In `row_format`, uses the `SERDE` clause to specify a custom SerDe
for one table or processing inputs and outputs data. Otherwise, use the `DELIMITED` clause
to use the native SerDe and specify the delimiter, escape character, null character and so on.

Contributor

How about this:

There are two ways to specify the `row_format`:
1. Use the `SERDE` clause to specify a custom SerDe class
2. Use the `DELIMITED` clause to specify the delimiter ... and so on for the native text Serde.
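
To make the two options in this suggestion concrete, here is a minimal sketch of each form in a `CREATE TABLE` statement. The table names `events_serde` and `events_delimited` and their columns are hypothetical, and the sketch assumes a Spark session with Hive support enabled; it is an illustration, not text from the PR.

```sql
-- Option 1: name a SerDe class explicitly and pass its properties.
-- (events_serde and its columns are hypothetical.)
CREATE TABLE events_serde (id INT, name STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES ('field.delim' = ',')
    STORED AS TEXTFILE;

-- Option 2: use DELIMITED to configure the native text SerDe
-- (field delimiter, escape character, line terminator, null marker).
CREATE TABLE events_delimited (id INT, name STRING)
    ROW FORMAT DELIMITED
        FIELDS TERMINATED BY ',' ESCAPED BY '\\'
        LINES TERMINATED BY '\n'
        NULL DEFINED AS 'NULL'
    STORED AS TEXTFILE;
```

Both statements describe comma-separated text rows; the first spells out the SerDe class, while the second leaves the class implicit and sets only the delimiters.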

Contributor Author

Yeah, that is clearer.

Comment thread docs/sql-ref-syntax-hive-format.md Outdated
Comment thread docs/sql-ref-syntax-hive-format.md Outdated
@cloud-fan

Contributor

@maropu do you want to take a look?

@SparkQA

SparkQA commented Apr 22, 2021


Test build #137797 has finished for PR 32264 at commit 5fe64b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 22, 2021


Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42326/

@SparkQA

SparkQA commented Apr 22, 2021


Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42326/

Comment thread docs/sql-ref-syntax-hive-format.md Outdated
* **row_format**

Used for escape mechanism.
All descriptions about syntax in `row_format` can refer to [HIVE FORMAT](sql-ref-syntax-hive-format.html)

Member

How about: "Specifies the row format for input and output. See [HIVE ROW FORMAT](...) for more syntax details."?
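
For context, "row format for input and output" refers to the two `ROW FORMAT` clauses a `TRANSFORM` query can carry: one before `USING` for rows sent to the script, and one after it for rows read back. A minimal sketch, assuming a hypothetical table `people(id INT, name STRING)` and the `cat` command being available on the executors:

```sql
SELECT TRANSFORM (id, name)
    -- Input format: how rows are serialized before being piped to the script.
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    USING 'cat' AS (id_and_name STRING)
    -- Output format: how the script's output lines are split into columns.
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '@'
FROM people;
```

Because the output delimiter `@` is not expected to appear in what `cat` echoes back, each line should be read as the single column `id_and_name`.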

Contributor Author

Done

Comment thread docs/sql-ref-syntax-qry-select-transform.md Outdated
Comment thread docs/sql-ref-syntax-hive-format.md Outdated
@maropu

maropu commented Apr 22, 2021

Member

Could you add the screenshot of the new page in the PR description?

@maropu

maropu commented Apr 22, 2021

Member

NOTE: I'm planning to backport this PR and #31010 into branch-3.1/3.0 because I think these document pages are useful for users.

@AngersZhuuuu

Contributor Author

> Could you add the screenshot of the new page in the PR description?

Done

@SparkQA

SparkQA commented Apr 23, 2021


Test build #137837 has finished for PR 32264 at commit 9bfa1cf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 23, 2021


Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42367/

@SparkQA

SparkQA commented Apr 23, 2021


Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42367/

@cloud-fan

Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 20d68dc Apr 23, 2021

@cloud-fan

Contributor

@maropu shall we have a single backport PR or two?

@maropu

maropu commented Apr 23, 2021

Member

They have different JIRA tickets, so I think it's better to backport them separately. Could you, @AngersZhuuuu?

@maropu

maropu commented Apr 23, 2021

Member

Anyway, late lgtm. Thank you, @AngersZhuuuu

@maropu

maropu commented Apr 26, 2021

Member

> They have different JIRA tickets, so I think it's better to backport them separately. Could you, @AngersZhuuuu?

ping

@AngersZhuuuu

Contributor Author

> They have different JIRA tickets, so I think it's better to backport them separately. Could you, @AngersZhuuuu?
>
> ping

Hmm, is there a conflict? Do you need me to create a backport PR?

@maropu

maropu commented Apr 26, 2021

Member

Yes, I couldn't cherry-pick them into the previous branches.

@AngersZhuuuu

Contributor Author

> Yes, I couldn't cherry-pick them into the previous branches.

OK, I'll ping you when the PR is ready.

AngersZhuuuu added a commit to AngersZhuuuu/spark that referenced this pull request Apr 28, 2021
### What changes were proposed in this pull request?
Extract common doc about hive format for `sql-ref-syntax-ddl-create-table-hiveformat.md` and `sql-ref-syntax-qry-select-transform.md` to refer.

![image](https://user-images.githubusercontent.com/46485123/115802193-04641800-a411-11eb-827d-d92544881842.png)

### Why are the changes needed?
Improve doc

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Not need

Closes apache#32264 from AngersZhuuuu/SPARK-35159.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
xuanyuanking pushed a commit to xuanyuanking/spark that referenced this pull request Sep 29, 2021