[Docs] Update migrate behaviour with respect to drop_table in spark-procedures docs. #6025

Merged
szehon-ho merged 8 commits into apache:master from sririshindra:updateMirgateTableDocs
Nov 5, 2022

Conversation

@sririshindra

Contributor

No description provided.

@github-actions bot added the docs label on Oct 20, 2022
@sririshindra

Contributor Author

@samredai @szehon-ho Please take a look at this docs PR when you get a chance. This doc change corresponds to #5622, which was recently merged.

Comment thread docs/spark-procedures.md Outdated
| Argument Name | Required? | Type | Description |
|---------------|---------|------|--------------------------------------------------------------------------------------|
| `table` | ✔️ | string | Name of the table to migrate |
| `properties` | ️ | map<string, string> | Properties for the new Iceberg table |

Contributor

Can you revert the changes to the other lines? I don't think these were updated so there is no need to modify them.

Contributor Author

Done. Fixed in the latest commit.

@szehon-ho left a comment

Member

Some suggestions

Comment thread docs/spark-procedures.md Outdated

To leave the original table intact while testing, use [`snapshot`](#snapshot) to create new temporary table that shares source data files and schema.

Migrate will create a backup table with name [`table__BACKUP__`]. If you feel confident that the migration succeeded

@szehon-ho Oct 24, 2022

Member

I'm not sure if this is too much implementation detail.

But if we keep it, let's be more concise and avoid the second-person 'you':

By default, the original table is retained with the name `table_BACKUP_`.

Contributor Author

Done. Fixed in the latest commit.

Comment thread docs/spark-procedures.md Outdated
|---------------|---------|------|--------------------------------------------------------------------------------------|
| `table` | ✔️ | string | Name of the table to migrate |
| `properties` | ️ | map<string, string> | Properties for the new Iceberg table |
| `drop_backup` | | boolean | When true, the backup table is dropped after succesful migration (defaults to false) |

Member

I feel 'successful migration' is a bit redundant. To make it more concise, how about:

When true, the original table will not be retained as backup (defaults to false)

Contributor Author

Done. Fixed in the latest commit.
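
For reference, a call using the `drop_backup` argument discussed in this thread might look like the following. This is only a minimal sketch: the helper `build_migrate_call`, the catalog name `spark_catalog`, and the table `db.sample` are hypothetical placeholders, and the helper just assembles the Spark SQL text. It does not escape quotes and does not run anything.

```python
def build_migrate_call(catalog, table, properties=None, drop_backup=False):
    """Assemble a Spark SQL CALL statement for the migrate procedure.

    Hypothetical helper for illustration only: no quoting or escaping is done,
    so it should not be fed untrusted input.
    """
    args = [f"table => '{table}'"]
    if properties:
        # Table properties are passed as a Spark SQL map('k1', 'v1', ...) literal.
        pairs = ", ".join(f"'{key}', '{value}'" for key, value in properties.items())
        args.append(f"properties => map({pairs})")
    if drop_backup:
        # Per the docs change in this PR, the default (false) keeps the
        # original table as a backup, so the argument is only emitted when true.
        args.append("drop_backup => true")
    joined = ", ".join(args)
    return f"CALL {catalog}.system.migrate({joined})"


# Default behaviour: the original table is retained as a backup.
print(build_migrate_call("spark_catalog", "db.sample"))
# Opting out of the backup table.
print(build_migrate_call("spark_catalog", "db.sample", drop_backup=True))
```

The resulting string could then be passed to `spark.sql(...)` in a session that has Iceberg's SQL extensions configured.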

Comment thread docs/spark-procedures.md Outdated
|---------------|-----------|------|-------------|
| `table` | ✔️ | string | Name of the table to migrate |
| `properties` | ️ | map<string, string> | Properties for the new Iceberg table |
| `properties` | ️ | map<string, string> | Properties for the new Iceberg table |

Contributor Author

Sorry for the multiple commits. I am unable to figure out what changed in this line. It looks exactly the same to my eyes. Some hidden character must have been added somehow.

Member

Do you have a space after the last `|`? You could maybe just copy and paste the line from the previous version.

Contributor Author

Yeah, I think there was an extra space after the `|`. I removed it in the latest commit.
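
When two versions of a table row look identical in a diff, as in this thread, one way to find the difference is to print the line with `repr` and list any characters outside printable ASCII. The sketch below assumes nothing about the actual commit: the function names and the sample row are illustrative, and the sample deliberately contains both an invisible character and a trailing space.

```python
import unicodedata


def find_suspect_chars(line):
    """Return (index, code point, Unicode name) for chars outside printable ASCII."""
    suspects = []
    for index, ch in enumerate(line):
        if not (32 <= ord(ch) <= 126):
            code = f"U+{ord(ch):04X}"
            name = unicodedata.name(ch, "UNKNOWN")
            suspects.append((index, code, name))
    return suspects


def has_trailing_whitespace(line):
    """True if the line ends with whitespace, e.g. a space after the final pipe."""
    return line != line.rstrip()


# Illustrative row only: an invisible variation selector in the second cell
# and one trailing space after the final pipe.
row = "| `properties` | \ufe0f | map<string, string> | Properties for the new Iceberg table | "

print(repr(row))                     # escapes make invisible characters visible
print(find_suspect_chars(row))       # lists each non-printable-ASCII character
print(has_trailing_whitespace(row))  # flags whitespace after the last pipe
```

Running the same checks on the old and new versions of a line shows whether they differ only in whitespace or hidden characters.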

@szehon-ho left a comment

Member

Looks better, just some small things

Comment thread docs/spark-procedures.md Outdated

To leave the original table intact while testing, use [`snapshot`](#snapshot) to create new temporary table that shares source data files and schema.

By default, the original table is retained with the name `table_BACKUP_`. You can also explicitly pass `drop_backup => true`

Member

Sorry, I should have been clearer. Can we remove the second sentence as well, since it's already duplicated in the flag description? As you already mention 'by default', I think the user will know to look below.

Contributor Author

Done, removed the second sentence.

@szehon-ho merged commit bd225d5 into apache:master on Nov 5, 2022
@szehon-ho

Member

Merged, thanks @sririshindra
