Skip to content

[SPARK-57446][SQL] Apply escapeSql to comment in JDBC table/schema comment queries#56506

Closed
dongjoon-hyun wants to merge 1 commit into
apache:masterfrom
dongjoon-hyun:SPARK-57446
Closed

[SPARK-57446][SQL] Apply escapeSql to comment in JDBC table/schema comment queries#56506
dongjoon-hyun wants to merge 1 commit into
apache:masterfrom
dongjoon-hyun:SPARK-57446

Conversation

@dongjoon-hyun

@dongjoon-hyun dongjoon-hyun commented Jun 14, 2026

Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

This PR applies escapeSql to the comment argument of the JDBC dialect comment query
methods: JdbcDialect.getTableCommentQuery, JdbcDialect.getSchemaCommentQuery, and the
MySQLDialect.getTableCommentQuery override.

Why are the changes needed?

The comment was embedded into a single-quoted SQL literal without escaping, so a comment
containing ' breaks the statement. escapeSql doubles single quotes (' -> ''):

  • Before: COMMENT ON TABLE t IS 'a'b' (invalid)
  • After: COMMENT ON TABLE t IS 'a''b' (correct)

This is consistent with the existing string value treatment.

def compileValue(value: Any): Any = value match {
case stringValue: String => s"'${escapeSql(stringValue)}'"

Does this PR introduce any user-facing change?

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

How was this patch tested?

Pass the CIs with a newly added test case.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

@dongjoon-hyun

Copy link
Copy Markdown
Member Author

cc @huaxingao , @cloud-fan

@cloud-fan cloud-fan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 blocking, 1 non-blocking, 0 nits.
Clean, well-scoped fix. The escaping is consistent with the established compileValue pattern, covers every dialect override that embeds a comment literal (the others throw unsupported), and the new test genuinely exercises the escaping. LGTM.

Correctness (1)

  • MySQLDialect.scala:269: escapeSql doubles ' but doesn't escape \ (non-blocking, out of scope) — see inline

// See https://dev.mysql.com/doc/refman/8.0/en/alter-table.html
override def getTableCommentQuery(table: String, comment: String): String = {
s"ALTER TABLE $table COMMENT = '$comment'"
s"ALTER TABLE $table COMMENT = '${escapeSql(comment)}'"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking, out of scope — just flagging for completeness: escapeSql doubles ' but doesn't escape \. MySQL treats backslash as an escape character by default (NO_BACKSLASH_ESCAPES unset), so a comment containing a backslash can still be mangled (or break the statement if trailing) here. This is pre-existing and shared with compileValue's MySQL string-predicate handling, and this PR is a strict improvement for the single-quote case, so nothing to change here.

@dongjoon-hyun

Copy link
Copy Markdown
Member Author

Thank you, @cloud-fan . Yes, it is.

dongjoon-hyun added a commit that referenced this pull request Jun 15, 2026
…ma` comment queries

### What changes were proposed in this pull request?

This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query
methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the
`MySQLDialect.getTableCommentQuery` override.

### Why are the changes needed?

The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment
containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`):

- Before: `COMMENT ON TABLE t IS 'a'b'` (invalid)
- After: `COMMENT ON TABLE t IS 'a''b'` (correct)

This is consistent with the existing string value treatment.

https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385

### Does this PR introduce _any_ user-facing change?

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

### How was this patch tested?

Pass the CIs with a newly added test case.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

Closes #56506 from dongjoon-hyun/SPARK-57446.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 072994d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun

Copy link
Copy Markdown
Member Author

Merged to master/4.x for now because we have Apache Spark 4.2.0 RC3 vote currently.

If it fails, I will consider to cherry-pick this to branch-4.2 and older.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-57446 branch June 15, 2026 01:30
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…ma` comment queries

### What changes were proposed in this pull request?

This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query
methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the
`MySQLDialect.getTableCommentQuery` override.

### Why are the changes needed?

The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment
containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`):

- Before: `COMMENT ON TABLE t IS 'a'b'` (invalid)
- After: `COMMENT ON TABLE t IS 'a''b'` (correct)

This is consistent with the existing string value treatment.

https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385

### Does this PR introduce _any_ user-facing change?

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

### How was this patch tested?

Pass the CIs with a newly added test case.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

Closes #56506 from dongjoon-hyun/SPARK-57446.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 072994d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 012a3d8)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…ma` comment queries

This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query
methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the
`MySQLDialect.getTableCommentQuery` override.

The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment
containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`):

- Before: `COMMENT ON TABLE t IS 'a'b'` (invalid)
- After: `COMMENT ON TABLE t IS 'a''b'` (correct)

This is consistent with the existing string value treatment.

https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

Pass the CIs with a newly added test case.

Generated-by: Claude Code (Claude Opus 4.8)

Closes #56506 from dongjoon-hyun/SPARK-57446.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 072994d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 012a3d8)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit bf50ad9)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…ma` comment queries

This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query
methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the
`MySQLDialect.getTableCommentQuery` override.

The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment
containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`):

- Before: `COMMENT ON TABLE t IS 'a'b'` (invalid)
- After: `COMMENT ON TABLE t IS 'a''b'` (correct)

This is consistent with the existing string value treatment.

https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

Pass the CIs with a newly added test case.

Generated-by: Claude Code (Claude Opus 4.8)

Closes #56506 from dongjoon-hyun/SPARK-57446.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 072994d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 012a3d8)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit bf50ad9)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 4c57f43)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…ma` comment queries

This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query
methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the
`MySQLDialect.getTableCommentQuery` override.

The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment
containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`):

- Before: `COMMENT ON TABLE t IS 'a'b'` (invalid)
- After: `COMMENT ON TABLE t IS 'a''b'` (correct)

This is consistent with the existing string value treatment.

https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

Pass the CIs with a newly added test case.

Generated-by: Claude Code (Claude Opus 4.8)

Closes #56506 from dongjoon-hyun/SPARK-57446.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 072994d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 012a3d8)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit bf50ad9)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 4c57f43)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 748f602)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
iemejia pushed a commit to iemejia/spark that referenced this pull request Jun 17, 2026
…ma` comment queries

### What changes were proposed in this pull request?

This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query
methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the
`MySQLDialect.getTableCommentQuery` override.

### Why are the changes needed?

The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment
containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`):

- Before: `COMMENT ON TABLE t IS 'a'b'` (invalid)
- After: `COMMENT ON TABLE t IS 'a''b'` (correct)

This is consistent with the existing string value treatment.

https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385

### Does this PR introduce _any_ user-facing change?

Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.

### How was this patch tested?

Pass the CIs with a newly added test case.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

Closes apache#56506 from dongjoon-hyun/SPARK-57446.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants