[SPARK-57446][SQL] Apply escapeSql to comment in JDBC table/schema comment queries#56506
[SPARK-57446][SQL] Apply escapeSql to comment in JDBC table/schema comment queries#56506dongjoon-hyun wants to merge 1 commit into
escapeSql to comment in JDBC table/schema comment queries#56506Conversation
|
cc @huaxingao , @cloud-fan |
cloud-fan
left a comment
There was a problem hiding this comment.
0 blocking, 1 non-blocking, 0 nits.
Clean, well-scoped fix. The escaping is consistent with the established compileValue pattern, covers every dialect override that embeds a comment literal (the others throw unsupported), and the new test genuinely exercises the escaping. LGTM.
Correctness (1)
- MySQLDialect.scala:269:
escapeSqldoubles'but doesn't escape\(non-blocking, out of scope) — see inline
| // See https://dev.mysql.com/doc/refman/8.0/en/alter-table.html | ||
| override def getTableCommentQuery(table: String, comment: String): String = { | ||
| s"ALTER TABLE $table COMMENT = '$comment'" | ||
| s"ALTER TABLE $table COMMENT = '${escapeSql(comment)}'" |
There was a problem hiding this comment.
Non-blocking, out of scope — just flagging for completeness: escapeSql doubles ' but doesn't escape \. MySQL treats backslash as an escape character by default (NO_BACKSLASH_ESCAPES unset), so a comment containing a backslash can still be mangled (or break the statement if trailing) here. This is pre-existing and shared with compileValue's MySQL string-predicate handling, and this PR is a strict improvement for the single-quote case, so nothing to change here.
|
Thank you, @cloud-fan . Yes, it is. |
…ma` comment queries ### What changes were proposed in this pull request? This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the `MySQLDialect.getTableCommentQuery` override. ### Why are the changes needed? The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`): - Before: `COMMENT ON TABLE t IS 'a'b'` (invalid) - After: `COMMENT ON TABLE t IS 'a''b'` (correct) This is consistent with the existing string value treatment. https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385 ### Does this PR introduce _any_ user-facing change? Yes. A table or schema comment containing a single quote is now stored correctly instead of failing with a SQL syntax error. ### How was this patch tested? Pass the CIs with a newly added test case. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Code (Claude Opus 4.8) Closes #56506 from dongjoon-hyun/SPARK-57446. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 072994d) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
|
Merged to master/4.x for now because we have Apache Spark 4.2.0 RC3 vote currently. If it fails, I will consider to cherry-pick this to |
…ma` comment queries ### What changes were proposed in this pull request? This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the `MySQLDialect.getTableCommentQuery` override. ### Why are the changes needed? The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`): - Before: `COMMENT ON TABLE t IS 'a'b'` (invalid) - After: `COMMENT ON TABLE t IS 'a''b'` (correct) This is consistent with the existing string value treatment. https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385 ### Does this PR introduce _any_ user-facing change? Yes. A table or schema comment containing a single quote is now stored correctly instead of failing with a SQL syntax error. ### How was this patch tested? Pass the CIs with a newly added test case. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Code (Claude Opus 4.8) Closes #56506 from dongjoon-hyun/SPARK-57446. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 072994d) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 012a3d8) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ma` comment queries This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the `MySQLDialect.getTableCommentQuery` override. The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`): - Before: `COMMENT ON TABLE t IS 'a'b'` (invalid) - After: `COMMENT ON TABLE t IS 'a''b'` (correct) This is consistent with the existing string value treatment. https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385 Yes. A table or schema comment containing a single quote is now stored correctly instead of failing with a SQL syntax error. Pass the CIs with a newly added test case. Generated-by: Claude Code (Claude Opus 4.8) Closes #56506 from dongjoon-hyun/SPARK-57446. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 072994d) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 012a3d8) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit bf50ad9) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ma` comment queries This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the `MySQLDialect.getTableCommentQuery` override. The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`): - Before: `COMMENT ON TABLE t IS 'a'b'` (invalid) - After: `COMMENT ON TABLE t IS 'a''b'` (correct) This is consistent with the existing string value treatment. https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385 Yes. A table or schema comment containing a single quote is now stored correctly instead of failing with a SQL syntax error. Pass the CIs with a newly added test case. Generated-by: Claude Code (Claude Opus 4.8) Closes #56506 from dongjoon-hyun/SPARK-57446. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 072994d) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 012a3d8) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit bf50ad9) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 4c57f43) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ma` comment queries This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the `MySQLDialect.getTableCommentQuery` override. The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`): - Before: `COMMENT ON TABLE t IS 'a'b'` (invalid) - After: `COMMENT ON TABLE t IS 'a''b'` (correct) This is consistent with the existing string value treatment. https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385 Yes. A table or schema comment containing a single quote is now stored correctly instead of failing with a SQL syntax error. Pass the CIs with a newly added test case. Generated-by: Claude Code (Claude Opus 4.8) Closes #56506 from dongjoon-hyun/SPARK-57446. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 072994d) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 012a3d8) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit bf50ad9) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 4c57f43) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 748f602) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…ma` comment queries ### What changes were proposed in this pull request? This PR applies `escapeSql` to the `comment` argument of the JDBC dialect comment query methods: `JdbcDialect.getTableCommentQuery`, `JdbcDialect.getSchemaCommentQuery`, and the `MySQLDialect.getTableCommentQuery` override. ### Why are the changes needed? The `comment` was embedded into a single-quoted SQL literal without escaping, so a comment containing `'` breaks the statement. `escapeSql` doubles single quotes (`'` -> `''`): - Before: `COMMENT ON TABLE t IS 'a'b'` (invalid) - After: `COMMENT ON TABLE t IS 'a''b'` (correct) This is consistent with the existing string value treatment. https://github.com/apache/spark/blob/875982c4a65bc703bce88fb17683b423d8f88255/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L384-L385 ### Does this PR introduce _any_ user-facing change? Yes. A table or schema comment containing a single quote is now stored correctly instead of failing with a SQL syntax error. ### How was this patch tested? Pass the CIs with a newly added test case. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Code (Claude Opus 4.8) Closes apache#56506 from dongjoon-hyun/SPARK-57446. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This PR applies
escapeSqlto thecommentargument of the JDBC dialect comment querymethods:
JdbcDialect.getTableCommentQuery,JdbcDialect.getSchemaCommentQuery, and theMySQLDialect.getTableCommentQueryoverride.Why are the changes needed?
The
commentwas embedded into a single-quoted SQL literal without escaping, so a commentcontaining
'breaks the statement.escapeSqldoubles single quotes ('->''):COMMENT ON TABLE t IS 'a'b'(invalid)COMMENT ON TABLE t IS 'a''b'(correct)This is consistent with the existing string value treatment.
spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
Lines 384 to 385 in 875982c
Does this PR introduce any user-facing change?
Yes. A table or schema comment containing a single quote is now stored correctly instead
of failing with a SQL syntax error.
How was this patch tested?
Pass the CIs with a newly added test case.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)