refactor for SPARK-34079-multi-column-scalar-subquery#4
Closed
attilapiros wants to merge 1 commit into
Conversation
peter-toth pushed a commit that referenced this pull request on Jan 17, 2023
### What changes were proposed in this pull request?
This PR introduces sasl retry count in RetryingBlockTransferor.
### Why are the changes needed?
Previously a boolean variable, saslTimeoutSeen, was used. However, the boolean variable wouldn't cover the following scenario:
1. SaslTimeoutException
2. IOException
3. SaslTimeoutException
4. IOException

Even though IOException at #2 is retried (resulting in increment of retryCount), the retryCount would be cleared at step #4. Since the intention of saslTimeoutSeen is to undo the increment due to retrying SaslTimeoutException, we should keep a counter for SaslTimeoutException retries and subtract the value of this counter from retryCount.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
New test is added, courtesy of Mridul.
Closes apache#39611 from tedyu/sasl-cnt.
Authored-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
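The difference between the boolean flag and the counter can be traced with a small model. The following is a simplified Python simulation, not the Java code in `RetryingBlockTransferor`: the function names and the `"sasl"` / `"io"` event labels are hypothetical, the maximum-retry limit is ignored, and every failure is assumed to be retried.

```python
def retry_count_with_flag(failures):
    """Model of the old accounting: a boolean remembers that a SASL timeout was seen."""
    retry_count = 0
    sasl_timeout_seen = False
    for kind in failures:
        if kind != "sasl" and sasl_timeout_seen:
            # A non-SASL failure after a SASL timeout clears the whole count,
            # which also discards earlier IOException retries.
            retry_count = 0
            sasl_timeout_seen = False
        if kind == "sasl":
            sasl_timeout_seen = True
        retry_count += 1  # the failed request is retried
    return retry_count


def retry_count_with_counter(failures):
    """Model of the new accounting: only the SASL retries are subtracted."""
    retry_count = 0
    sasl_retry_count = 0
    for kind in failures:
        if kind != "sasl" and sasl_retry_count > 0:
            # Undo exactly the increments caused by SASL timeouts.
            retry_count -= sasl_retry_count
            sasl_retry_count = 0
        if kind == "sasl":
            sasl_retry_count += 1
        retry_count += 1  # the failed request is retried
    return retry_count


# The scenario from the commit message.
scenario = ["sasl", "io", "sasl", "io"]
print(retry_count_with_flag(scenario))     # 1: the IOException retry at step 2 is lost
print(retry_count_with_counter(scenario))  # 2: both IOException retries are counted
```

In this model the flag version ends at 1 because step 4 resets the count, while the counter version ends at 2, one per `IOException`.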
peter-toth pushed a commit that referenced this pull request on Apr 11, 2023
…edExpression()
### What changes were proposed in this pull request?
In `EquivalentExpressions.addExpr()`, add a guard `supportedExpression()` to make it consistent with `addExprTree()` and `getExprState()`.
### Why are the changes needed?
This fixes a regression caused by apache#39010 which added the `supportedExpression()` to `addExprTree()` and `getExprState()` but not `addExpr()`.
One example of a use case affected by the inconsistency is the `PhysicalAggregation` pattern in physical planning. There, it calls `addExpr()` to deduplicate the aggregate expressions, and then calls `getExprState()` to deduplicate the result expressions. Guarding inconsistently will cause the aggregate and result expressions to go out of sync, eventually resulting in query execution error (or whole-stage codegen error).
### Does this PR introduce _any_ user-facing change?
This fixes a regression affecting Spark 3.3.2+, where it may manifest as an error running aggregate operators with higher-order functions.
Example running the SQL command:
```sql
select max(transform(array(id), x -> x)), max(transform(array(id), x -> x)) from range(2)
```
example error message before the fix:
```
java.lang.IllegalStateException: Couldn't find max(transform(array(id#0L), lambdafunction(lambda x#2L, lambda x#2L, false)))#4 in [max(transform(array(id#0L), lambdafunction(lambda x#1L, lambda x#1L, false)))#3]
```
after the fix this error is gone.
### How was this patch tested?
Added new test cases to `SubexpressionEliminationSuite` for the immediate issue, and to `DataFrameAggregateSuite` for an example of user-visible symptom.
Closes apache#40473 from rednaxelafx/spark-42851.
Authored-by: Kris Mok <kris.mok@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
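Why an inconsistent guard lets the two call sites drift apart can be seen in a toy model. This is a sketch under stated assumptions, not Spark's `EquivalentExpressions`: the class `ToyEquivalentExpressions`, its string-keyed map, and the rule that anything containing `lambda` is unsupported are all hypothetical.

```python
class ToyEquivalentExpressions:
    """Toy deduplication map; a stand-in, not Spark's implementation."""

    def __init__(self, guard_add):
        self.guard_add = guard_add  # whether add_expr applies the support check
        self.states = {}

    @staticmethod
    def supported(expr):
        # Hypothetical rule: expressions containing a lambda are not supported.
        return "lambda" not in expr

    def add_expr(self, expr):
        """Register expr; return True if an equivalent one was already present."""
        if self.guard_add and not self.supported(expr):
            return False
        seen = expr in self.states
        self.states[expr] = self.states.get(expr, 0) + 1
        return seen

    def get_expr_state(self, expr):
        """Look up expr; always guarded, so unsupported expressions return None."""
        if not self.supported(expr):
            return None
        return self.states.get(expr)


expr = "max(transform(array(id), lambda x: x))"

# Guard missing in add_expr: the add side deduplicates, the lookup side refuses.
inconsistent = ToyEquivalentExpressions(guard_add=False)
inconsistent.add_expr(expr)
dedup_on_add = inconsistent.add_expr(expr)        # True
state_on_get = inconsistent.get_expr_state(expr)  # None

# Guard present in both: neither side deduplicates, so they agree.
consistent = ToyEquivalentExpressions(guard_add=True)
consistent.add_expr(expr)
agree_on_add = consistent.add_expr(expr)          # False
agree_on_get = consistent.get_expr_state(expr)    # None
```

With the guard missing on the add side, one caller treats the second copy as a duplicate while the other cannot find any state for it, which is the kind of mismatch the commit message describes.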
peter-toth pushed a commit that referenced this pull request on Aug 22, 2023
…edExpression()
### What changes were proposed in this pull request?
In `EquivalentExpressions.addExpr()`, add a guard `supportedExpression()` to make it consistent with `addExprTree()` and `getExprState()`.
### Why are the changes needed?
This fixes a regression caused by apache#39010 which added the `supportedExpression()` to `addExprTree()` and `getExprState()` but not `addExpr()`.
One example of a use case affected by the inconsistency is the `PhysicalAggregation` pattern in physical planning. There, it calls `addExpr()` to deduplicate the aggregate expressions, and then calls `getExprState()` to deduplicate the result expressions. Guarding inconsistently will cause the aggregate and result expressions to go out of sync, eventually resulting in query execution error (or whole-stage codegen error).
### Does this PR introduce _any_ user-facing change?
This fixes a regression affecting Spark 3.3.2+, where it may manifest as an error running aggregate operators with higher-order functions.
Example running the SQL command:
```sql
select max(transform(array(id), x -> x)), max(transform(array(id), x -> x)) from range(2)
```
example error message before the fix:
```
java.lang.IllegalStateException: Couldn't find max(transform(array(id#0L), lambdafunction(lambda x#2L, lambda x#2L, false)))#4 in [max(transform(array(id#0L), lambdafunction(lambda x#1L, lambda x#1L, false)))#3]
```
after the fix this error is gone.
### How was this patch tested?
Added new test cases to `SubexpressionEliminationSuite` for the immediate issue, and to `DataFrameAggregateSuite` for an example of user-visible symptom.
Closes apache#40473 from rednaxelafx/spark-42851.
Authored-by: Kris Mok <kris.mok@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit ef0a76e)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
peter-toth pushed a commit that referenced this pull request on Aug 22, 2023
### What changes were proposed in this pull request?
This PR introduces sasl retry count in RetryingBlockTransferor.
### Why are the changes needed?
Previously a boolean variable, saslTimeoutSeen, was used. However, the boolean variable wouldn't cover the following scenario:
1. SaslTimeoutException
2. IOException
3. SaslTimeoutException
4. IOException

Even though IOException at #2 is retried (resulting in increment of retryCount), the retryCount would be cleared at step #4. Since the intention of saslTimeoutSeen is to undo the increment due to retrying SaslTimeoutException, we should keep a counter for SaslTimeoutException retries and subtract the value of this counter from retryCount.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
New test is added, courtesy of Mridul.
Closes apache#39611 from tedyu/sasl-cnt.
Authored-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Closes apache#39709 from akpatnam25/SPARK-42090-backport-3.3.
Authored-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
peter-toth pushed a commit that referenced this pull request on Apr 23, 2024
### What changes were proposed in this pull request?
In the `Window` node, both `partitionSpec` and `orderSpec` must be orderable, but the current type check only verifies `orderSpec` is orderable. This can cause an error in later optimizing phases.
Given a query:
```
with t as (select id, map(id, id) as m from range(0, 10)) select rank() over (partition by m order by id) from t
```
Before the PR, it fails with an `INTERNAL_ERROR`:
```
org.apache.spark.SparkException: [INTERNAL_ERROR] grouping/join/window partition keys cannot be map type. SQLSTATE: XX000
at org.apache.spark.SparkException$.internalError(SparkException.scala:92)
at org.apache.spark.SparkException$.internalError(SparkException.scala:96)
at org.apache.spark.sql.catalyst.optimizer.NormalizeFloatingNumbers$.needNormalize(NormalizeFloatingNumbers.scala:103)
at org.apache.spark.sql.catalyst.optimizer.NormalizeFloatingNumbers$.org$apache$spark$sql$catalyst$optimizer$NormalizeFloatingNumbers$$needNormalize(NormalizeFloatingNumbers.scala:94)
...
```
After the PR, it fails with an `EXPRESSION_TYPE_IS_NOT_ORDERABLE`, which is expected:
```
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [EXPRESSION_TYPE_IS_NOT_ORDERABLE] Column expression "m" cannot be sorted because its type "MAP<BIGINT, BIGINT>" is not orderable. SQLSTATE: 42822; line 2 pos 53;
Project [RANK() OVER (PARTITION BY m ORDER BY id ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#4]
+- Project [id#1L, m#0, RANK() OVER (PARTITION BY m ORDER BY id ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#4, RANK() OVER (PARTITION BY m ORDER BY id ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#4]
   +- Window [rank(id#1L) windowspecdefinition(m#0, id#1L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS RANK() OVER (PARTITION BY m ORDER BY id ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#4], [m#0], [id#1L ASC NULLS FIRST]
      +- Project [id#1L, m#0]
         +- SubqueryAlias t
            +- SubqueryAlias t
               +- Project [id#1L, map(id#1L, id#1L) AS m#0]
                  +- Range (0, 10, step=1, splits=None)
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:52)
...
```
### How was this patch tested?
Unit test.
Closes apache#45730 from chenhao-db/SPARK-47572.
Authored-by: Chenhao Li <chenhao.li@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
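The shape of the type check described above can be sketched as follows. This is an illustrative Python model, not the Catalyst analyzer: the `ORDERABLE` set, the `(name, type)` tuples, and both function names are assumptions made for the example.

```python
# Hypothetical subset of orderable type names; map types are deliberately absent.
ORDERABLE = {"int", "bigint", "double", "string"}


def non_orderable(columns):
    """Return one error message per column whose type is not orderable."""
    return [
        f'Column expression "{name}" cannot be sorted because its type '
        f'"{data_type}" is not orderable.'
        for name, data_type in columns
        if data_type not in ORDERABLE
    ]


def check_order_spec_only(partition_spec, order_spec):
    """Model of the old check: only the ORDER BY columns are verified."""
    return non_orderable(order_spec)


def check_window_spec(partition_spec, order_spec):
    """Model of the new check: PARTITION BY and ORDER BY columns are both verified."""
    return non_orderable(list(partition_spec) + list(order_spec))


partition_spec = [("m", "map<bigint, bigint>")]
order_spec = [("id", "bigint")]

old_errors = check_order_spec_only(partition_spec, order_spec)  # nothing reported
new_errors = check_window_spec(partition_spec, order_spec)      # reports column "m"
```

In this model the old check lets the map-typed partition key through, so the problem only surfaces in a later phase, while the new check reports it up front.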
peter-toth pushed a commit that referenced this pull request on Jun 28, 2024
… throw internal error
### What changes were proposed in this pull request?
This PR fixes the error messages and classes when Python UDFs are used in higher order functions.
### Why are the changes needed?
To show the proper user-facing exceptions with error classes.
### Does this PR introduce _any_ user-facing change?
Yes. Previously it threw an internal error such as:
```python
from pyspark.sql.functions import transform, udf, col, array
spark.range(1).select(transform(array("id"), lambda x: udf(lambda y: y)(x))).collect()
```
Before:
```
py4j.protocol.Py4JJavaError: An error occurred while calling o74.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 15 in stage 0.0 failed 1 times, most recent failure: Lost task 15.0 in stage 0.0 (TID 15) (ip-192-168-123-103.ap-northeast-2.compute.internal executor driver): org.apache.spark.SparkException: [INTERNAL_ERROR] Cannot evaluate expression: <lambda>(lambda x_0#3L)#2 SQLSTATE: XX000
at org.apache.spark.SparkException$.internalError(SparkException.scala:92)
at org.apache.spark.SparkException$.internalError(SparkException.scala:96)
```
After:
```
pyspark.errors.exceptions.captured.AnalysisException: [INVALID_LAMBDA_FUNCTION_CALL.UNEVALUABLE] Invalid lambda function call. Python UDFs should be used in a lambda function at a higher order function. However, "<lambda>(lambda x_0#3L)" was a Python UDF. SQLSTATE: 42K0D;
Project [transform(array(id#0L), lambdafunction(<lambda>(lambda x_0#3L)#2, lambda x_0#3L, false)) AS transform(array(id), lambdafunction(<lambda>(lambda x_0#3L), namedlambdavariable()))#4]
+- Range (0, 1, step=1, splits=Some(16))
```
### How was this patch tested?
A unit test was added.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#47079 from HyukjinKwon/SPARK-48706.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
peter-toth added a commit that referenced this pull request on May 3, 2026
### What changes were proposed in this pull request?
Four bug fixes and two small cleanups in `PlanMerger`:

**Bug fixes** in `PlanMerger.scala`:
1. **Tagged `(Filter, Filter)` reuse preserves `mergedChild`'s appended columns.** When the reuse check finds an existing `propagatedFilter` alias, the branch now rebuilds the Filter over `mergedChild` (via `cp.withNewChildren(Seq(mergedChild))`) instead of returning `cp` unchanged. If the recursion extended `cp.child`'s output with new columns (e.g. a computed `d = a + b` from a user Project below the Filter), returning `cp` would drop those columns while `npMapping` still pointed into them, leaving the enclosing `Aggregate` with unresolved references.
2. **`(np: Filter, cp)` create-new does not re-append `cpFilter`.** `cpFilter`, when set, was produced by a deeper `(np, cp: Filter)` (or `(Join, Join)` pass-through) and is already part of `mergedChild`'s output. Appending it a second time via `++ cpFilter.toSeq` duplicated the attribute in the outer Project's projectList.
3. **`(np, cp: Filter)` create-new does not re-append `npFilter`.** Symmetric to #2 on the np side.
4. **`(np, cp: Filter)` with a `MERGED_FILTER_TAG`-tagged `cp` drops the tagged Filter.** cp's condition is `OR(pf_0, pf_1, ...)` and cp's aggregate expressions already carry individual `FILTER (WHERE pf_i)` clauses. Synthesising a new `propagatedFilter_X = OR(pf_0, pf_1, ...)` would just add `FILTER AND(OR(...), pf_i)` wrapping upstream (simplifying to `FILTER pf_i`) plus a redundant alias in the Project. The branch now drops cp's Filter and returns `cpFilter = None` so cp's aggregates are left untouched.

**Cleanups** in `PlanMerger.merge`:
- Unify the local variable name to `newMergedPlan` across all three branches (was `newMergedPlan` in one and `newMergePlan` in the other two) -- matches the `MergedPlan` case class name.
- Replace `cache(i).merged` with `mp.merged`; `mp` and `cache(i)` are the same object inside the `collectFirst` pattern.
### Why are the changes needed?
Fix #1 is a correctness bug. Fixes #2-#4 are plan-shape bugs that produce duplicated attributes or redundant `OR`-of-propagated-filter aliases in the merged plan. The cleanups are minor readability improvements.
### Does this PR introduce _any_ user-facing change?
No. All changes are internal to the optimizer; they produce cleaner merged plans for queries that `MergeSubplans` already handled.
### How was this patch tested?
Four new tests in `MergeSubplansSuite`, one per fix:
- `(np: Filter, cp)` create-new must not re-append cpFilter into the Project -- exercises #2 via a `Join` with a `Filter` on the right child, routing a cpFilter up through `(Join, Join)` so that `mergedChild.output` already contains the attribute the branch used to re-append.
- `(np, cp: Filter)` create-new must not re-append npFilter into the Project -- exercises #3, mirror shape on the np side.
- tagged `(Filter, Filter)` reuse must keep mergedChild's appended columns -- exercises #1 with three subqueries (sq1/sq2 create the tagged structure; sq3's Filter sits above a user Project introducing `d = a + b`, so the `(Filter, Filter)` tagged recursion extends `mergedChild` with `d`).
- `(np, cp: Filter)` drops a tagged cp Filter without synthesising a redundant alias -- exercises #4 with three subqueries (sq1/sq2 create the tagged structure; sq3 has no filter).
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7
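The claim in fix #4 that `FILTER AND(OR(pf_0, pf_1, ...), pf_i)` simplifies to `FILTER pf_i` is an instance of the absorption law, and it can be checked exhaustively for small inputs. The sketch below is a plain Python truth-table check with hypothetical function names; it assumes two-valued booleans and does not model SQL `NULL` handling or any optimizer code.

```python
from itertools import product


def wrapped_filter(pfs, i):
    """Evaluate AND(OR(pf_0, pf_1, ...), pf_i) on plain booleans."""
    return any(pfs) and pfs[i]


def simplification_holds(n):
    """Check AND(OR(pf_0..pf_{n-1}), pf_i) == pf_i for every assignment and every i."""
    return all(
        wrapped_filter(pfs, i) == pfs[i]
        for pfs in product([False, True], repeat=n)
        for i in range(n)
    )


print(simplification_holds(3))  # True
```

Because `pf_i` being true already makes the `OR` true, the extra wrapping never changes which rows an aggregate sees, which is why a synthesised alias for the `OR` would be redundant.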
peter-toth added a commit that referenced this pull request on May 4, 2026
### What changes were proposed in this pull request?
Four bug fixes and two small cleanups in `PlanMerger`:

**Bug fixes** in `PlanMerger.scala`:
1. **Tagged `(Filter, Filter)` reuse preserves `mergedChild`'s appended columns.** When the reuse check finds an existing `propagatedFilter` alias, the branch now rebuilds the Filter over `mergedChild` (via `cp.withNewChildren(Seq(mergedChild))`) instead of returning `cp` unchanged. If the recursion extended `cp.child`'s output with new columns (e.g. a computed `d = a + b` from a user Project below the Filter), returning `cp` would drop those columns while `npMapping` still pointed into them, leaving the enclosing `Aggregate` with unresolved references.
2. **`(np: Filter, cp)` does not duplicate a `cpFilter` already present in `mergedChild`.** `cpFilter`, when set, was produced by a deeper `(np, cp: Filter)` (or `(Join, Join)` pass-through) and is already part of `mergedChild`'s output. Appending it a second time via `++ cpFilter.toSeq` duplicated the attribute in the outer Project's projectList.
3. **`(np, cp: Filter)` does not duplicate an `npFilter` already present in `mergedChild`.** Symmetric to #2 on the np side.
4. **`(np, cp: Filter)` drops a `MERGED_FILTER_TAG`-tagged `cp` Filter without synthesising a redundant alias.** cp's condition is `OR(pf_0, pf_1, ...)` and cp's aggregate expressions already carry individual `FILTER (WHERE pf_i)` clauses. Synthesising a new `propagatedFilter_X = OR(pf_0, pf_1, ...)` would just add `FILTER AND(OR(...), pf_i)` wrapping upstream (simplifying to `FILTER pf_i`) plus a redundant alias in the Project. The branch now drops cp's Filter and returns `cpFilter = None` so cp's aggregates are left untouched.

**Cleanups** in `PlanMerger.merge`:
- Unify the local variable name to `newMergedPlan` across all three branches (was `newMergedPlan` in one and `newMergePlan` in the other two) -- matches the `MergedPlan` case class name.
- Replace `cache(i).merged` with `mp.merged`; `mp` and `cache(i)` are the same object inside the `collectFirst` pattern.
### Why are the changes needed?
Fix #1 is a correctness bug. Fixes #2-#4 are plan-shape bugs that produce duplicated attributes or redundant `OR`-of-propagated-filter aliases in the merged plan. The cleanups are minor readability improvements.
### Does this PR introduce _any_ user-facing change?
No. All changes are internal to the optimizer; they produce cleaner merged plans for queries that `MergeSubplans` already handled.
### How was this patch tested?
Four new tests in `MergeSubplansSuite`, one per fix:
- `(np: Filter, cp)` does not duplicate a cpFilter already present in mergedChild -- exercises #2 via a `Join` with a `Filter` on the right child, routing a cpFilter up through `(Join, Join)` so that `mergedChild.output` already contains the attribute the branch used to re-append.
- `(np, cp: Filter)` does not duplicate an npFilter already present in mergedChild -- exercises #3, mirror shape on the np side.
- tagged `(Filter, Filter)` reuse must keep mergedChild's appended columns -- exercises #1 with three subqueries (sq1/sq2 create the tagged structure; sq3's Filter sits above a user Project introducing `d = a + b`, so the `(Filter, Filter)` tagged recursion extends `mergedChild` with `d`).
- `(np, cp: Filter)` drops a tagged cp Filter without synthesising a redundant alias -- exercises #4 with three subqueries (sq1/sq2 create the tagged structure; sq3 has no filter).
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7
peter-toth pushed a commit that referenced this pull request on May 14, 2026
### What changes were proposed in this pull request?
Address the open follow-ups from [SPARK-56681](https://issues.apache.org/jira/browse/SPARK-56681) (umbrella for PATH / SPARK-56605 cleanup) in a single cleanup PR. Items #1 and #2 were already wired by SPARK-56639; this PR covers the remainder.

| # | Item | Resolution |
|---|---|---|
| #1 | `FunctionResolution.resolveProcedure` was dead code | Already wired by SPARK-56639 (no action). |
| #2 | Frozen view / SQL-function PATH wiring unfinished | Already done by SPARK-56639 (no action). |
| #3 | `AnalysisContext.resolutionPathEntries` threadlocal | Audit only: confirmed `withNewAnalysisContext` / `reset()` correctly clear it. Full removal needs a coordinated refactor to plumb the path through `RelationResolution` / `FunctionResolution` method calls; flagged as a follow-up. |
| #4 | `Analyzer.executeAndCheck` clobbers outer `SQLConf.withExistingConf` | Extracted `runWithSessionConf` helper, added `SQLConf.getExistingConfIfSet`. `executeAndCheck` and `executeSameContext` now share one path that yields to any outer scope. |
| #5 | `VariableResolution.allowUnqualifiedSessionTempVariableLookup` force-loads default catalog | Replaced the hot-path catalog read with `CatalogManager.isSystemSessionOnPath`, which inspects stored session-path entries directly. No catalog load on column resolution. |
| #6 | `DROP VARIABLE` PATH gate asymmetric with `DECLARE` / `CREATE` | Removed the gate. DDL on session variables (`DECLARE` / `CREATE` / `DROP`) always targets `system.session` directly; only DML (`SET VAR`, `SELECT x`) goes through PATH. |
| #7 | `lookupFunctionType` exception swallow too broad | Narrowed from `NonFatal` to the explicit not-found list (`NoSuchFunctionException`, `NoSuchNamespaceException`, `CatalogNotFoundException`, `FORBIDDEN_OPERATION`). Other exceptions propagate. |
| #8 | `lookupFunctionType` fan-out had wasteful `system.*` candidates | Filtered them out; `system.session`, `system.builtin`, `system.ai` are already resolved earlier in the same method. |
| #9 | Three near-duplicate path-resolution helpers | Lifted into `CatalogManager.resolutionPathEntriesForAnalysis(pinnedEntries, viewCatalogAndNamespace)`. Relation, routine, and procedure resolution all route through it. |
| #10 | Tests for the new error paths and gates | Added a DECLARE / SET VAR / DROP cycle test under non-default PATH and a struct-variable field-vs-qualified ambiguity test in `sql-session-variables.sql`. |
| #11 | `ProtoToParsedPlanTestSuite.analyzerIsolationConf` was a bare `SQLConf` | Clone `spark.sessionState.conf` and only override `PATH_ENABLED=false`, so all `sparkConf` overrides (ANSI, alias config, ...) propagate automatically. |
| Bonus | `ResolveSetVariable` hardcoded `SYSTEM.SESSION` regardless of actual PATH | `unresolvedVariableError` now takes `Seq[Seq[String]]` path entries with **required** `Origin` (no overloads). DML lookup failures (`SET VAR`, `FETCH ... INTO`) report the full SQL path as a bracketed list, byte-for-byte consistent with `UNRESOLVED_ROUTINE` and `TABLE_OR_VIEW_NOT_FOUND`. DDL name validation in `ResolveCatalogs` continues to report `[system.session]` since PATH does not apply there. Origin is plumbed through `VariableManager.set` so all error sites carry a `queryContext` pointing at the offending variable identifier (parser opt-ins via `withOrigin(identifierReference)` so the highlight is the variable name, not the whole statement). |

### Why are the changes needed?
These are the cleanup items called out on SPARK-56681 from the post-merge source review of SPARK-56605. They eliminate dead code paths, plug user-visible bugs (force-loading a misconfigured default catalog on column resolution; clobbering pinned session configs; swallowing real catalog errors as `UNRESOLVED_ROUTINE`), remove the asymmetry between DDL and DML on session variables, and make `UNRESOLVED_VARIABLE` self-consistent with the other "not found" errors.
### Does this PR introduce _any_ user-facing change?
Yes.
- **`UNRESOLVED_VARIABLE.searchPath`** is now rendered as a bracketed list. For DML lookups (`SET VAR`, `FETCH ... INTO`), the list reflects the actual SQL PATH that was consulted instead of a hardcoded `SYSTEM.SESSION`. For DDL name validation (`DECLARE` / `DROP` with a non-session namespace), the list is ``[`system`.`session`]`` since PATH does not apply.
- **`UNRESOLVED_VARIABLE`** now always carries a `queryContext` that highlights just the offending variable identifier (e.g. `"builtin.var1"`, `"ses.var1"`), not the whole `DECLARE` / `SET VAR` statement.
- **`DROP TEMPORARY VARIABLE`** no longer raises `UNRESOLVED_VARIABLE` when the SQL PATH does not contain `system.session`. DDL on session variables ignores PATH, matching the existing behaviour of `DECLARE OR REPLACE VARIABLE`.
- **`lookupFunctionType`** no longer swallows non-`NotFound` errors. A catalog reporting `PERMISSION_DENIED` (or similar) for a function lookup now propagates instead of silently producing `UNRESOLVED_ROUTINE`.
### How was this patch tested?
- Added `sql-session-variables.sql` regression test for the struct-variable field-vs-qualified ambiguity (`DECLARE VARIABLE session STRUCT<a INT>` → `SELECT session.a` succeeds → `DROP` → `SELECT session.a` falls through to `UNRESOLVED_COLUMN`).
- Updated `SetPathSuite`: DECLARE / SET VAR / DROP cycle under a non-default PATH; bonus test asserts the actual rendered search path and the variable-identifier `queryContext`.
- Updated `SqlScriptingExecutionSuite` for the new bracketed `searchPath` and identifier-pinned `queryContext`.
- Regenerated `sql-session-variables.sql.out` for the new error shape.
- Added `resolutionPathEntriesForAnalysis` stubs to mocked `CatalogManager` instances in `PlanResolutionSuite`, `AlignAssignmentsSuiteBase`, and `TableLookupCacheSuite`.
- Ran focused suites locally; all pass:
  - `build/sbt 'sql/testOnly *SetPathSuite *SqlScriptingExecutionSuite *ExecuteImmediateEndToEndSuite'`
  - `build/sbt 'sql/testOnly *SimpleSQLViewSuite *SQLFunctionSuite'`
  - `build/sbt 'sql/testOnly *PlanResolutionSuite *UpdateTableAlignAssignmentsSuite *MergeIntoTableAlignAssignmentsSuite'`
  - `build/sbt 'catalyst/testOnly *TableLookupCacheSuite *AnalysisSuite *AnalysisErrorSuite *LookupFunctionsSuite'`
  - `build/sbt 'sql/testOnly *FunctionQualificationSuite *RelationQualificationSuite *DataSourceV2FunctionSuite'`
  - `build/sbt 'sql/testOnly *SQLQuerySuite'`
  - `build/sbt 'connect/testOnly *ProtoToParsedPlanTestSuite'`
  - `build/sbt 'sql/testOnly *SQLQueryTestSuite -- -z sql-session-variables.sql'`
  - Full `org.apache.spark.sql.catalyst.analysis.*`, `org.apache.spark.sql.catalyst.parser.*`, and `org.apache.spark.sql.analysis.resolver.*` suites.
- `scalastyle` and `scalafmt` clean across catalyst, sql, and connect modules.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor Claude Opus 4.7
Closes apache#55647 from srielau/SPARK-56681-patch-clean-up.
Authored-by: Serge Rielau <serge@rielau.com>
Signed-off-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
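The narrowing of the exception handling in `lookupFunctionType` follows a general pattern: catch only the errors that mean "not found here, keep looking" and let everything else propagate. The sketch below illustrates that pattern in Python under stated assumptions; the exception classes, the `lookup_function_type` signature, and the candidate callables are all hypothetical stand-ins, not Spark APIs.

```python
class NoSuchFunctionError(Exception):
    """Hypothetical stand-in for a not-found error."""


class NoSuchNamespaceError(Exception):
    """Hypothetical stand-in for a not-found error."""


class CatalogNotFoundError(Exception):
    """Hypothetical stand-in for a not-found error."""


class PermissionDeniedError(Exception):
    """Hypothetical stand-in for a real catalog failure."""


# Explicit list of errors that only mean "this candidate does not have it".
NOT_FOUND = (NoSuchFunctionError, NoSuchNamespaceError, CatalogNotFoundError)


def lookup_function_type(name, candidates):
    """Try each candidate lookup in order; swallow only not-found errors."""
    for lookup in candidates:
        try:
            return lookup(name)
        except NOT_FOUND:
            continue  # try the next path entry
    return None  # a caller would report an unresolved routine here


def missing(name):
    raise NoSuchFunctionError(name)


def denied(name):
    raise PermissionDeniedError(name)


def found(name):
    return "scalar"


resolved = lookup_function_type("f", [missing, found])  # "scalar"
unresolved = lookup_function_type("f", [missing])        # None
```

Calling `lookup_function_type("f", [denied])` raises `PermissionDeniedError` instead of returning `None`, which mirrors the user-facing change described above: a real catalog error is no longer reported as an unresolved routine.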
No description provided.