Skip to content

[SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format#48431

Closed
MaxGekk wants to merge 8 commits into
apache:masterfrom
MaxGekk:json-error-msg-condition
Closed

[SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format#48431
MaxGekk wants to merge 8 commits into
apache:masterfrom
MaxGekk:json-error-msg-condition

Conversation

@MaxGekk

@MaxGekk MaxGekk commented Oct 12, 2024

Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

In the PR, I propose to rename the errorClass to condition in errors in the JSON formats: MINIMAL and STANDARD.

For example:

{
  "condition" : "DIVIDE_BY_ZERO",
  "sqlState" : "22012",
  "messageParameters" : { "config" : "CONFIG"}
}

The name condition was taken because it is used the SQL standard:
Screenshot 2024-10-09 at 19 38 51
and no need extra words as a suffix or prefix in the context of error message format.

Why are the changes needed?

To follow new naming convention introduced by #44902.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

By running the affected tests:

$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"

Was this patch authored or co-authored using generative AI tooling?

No.

@MaxGekk MaxGekk changed the title [WIP] Rename errorClass to condition in errors of the JSON format [SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format Oct 12, 2024
@MaxGekk MaxGekk marked this pull request as ready for review October 12, 2024 19:19
@MaxGekk MaxGekk requested a review from cloud-fan October 12, 2024 19:19
@MaxGekk

MaxGekk commented Oct 12, 2024

Copy link
Copy Markdown
Member Author

@srielau @panbingkun @nchammas @cloud-fan Could you review the PR, please.

@MaxGekk MaxGekk changed the title [SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format [SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format Oct 12, 2024
@panbingkun

Copy link
Copy Markdown
Contributor

LGTM.
nit: Do we need to update the following?
image

@MaxGekk

MaxGekk commented Oct 14, 2024

Copy link
Copy Markdown
Member Author

@panbingkun Thank you for review.

  1. UIUtils: The name errorClass is regexp group name. Not related to the changes.
  2. SQLJsonProtocolSuite: it is just an example in a test. It could handle old input.

@panbingkun

panbingkun commented Oct 15, 2024

Copy link
Copy Markdown
Contributor
  1. UIUtils

Yeah!

  • Yes, it is indeed the group name of a regular expression. Not related to the changes, and calling it as errorClass will not affect the final functionality.
    (PS: As a follow-up, I'm not sure if we need to rename it to condiiton to reduce misunderstandings? Because from UT, it seems that its purpose is to obtain the value of condiiton (In the past, it was called errorClass)
    val e1 = "Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1) (10.221.98.22 executor driver): org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set \"spark.sql.ansi.enabled\" to \"false\" to bypass this error.\n== SQL (line 1, position 8) ==\nselect a/b from src\n ^^^\n\n\tat org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:226)\n\tat org.apache.spark.sql.errors.QueryExecutionErrors.divideByZeroError(QueryExecutionErrors.scala)\n\tat org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(generated.java:54)\n\tat org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)\n\tat org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)\n\tat org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)\n\tat org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)\n\tat org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)\n\tat org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)\n\tat org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)\n\tat org.apache.spark.rdd.RDD.iterator(RDD.scala:328)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)\n\tat org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:141)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:592)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1474)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:595)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:750)\n\nDriver stacktrace:"
    val cell1 = UIUtils.errorMessageCell(e1)
    assert(cell1 === <td>{"DIVIDE_BY_ZERO"}{UIUtils.detailsUINode(isMultiline = true, e1)}</td>)

    (Because after this PR, I believe that in the error log, there should only be condiiton and no errorClass.)
  • Okay.

@nchammas nchammas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but is there a summary of why we are going with condition rather than errorCondition?

I remember seeing a discussion somewhere about this, but it's not on the ticket associated with this PR.

@MaxGekk

MaxGekk commented Oct 17, 2024

Copy link
Copy Markdown
Member Author

@nchammas Thank you for your comment. I have updated PR's description and added reasons for the name.

val g = generator.useDefaultPrettyPrinter()
g.writeStartObject()
g.writeStringField("errorClass", "LEGACY")
g.writeStringField("condition", "LEGACY")

@beliefer beliefer Feb 22, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about errorCondition? The name condition makes confusion for developers who not read the document common/utils/src/main/resources/error/README.md.

@github-actions

github-actions Bot commented Jun 3, 2025

Copy link
Copy Markdown

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions Bot added the Stale label Jun 3, 2025
@github-actions github-actions Bot closed this Jun 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants