[SPARK-42052][SQL] Codegen Support for HiveSimpleUDF #40397
cc @cloud-fan
| def evaluate(): Any | ||
|
|
||
| final def doGenCode(ctx: CodegenContext, ev: ExprCode, dataType: DataType): ExprCode = { |
It's weird to implement codegen in the evaluator. If we really want to deduplicate the code, let's add HiveUDFExpressionBase later.
OK, let's keep some redundant logic for now.
| } | ||
| } | ||
|
|
||
| abstract class HiveUDFEvaluatorBase[UDFType <: AnyRef]( |
Can we move the evaluators to a separate file?
| lazy val function = funcWrapper.createFunction[UDF]() | ||
| private val isUDFDeterministic = { | ||
| val udfType = evaluator.function.getClass.getAnnotation(classOf[HiveUDFType]) | ||
| udfType != null && udfType.deterministic() && !udfType.stateful() |
This code seems to be the same as in the generic UDF. Maybe we can move it to HiveUDFEvaluatorBase.
| case (child, idx) => | ||
| evaluator.setArg(idx, child.eval(input)) |
| case (child, idx) => | |
| evaluator.setArg(idx, child.eval(input)) | |
| case (child, idx) => evaluator.setArg(idx, child.eval(input)) |
| | $resultTerm = ($resultType) $refEvaluator.evaluate(); | ||
| | ${ev.isNull} = $resultTerm == null; | ||
| |} catch (Throwable e) { | ||
| | throw QueryExecutionErrors.failedExecuteUserDefinedFunctionError( |
Shall we move the try-catch to evaluator.evaluate()?
BTW this seems like an unrelated change. The previous code does not rethrow the exception.
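A minimal sketch of what moving the try-catch into `evaluate()` could look like, so that interpreted and generated code share one error path. All names here (`EvaluatorBase`, `doEvaluate`, `UdfFailure`) are hypothetical stand-ins, not the classes in this PR, and `UdfFailure` only imitates the role of `QueryExecutionErrors.failedExecuteUserDefinedFunctionError`. The sketch also catches `NonFatal` rather than `Throwable`, which is a design choice of the sketch, not of the diff.

```scala
import scala.util.control.NonFatal

object EvaluateSketch {
  // Hypothetical error type; stands in for Spark's UDF failure error.
  final class UdfFailure(name: String, cause: Throwable)
    extends RuntimeException(s"Failed to execute user defined function ($name)", cause)

  // Hypothetical base class: subclasses implement doEvaluate(),
  // callers (interpreted or generated code) only call evaluate().
  abstract class EvaluatorBase(val name: String) {
    protected def doEvaluate(): Any

    // Single place where UDF failures are wrapped.
    final def evaluate(): Any =
      try doEvaluate()
      catch { case NonFatal(e) => throw new UdfFailure(name, e) }
  }

  class OkEvaluator extends EvaluatorBase("ok") {
    protected def doEvaluate(): Any = 42
  }

  class FailingEvaluator extends EvaluatorBase("boom") {
    protected def doEvaluate(): Any = throw new IllegalStateException("bad input")
  }

  def main(args: Array[String]): Unit = {
    println(new OkEvaluator().evaluate())
    try new FailingEvaluator().evaluate()
    catch { case e: UdfFailure => println(e.getMessage) }
  }
}
```

With this shape the generated Java would only need to call `evaluate()` and assign the result, without its own try-catch block.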
| udfType != null && udfType.deterministic() && !udfType.stateful() | ||
| } | ||
|
|
||
| def returnType: DataType |
We have to add this method here because it will be used in exception handling.
| lazy val function = funcWrapper.createFunction[UDFType]() | ||
|
|
||
| @transient | ||
| val isUDFDeterministic = { |
It should be a lazy val, as it accesses a lazy val.
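To illustrate the point, here is a small sketch of how a strict `val` that reads a `lazy val` forces it during construction. The classes (`Wrapper`, `StrictEvaluator`, `LazyEvaluator`) and the `created` counter are hypothetical and only mimic the shape of the code under review.

```scala
object LazyValSketch {
  // Counts how many times the (hypothetical) function object is created.
  var created = 0

  class Wrapper {
    def createFunction(): String = { created += 1; "udf" }
  }

  class StrictEvaluator(w: Wrapper) {
    lazy val function: String = w.createFunction()
    // A strict val is computed in the constructor, so it forces `function` immediately.
    val isUDFDeterministic: Boolean = function.nonEmpty
  }

  class LazyEvaluator(w: Wrapper) {
    lazy val function: String = w.createFunction()
    // Deferred until first access, so constructing the evaluator creates nothing.
    lazy val isUDFDeterministic: Boolean = function.nonEmpty
  }

  def main(args: Array[String]): Unit = {
    new StrictEvaluator(new Wrapper)
    println(created) // 1: the function was created during construction

    val e = new LazyEvaluator(new Wrapper)
    println(created) // still 1: nothing was forced

    e.isUDFDeterministic
    println(created) // 2: created on first access
  }
}
```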
@cloud-fan Can we merge this to master? After that, I will try to refactor HiveGenericUDTF & HiveUDAFFunction. Thanks!

Thanks, merging to master!
…pleUDF (#1288)

### What changes were proposed in this pull request?
- As a subtask of [SPARK-42050](https://issues.apache.org/jira/browse/SPARK-42050), this PR adds Codegen Support for HiveSimpleUDF
- Extract a `HiveUDFEvaluatorBase` class for the common behaviors of HiveSimpleUDFEvaluator & HiveGenericUDFEvaluator.

### Why are the changes needed?
- Improve codegen coverage and performance.
- Following #39949. Make the code more concise.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Add new UT.
Pass GA.

Closes #40397 from panbingkun/refactor_HiveSimpleUDF.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>