[SPARK-21603][SQL]The wholestage codegen will be much slower then that is closed when the function is too long#18810
[SPARK-21603][SQL]The wholestage codegen will be much slower then that is closed when the function is too long#18810eatoncys wants to merge 20 commits into
Conversation
| */ | ||
| def existTooLongFunction(): Boolean = { | ||
| classFunctions.exists { case (className, functions) => | ||
| functions.exists{ case (name, code) => |
There was a problem hiding this comment.
Could you elaborate why only the number of characters for a function is a decision factor to enable or disable the whole-stage codegen?
There was a problem hiding this comment.
@kiszk Because when the JVM parameter -XX:+DontCompileHugeMethods is true, it can not get the JIT optimization when the byte code of a function is longer than 8000, I just estimate a function lines by 8000 byte code, maybe there are some other good ways.
There was a problem hiding this comment.
Got it. Thank you for your explanation. It would be good to add comment for this reasoning.
We have seen the similar control at here. Can we unify these control mechanisms into one?
There was a problem hiding this comment.
Does this line counts include code comments in the functions? Do we need to strip comments?
| .doc("The maximum number of function length that will be supported before" + | ||
| " deactivating whole-stage codegen.") | ||
| .intConf | ||
| .createWithDefault(1500) |
There was a problem hiding this comment.
Would it be possible to explain why 1500 is the good value as default?
|
|
||
| benchmark.addCase(s"codegen = T") { iter => | ||
| sparkSession.conf.set("spark.sql.codegen.wholeStage", "true") | ||
| sparkSession.conf.set("spark.sql.codegen.MaxFunctionLength", "10000") |
There was a problem hiding this comment.
Why doesn't this benchmark use the default number 1500?
There was a problem hiding this comment.
Ok, I have added a test use the default number 1500, thanks.
|
ok to test |
|
Test build #80318 has finished for PR 18810 at commit
|
| val existLongFunction = ctx.existTooLongFunction | ||
| if (existLongFunction) { | ||
| logWarning(s"Function is too long, Whole-stage codegen disabled for this plan:\n " | ||
| + s"$treeString") |
There was a problem hiding this comment.
This could be very big. Please follow what did in #18658
There was a problem hiding this comment.
@gatorsmile , thank you for review, the treeString not contains the code, it only contains the tree string of the Physical plan like below:
*HashAggregate(keys=[k1#2395L, k2#2396, k3#2397], functions=[partial_sum(id#2392L)...
+- *Project [id#2392L, (id#2392L & 1023) AS k1#2395L, cast((id#2392L & 1023) as double) AS k2#2396...
+- *Range (0, 655360, step=1, splits=1)
So, I think it will not be very big.
|
|
||
| /** | ||
| * Returns the length of codegen function is too long or not | ||
| */ |
There was a problem hiding this comment.
Adding a checking logics here, instead of returning Boolean?
There was a problem hiding this comment.
Ok, I have modified it, thanks.
|
Test build #80325 has finished for PR 18810 at commit
|
| */ | ||
| private val placeHolderToComments = new mutable.HashMap[String, String] | ||
|
|
||
| /** |
There was a problem hiding this comment.
nit: Returns if the length ....
nit: Please remove extra space before is too long.
There was a problem hiding this comment.
Ok, I have modified it, thanks.
| val WHOLESTAGE_MAX_FUNCTION_LEN = buildConf("spark.sql.codegen.MaxFunctionLength") | ||
| .internal() | ||
| .doc("The maximum number of function length that will be supported before" + | ||
| " deactivating whole-stage codegen.") |
| val (ctx, cleanedSource) = doCodeGen() | ||
| val existLongFunction = ctx.existTooLongFunction | ||
| if (existLongFunction) { | ||
| logWarning(s"Function is too long, Whole-stage codegen disabled for this plan:\n " |
There was a problem hiding this comment.
Simply explain why Whole-stage codegen is disabled. Found too long generated codes and JIT optimization might not work. Whole-stage codegen disabled for this plan...
There was a problem hiding this comment.
Btw, we can also add log message like You can change the config spark.sql.codegen.MaxFunctionLength to adjust the function length limit.
In case users wants to run whole-stage codegen intentionally.
|
Test build #80326 has finished for PR 18810 at commit
|
|
Test build #80329 has finished for PR 18810 at commit
|
|
cc @gatorsmile |
| "disable logging or -1 to apply no limit.") | ||
| .createWithDefault(1000) | ||
|
|
||
| val WHOLESTAGE_MAX_FUNCTION_LEN = buildConf("spark.sql.codegen.MaxFunctionLength") |
There was a problem hiding this comment.
MaxFunctionLength -> maxLinesPerFunction
There was a problem hiding this comment.
Ok, I have modified it, thanks
| */ | ||
| private val placeHolderToComments = new mutable.HashMap[String, String] | ||
|
|
||
| /** |
There was a problem hiding this comment.
length is misleading. Here, it is just number of lines.
There was a problem hiding this comment.
Ok, I have modified it, thanks
| */ | ||
| def existTooLongFunction(): Boolean = { | ||
| classFunctions.exists { case (className, functions) => | ||
| functions.exists{ case (name, code) => |
There was a problem hiding this comment.
Like what @viirya said, could you need to check whether it excludes the comments?
There was a problem hiding this comment.
Ok, I have modified it to count lines without comments and extra new lines
|
Test build #80470 has finished for PR 18810 at commit
|
| logWarning(s"Found too long generated codes and JIT optimization might not work, " + | ||
| s"Whole-stage codegen disabled for this plan, " + | ||
| s"You can change the config spark.sql.codegen.MaxFunctionLength " + | ||
| s"to adjust the function length limit:\n " |
|
Thanks! Merging to master. |
|
(copied from jira for just-in-case) Just for your information, I checked the performance changes of TPCDS before/after the pr #18810; the pr affected Q17/Q66 only (that is, they have too long codegen'd functions). The changes are as follows (just run TPCDSQueryBenchmark); It seems their queries have gen'd funcs with 2800~2900 lines, so if we set 2900 at spark.sql.codegen.maxLinesPerFunction, we could keep the previous performance w/o pr18810. |
|
yea, I'll do |
|
Btw, as for merged prs, I'm just monitoring TPCDS perf. in here. Also, I wrote a script before to run TPCDS on pending prs: https://github.com/maropu/spark-tpcds-datagen#helper-scripts-for-benchmarks. |
|
Thank you for tracking it! Could you adjust the conf to a higher number (e.g. 4097) and rerun the perf? |
|
ok, I'll make a pr as follow-up. |
What changes were proposed in this pull request?
Close the whole stage codegen when the function lines is longer than the maxlines which will be setted by
spark.sql.codegen.MaxFunctionLength parameter, because when the function is too long , it will not get the JIT optimizing.
A benchmark test result is 10x slower when the generated function is too long :
ignore("max function length of wholestagecodegen") {
val N = 20 << 15
}
How was this patch tested?
Run the unit test