Skip to content

[VL] Rework co-fallback mechanism of bloom-filter might_contain/agg#5435

Merged
zhztheplayer merged 12 commits into
apache:mainfrom
zhztheplayer:wip-bloomfilter-fix
Apr 19, 2024
Merged

[VL] Rework co-fallback mechanism of bloom-filter might_contain/agg#5435
zhztheplayer merged 12 commits into
apache:mainfrom
zhztheplayer:wip-bloomfilter-fix

Conversation

@zhztheplayer

@zhztheplayer zhztheplayer commented Apr 17, 2024

Copy link
Copy Markdown
Member

Velox's bloom-filter agg/filter functions are logically different with Spark's version. This makes their resident Gluten/Spark Filter/Aggregate operators logically different with Spark's version. For such logical differences, we should use different functions to distinguish between implementations rather than reusing Spark's function type in Velox backend.

Patch incorporates:

  1. Fix [VL] Add a bad test case when bloom_filter_agg is fallen back while might_contain is not #5433.
  2. Add VeloxBloomFilterMightContain / VeloxBloomFilterAggregate.
  3. Add VeloxBloomFilter inheriting Spark's BloomFilter to make sure fallen-backvelox_might_contain expression work correctly. Should support fallen-back velox_bloom_filter_agg in future.
  4. Remove rule FallbackBloomFilterAggIfNeeded ([VL] Make bloom_filter_agg fall back when might_contain is not transformable #3994).
  5. Add logical rule BloomFilterMightContainJointRewriteRule in Velox backend module only, to convert BloomFilterMightContain / BloomFilterAggregate to VeloxBloomFilterMightContain / VeloxBloomFilterAggregate when Gluten is enabled.

The patch makes the relevant code safer than before, since we'll have explicit function pair of BloomFilterMightContain / BloomFilterAggregate and VeloxBloomFilterMightContain / VeloxBloomFilterAggregate. Thus we can easierly detect a mismatch between agg/filter function by checking their function types, rather than failing the query execution at runtime.

@apache apache deleted a comment from github-actions Bot Apr 17, 2024
@apache apache deleted a comment from github-actions Bot Apr 17, 2024
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

@zhztheplayer zhztheplayer marked this pull request as ready for review April 18, 2024 03:32
@apache apache deleted a comment from github-actions Bot Apr 18, 2024
@zhztheplayer zhztheplayer marked this pull request as draft April 18, 2024 05:35
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

1 similar comment
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

@zhztheplayer

zhztheplayer commented Apr 19, 2024

Copy link
Copy Markdown
Member Author

Example Velox might_contain code with vanilla code-gen

public Object generate(Object[] references) {
return new GeneratedIteratorForCodegenStage1(references);
}

/*wsc_codegenStageId*/
final class GeneratedIteratorForCodegenStage1 extends org.apache.spark.sql.execution.BufferedRowIterator {
private Object[] references;
private scala.collection.Iterator[] inputs;
private scala.collection.Iterator localtablescan_input_0;
private org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[] filter_mutableStateArray_1 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter[1];
private org.apache.spark.util.sketch.VeloxBloomFilter[] filter_mutableStateArray_0 = new org.apache.spark.util.sketch.VeloxBloomFilter[1];

public GeneratedIteratorForCodegenStage1(Object[] references) {
this.references = references;
}

public void init(int index, scala.collection.Iterator[] inputs) {
partitionIndex = index;
this.inputs = inputs;
localtablescan_input_0 = inputs[0];

filter_mutableStateArray_1[0] = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(1, 0);

filter_mutableStateArray_0[0] = org.apache.spark.util.sketch.VeloxBloomFilter.readFrom(((byte[]) references[2] /* bloomFilterData */));
}

protected void processNext() throws java.io.IOException {
while ( localtablescan_input_0.hasNext()) {
InternalRow localtablescan_row_0 = (InternalRow) localtablescan_input_0.next();
((org.apache.spark.sql.execution.metric.SQLMetric) references[0] /* numOutputRows */).add(1);
do {
long localtablescan_value_0 = localtablescan_row_0.getLong(0);

boolean filter_isNull_0 = false;
boolean filter_value_0 = false;
if (!filter_isNull_0) {
filter_value_0 = filter_mutableStateArray_0[0].mightContainLong((Long)localtablescan_value_0);
}
if (filter_isNull_0 || !filter_value_0) continue;

((org.apache.spark.sql.execution.metric.SQLMetric) references[1] /* numOutputRows */).add(1);

filter_mutableStateArray_1[0].reset();

filter_mutableStateArray_1[0].write(0, localtablescan_value_0);
append((filter_mutableStateArray_1[0].getRow()));

} while(false);
if (shouldStop()) return;
}
}

}

bloom filter

bloom filter

bloom filter

bloom filter

bloom filter

bloom filter

bloom filter

bloom filter

bloom filter
fixup

fixup

fixup
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

1 similar comment
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

@zhztheplayer zhztheplayer marked this pull request as ready for review April 19, 2024 03:02
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

Comment on lines +55 to +58
override def onExecutorStart(conf: SparkConf): Unit = {
UDFResolver.resolveUdfConf(conf, isDriver = false)
initialize(conf)
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved UDF resolution to this file

@marin-ma

@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

1 similar comment
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI

@zhztheplayer

Copy link
Copy Markdown
Member Author

/Benchmark Velox

collectWithSubqueries(df.queryExecution.executedPlan) {
case h if h.isInstanceOf[HashAggregateExecBaseTransformer] => h
}.size == 0,
}.size == 2,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you document this change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add documents. Will change the test name to Test bloom_filter_agg offloaded with filter fallen back to clarify in next patch

@marin-ma marin-ma left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@zhztheplayer zhztheplayer merged commit 884ef64 into apache:main Apr 19, 2024
@GlutenPerfBot

Copy link
Copy Markdown
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_5435_time.csv log/native_master_04_17_2024_9b3f59a1c_time.csv difference percentage
q1 34.77 36.61 1.835 105.28%
q2 26.90 24.12 -2.786 89.65%
q3 37.16 37.01 -0.143 99.61%
q4 42.08 38.03 -4.054 90.37%
q5 70.58 70.92 0.339 100.48%
q6 5.85 5.81 -0.037 99.37%
q7 85.22 86.12 0.897 101.05%
q8 85.93 82.86 -3.073 96.42%
q9 122.34 123.45 1.108 100.91%
q10 44.65 45.70 1.048 102.35%
q11 19.77 20.35 0.582 102.94%
q12 27.40 28.15 0.748 102.73%
q13 54.95 54.46 -0.491 99.11%
q14 17.57 17.88 0.314 101.79%
q15 31.45 30.69 -0.757 97.59%
q16 12.92 14.00 1.081 108.37%
q17 103.21 101.27 -1.932 98.13%
q18 144.87 143.83 -1.044 99.28%
q19 14.75 13.61 -1.138 92.28%
q20 29.10 27.86 -1.241 95.73%
q21 295.38 285.86 -9.518 96.78%
q22 15.06 14.42 -0.641 95.75%
total 1321.91 1303.01 -18.904 98.57%

@GlutenPerfBot

Copy link
Copy Markdown
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_5435_time.csv log/native_master_04_17_2024_9b3f59a1c_time.csv difference percentage
q1 34.73 36.61 1.875 105.40%
q2 24.08 24.12 0.038 100.16%
q3 39.19 37.01 -2.173 94.46%
q4 39.96 38.03 -1.928 95.17%
q5 69.15 70.92 1.774 102.57%
q6 7.31 5.81 -1.493 79.56%
q7 83.39 86.12 2.731 103.28%
q8 85.84 82.86 -2.979 96.53%
q9 119.28 123.45 4.172 103.50%
q10 46.25 45.70 -0.555 98.80%
q11 19.69 20.35 0.656 103.33%
q12 26.31 28.15 1.837 106.98%
q13 55.44 54.46 -0.976 98.24%
q14 17.76 17.88 0.121 100.68%
q15 29.44 30.69 1.253 104.26%
q16 14.52 14.00 -0.517 96.44%
q17 100.38 101.27 0.894 100.89%
q18 141.23 143.83 2.596 101.84%
q19 15.55 13.61 -1.931 87.58%
q20 27.19 27.86 0.676 102.49%
q21 284.93 285.86 0.929 100.33%
q22 14.40 14.42 0.018 100.12%
total 1295.99 1303.01 7.017 100.54%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants