[multistage] add lookup join support to physical optimizer#18158
[multistage] add lookup join support to physical optimizer#18158dang-stripe wants to merge 3 commits intoapache:masterfrom
Conversation
| return rootNode; | ||
| } | ||
| return transform(rootNode, null); | ||
| } |
There was a problem hiding this comment.
i've skipped lite mode since it adds quite a bit of scope. the broker gets assigned as worker for the join fragment so we'd need reassign it to a server and handle routing.
There was a problem hiding this comment.
We have to approach lookup joins a bit differently to support it in a generic way that's compatible with the other optimizations. But for now this looks good to me.
|
thanks @dang-stripe. will take a look at this over the weekend. I am curious: have you seen any workloads where physical optimizer performs better than the other optimizer? |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #18158 +/- ##
============================================
- Coverage 63.96% 63.37% -0.59%
- Complexity 1594 1627 +33
============================================
Files 3178 3230 +52
Lines 193466 196793 +3327
Branches 29880 30427 +547
============================================
+ Hits 123752 124724 +972
- Misses 59942 62073 +2131
- Partials 9772 9996 +224
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@ankitsultana thanks for looking! yep we're seeing around 50% reduction in p50/p75 broker latency for one of our workloads. this workload involves rather complex queries doing up to 5-6 joins + window operations. we both enabled physical optimizer for the queries and colocated segments for each partition to a single server, so we're seeing a large reduction in shuffle exchanges. |
| return rootNode; | ||
| } | ||
| return transform(rootNode, null); | ||
| } |
There was a problem hiding this comment.
We have to approach lookup joins a bit differently to support it in a generic way that's compatible with the other optimizations. But for now this looks good to me.
| @@ -1126,5 +1023,109 @@ | |||
| ] | |||
| } | |||
| ] | |||
| }, | |||
| "physical_opt_lookup_join": { | |||
There was a problem hiding this comment.
Does this work if we are doing lookups on multiple tables? Maybe add a test for that too
There was a problem hiding this comment.
yep it does. added a test for it.
| ] | ||
| }, | ||
| { | ||
| "description": "Lookup join with aggregation — lookup join feeds into GROUP BY with leaf/final aggregate split", |
There was a problem hiding this comment.
is it possible to add a model query that you folks are using in your production? want to make sure that regressions can be caught by unit tests itself
There was a problem hiding this comment.
yep added two tests cases below to cover the patterns we're doing in production
f264a62 to
d74b37e
Compare
|
@ankitsultana thanks for the review! addressed feedback. let me know if there's anything else i should address. |
|
@dang-stripe : should be good to merge since this only impacts lookup joins. will wait for tests to pass. I'll raise another PR hopefully soon to change the approach though. There's a different way we need to go about this to support lookup joins more broadly. Will share a design doc too. It also ties in with sub-plan based execution so it will enable a lot more optimizations in the future. |
|
@ankitsultana got it. i'm curious, what's the high level approach to doing it properly? |
Addresses: #17961
Problem
Using lookup joins in the physical optimizer via query hint
joinOptions(join_strategy='lookup')would fail with"Right input must be leaf operator". The optimizer inserted aBROADCAST_EXCHANGEon the dimension table side, splitting it into a separate fragment, butLookupJoinOperatorneeds the dimension table as aLeafOperatorin the same fragment because it expects to have access to aDimensionTableDataManager.Solution
This adds lookup join support to the V2 physical optimizer by:
LOOKUP_LOCAL_EXCHANGEwhich acts as a pseudo-exchange so the fragment is classified as a leaf stage and the dim table is visible in EXPLAIN plansLookupJoinRuleto isolate every lookup join in its own plan fragment with exchanges on all sides (IDENTITY_EXCHANGE above the join and for non-dim table, LOOKUP_LOCAL_EXCHANGE for dim table)graph TD subgraph before ["Before: Dim table split into separate fragment"] A0[Downstream operators] --> A1["PhysicalJoin<br/>join_strategy=lookup"] A1 --> B1[IDENTITY_EXCHANGE] A1 --> C1["IDENTITY_EXCHANGE<br/>❌ splits dim into separate fragment"] B1 --> D1[FactTableScan] C1 --> E1[DimTableScan] style C1 fill:#f88,stroke:#c00 style E1 fill:#f88,stroke:#c00 end subgraph after ["After: LookupJoinRule isolates the lookup join"] A2[Downstream operators] --> IX_ABOVE["IDENTITY_EXCHANGE<br/>(above — isolation)"] IX_ABOVE --> J2["PhysicalJoin<br/>join_strategy=lookup"] J2 --> IX_LEFT["IDENTITY_EXCHANGE<br/>(left — leaf boundary)"] J2 --> LLE["LOOKUP_LOCAL_EXCHANGE<br/>(right — pseudo, no split)"] IX_LEFT --> FS2[FactTableScan] LLE --> DS2[DimTableScan] style LLE fill:#8f8,stroke:#0a0 style IX_ABOVE fill:#8cf,stroke:#06c style IX_LEFT fill:#8cf,stroke:#06c end before ~~~ afterAnother alternative was to match MSE v1 behavior by skipping exchange on right side, but I didn't go down this path since omitting exchanges could break assumptions in other rules (every stage boundary has a PhysicalExchange) and we wouldn't get visibility in explain plans.
Future work
Testing
We've deployed this on our production clusters and have compared production query results (multiple chained joins using dim tables for date spines and fx rates) with and without the physical optimizer. We haven't seen issues with correctness.
cc @ankitsultana @shauryachats