Skip to content

[multistage] add lookup join support to physical optimizer#18158

Open
dang-stripe wants to merge 3 commits intoapache:masterfrom
dang-stripe:dang-lookup-join-physical-optimizer
Open

[multistage] add lookup join support to physical optimizer#18158
dang-stripe wants to merge 3 commits intoapache:masterfrom
dang-stripe:dang-lookup-join-physical-optimizer

Conversation

@dang-stripe
Copy link
Copy Markdown
Contributor

@dang-stripe dang-stripe commented Apr 10, 2026

Addresses: #17961

Problem

Using lookup joins in the physical optimizer via query hint joinOptions(join_strategy='lookup') would fail with "Right input must be leaf operator". The optimizer inserted a BROADCAST_EXCHANGE on the dimension table side, splitting it into a separate fragment, but LookupJoinOperator needs the dimension table as a LeafOperator in the same fragment because it expects to have access to a DimensionTableDataManager.

Solution

This adds lookup join support to the V2 physical optimizer by:

  • Adds a LOOKUP_LOCAL_EXCHANGE which acts as a pseudo-exchange so the fragment is classified as a leaf stage and the dim table is visible in EXPLAIN plans
  • Adding a LookupJoinRule to isolate every lookup join in its own plan fragment with exchanges on all sides (IDENTITY_EXCHANGE above the join and for non-dim table, LOOKUP_LOCAL_EXCHANGE for dim table)
  graph TD
      subgraph before ["Before: Dim table split into separate fragment"]
          A0[Downstream operators] --> A1["PhysicalJoin<br/>join_strategy=lookup"]
          A1 --> B1[IDENTITY_EXCHANGE]
          A1 --> C1["IDENTITY_EXCHANGE<br/>❌ splits dim into separate fragment"]
          B1 --> D1[FactTableScan]
          C1 --> E1[DimTableScan]
          style C1 fill:#f88,stroke:#c00
          style E1 fill:#f88,stroke:#c00
      end

      subgraph after ["After: LookupJoinRule isolates the lookup join"]
          A2[Downstream operators] --> IX_ABOVE["IDENTITY_EXCHANGE<br/>(above — isolation)"]
          IX_ABOVE --> J2["PhysicalJoin<br/>join_strategy=lookup"]
          J2 --> IX_LEFT["IDENTITY_EXCHANGE<br/>(left — leaf boundary)"]
          J2 --> LLE["LOOKUP_LOCAL_EXCHANGE<br/>(right — pseudo, no split)"]
          IX_LEFT --> FS2[FactTableScan]
          LLE --> DS2[DimTableScan]
          style LLE fill:#8f8,stroke:#0a0
          style IX_ABOVE fill:#8cf,stroke:#06c
          style IX_LEFT fill:#8cf,stroke:#06c
      end

      before ~~~ after
Loading

Another alternative was to match MSE v1 behavior by skipping exchange on right side, but I didn't go down this path since omitting exchanges could break assumptions in other rules (every stage boundary has a PhysicalExchange) and we wouldn't get visibility in explain plans.

Future work

  • Support lookup joins in MSE lite mode
  • Auto-detect lookup joins when either side of join is dim table joining on primary keys

Testing

We've deployed this on our production clusters and have compared production query results (multiple chained joins using dim tables for date spines and fx rates) with and without the physical optimizer. We haven't seen issues with correctness.

cc @ankitsultana @shauryachats

return rootNode;
}
return transform(rootNode, null);
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've skipped lite mode since it adds quite a bit of scope. the broker gets assigned as worker for the join fragment so we'd need reassign it to a server and handle routing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to approach lookup joins a bit differently to support it in a generic way that's compatible with the other optimizations. But for now this looks good to me.

@ankitsultana
Copy link
Copy Markdown
Contributor

thanks @dang-stripe. will take a look at this over the weekend.

I am curious: have you seen any workloads where physical optimizer performs better than the other optimizer?

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 85.57692% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.37%. Comparing base (a8a6ea2) to head (d74b37e).
⚠️ Report is 51 commits behind head on master.

Files with missing lines Patch % Lines
.../planner/physical/v2/opt/rules/LookupJoinRule.java 81.03% 5 Missing and 6 partials ⚠️
.../physical/v2/PlanFragmentAndMailboxAssignment.java 90.24% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18158      +/-   ##
============================================
- Coverage     63.96%   63.37%   -0.59%     
- Complexity     1594     1627      +33     
============================================
  Files          3178     3230      +52     
  Lines        193466   196793    +3327     
  Branches      29880    30427     +547     
============================================
+ Hits         123752   124724     +972     
- Misses        59942    62073    +2131     
- Partials       9772     9996     +224     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.33% <85.57%> (-0.59%) ⬇️
java-21 63.36% <85.57%> (-0.56%) ⬇️
temurin 63.37% <85.57%> (-0.59%) ⬇️
unittests 63.37% <85.57%> (-0.59%) ⬇️
unittests1 55.37% <85.57%> (-0.48%) ⬇️
unittests2 34.95% <0.00%> (+0.57%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dang-stripe
Copy link
Copy Markdown
Contributor Author

have you seen any workloads where physical optimizer performs better than the other optimizer?

@ankitsultana thanks for looking! yep we're seeing around 50% reduction in p50/p75 broker latency for one of our workloads. this workload involves rather complex queries doing up to 5-6 joins + window operations. we both enabled physical optimizer for the queries and colocated segments for each partition to a single server, so we're seeing a large reduction in shuffle exchanges.

return rootNode;
}
return transform(rootNode, null);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to approach lookup joins a bit differently to support it in a generic way that's compatible with the other optimizations. But for now this looks good to me.

@@ -1126,5 +1023,109 @@
]
}
]
},
"physical_opt_lookup_join": {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work if we are doing lookups on multiple tables? Maybe add a test for that too

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep it does. added a test for it.

@ankitsultana ankitsultana added multi-stage Related to the multi-stage query engine mse-physical-optimizer Multi-stage engine physical query optimizer labels Apr 12, 2026
]
},
{
"description": "Lookup join with aggregation — lookup join feeds into GROUP BY with leaf/final aggregate split",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to add a model query that you folks are using in your production? want to make sure that regressions can be caught by unit tests itself

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep added two tests cases below to cover the patterns we're doing in production

@dang-stripe dang-stripe force-pushed the dang-lookup-join-physical-optimizer branch from f264a62 to d74b37e Compare April 14, 2026 17:25
@dang-stripe
Copy link
Copy Markdown
Contributor Author

@ankitsultana thanks for the review! addressed feedback. let me know if there's anything else i should address.

@ankitsultana
Copy link
Copy Markdown
Contributor

@dang-stripe : should be good to merge since this only impacts lookup joins. will wait for tests to pass.

I'll raise another PR hopefully soon to change the approach though. There's a different way we need to go about this to support lookup joins more broadly. Will share a design doc too. It also ties in with sub-plan based execution so it will enable a lot more optimizations in the future.

@dang-stripe
Copy link
Copy Markdown
Contributor Author

@ankitsultana got it. i'm curious, what's the high level approach to doing it properly?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mse-physical-optimizer Multi-stage engine physical query optimizer multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants