[fix](nereids) partition topn opt requires chosen window func partition key to be a subset of co-located ones by CalvinKirs · Pull Request #64764 · apache/doris

CalvinKirs · 2026-06-24T03:23:32Z

What problem does this PR solve?

When a single LogicalWindow holds multiple window functions, a filter like rn <= k may be converted into a partitionTopN and pushed below the whole window node. The generated partitionTopN keeps the per-partition top-k of the chosen window function, so it prunes the input rows that are shared by all co-located window functions.

This is only correct when the chosen window function's partition key is a subset of every other co-located window function's partition key (i.e. the chosen one is coarser). Otherwise the pruning drops rows the other windows still need and produces a wrong result, e.g.

-- independent partitions: chosen rn1(g1) prunes rows rn2(g2) needs
row_number() over (partition by g1 order by ord_key) as rn1,  -- chosen
row_number() over (partition by g2 order by ord_key) as rn2

-- chosen is finer than a co-located window: chosen rn(c1,c2) prunes rows rk(c1) needs
row_number() over (partition by c1, c2 order by c3) as rn,    -- chosen (finer)
rank()       over (partition by c1 order by c3) as rk         -- coarser

LogicalWindow.getPushDownWindowFuncAndLimit() already required the order keys of all co-located window functions to be compatible (#56622), but it never checked the partition keys. This PR additionally requires windowFunc.getPartitionKeys().containsAll(chosenWindowFunc.getPartitionKeys()) for every co-located window function; otherwise the partition-topn optimization is disabled.
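
The added condition can be sketched in isolation. This is a minimal illustration and not the Doris source: `PartitionKeyGuard`, `canPushDown`, and `keys` are hypothetical names, and partition keys are modeled as sets of column-name strings rather than Nereids expressions. Only the `containsAll` direction mirrors the check described above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the eligibility check; not the actual Doris code. */
final class PartitionKeyGuard {
    private PartitionKeyGuard() {}

    /**
     * Returns true only when the chosen window function's partition key
     * is a subset of every co-located window function's partition key.
     */
    static boolean canPushDown(Set<String> chosenKeys, List<Set<String>> coLocatedKeys) {
        for (Set<String> keys : coLocatedKeys) {
            // Same direction as windowFunc.getPartitionKeys().containsAll(chosen...)
            if (!keys.containsAll(chosenKeys)) {
                return false;
            }
        }
        return true;
    }

    /** Helper (assumption): builds a partition-key set from column names. */
    static Set<String> keys(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        // chosen g1 vs. co-located {g1}, {g1, g2}: subset, pushdown allowed
        List<Set<String>> subsetCase = Arrays.asList(keys("g1"), keys("g1", "g2"));
        System.out.println(canPushDown(keys("g1"), subsetCase));

        // chosen g1 vs. co-located {g1}, {g2}: independent, pushdown disabled
        List<Set<String>> independentCase = Arrays.asList(keys("g1"), keys("g2"));
        System.out.println(canPushDown(keys("g1"), independentCase));

        // chosen {c1, c2} vs. co-located {c1}: chosen is finer, pushdown disabled
        List<Set<String>> finerCase = Arrays.asList(keys("c1", "c2"), keys("c1"));
        System.out.println(canPushDown(keys("c1", "c2"), finerCase));
    }
}
```

Because the keys are compared as sets, `partition by g1, g2` and `partition by g2, g1` are treated the same way in this sketch.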

Why the subset rule is correct (and not just equality)

The pruning keeps the per-P0 (chosen partition) top-k by the order key. For another window function W partitioned by P1:

P0 ⊆ P1 (chosen coarser) → safe. Any row that could change W's value for a surviving row r lies in r's P1 partition and sorts no later than r. Because P0 ⊆ P1, rows in the same P1 partition are also in the same P0 partition, so that row's P0-rank is no greater than r's, which is already within top-k, and it is kept. Nothing W needs is pruned.
P0 ⊋ P1 or independent → unsafe. A finer/independent P0 can drop a row that ranks early in P1, corrupting W.

So equality is unnecessarily strict; the precise safe condition is the subset relation, expressed with containsAll (which also makes the check order-insensitive in the partition-key list, matching the set semantics of PARTITION BY).

When the optimization still applies

The optimization still fires for a single window function, for multiple window functions with the same partition key, and for the subset case above. Example where the chosen rank(partition by g1) is coarser than row_number(partition by g1, g2), so a VPartitionTopN(partition by g1) is still generated:

select id, g1, g2, ord_key, rk, rn
from (
  select id, g1, g2, ord_key,
    rank()       over (partition by g1 order by ord_key) as rk,   -- chosen (coarser)
    row_number() over (partition by g1, g2 order by ord_key) as rn
  from multi_window_cases
) q
where rk <= 2;

  6:VANALYTIC          partition by: g1, g2   order by: ord_key   <- computes rn
  4:VANALYTIC          partition by: g1       order by: ord_key   <- computes rk
  1:VPartitionTopN     partition by: g1       order by: ord_key   partition limit: 2   <- safe (g1 ⊆ {g1,g2})
  0:VOlapScanNode
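
The safe case can be checked by hand with a small simulation. The sketch below is an illustration only and does not use any Doris code: `SubsetSafeCheck` and its methods are hypothetical, the rows are the `multi_window_cases` sample from the reproduce section (with `amt` omitted), and `rank()` is modeled as a row number because `ord_key` has no ties in that data. It compares `rn` for rows with `rk <= 2` when the per-`g1` top-2 pruning runs first against evaluation over the full input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical simulation of the subset-safe plan; not Doris code. */
final class SubsetSafeCheck {
    static final class Row {
        final int id; final String g1; final String g2; final int ordKey;
        Row(int id, String g1, String g2, int ordKey) {
            this.id = id; this.g1 = g1; this.g2 = g2; this.ordKey = ordKey;
        }
    }

    /** Sample rows from the reproduce section, listed in id order. */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "A", "X", 1), new Row(2, "A", "X", 2), new Row(3, "A", "Y", 3),
            new Row(4, "B", "X", 4), new Row(5, "B", "Y", 5), new Row(6, "B", "Y", 6),
            new Row(7, "C", "X", 7), new Row(8, "C", "Z", 8));
    }

    /** row_number() over (partition by key order by ord_key), as id -> number. */
    static Map<Integer, Integer> rowNumber(List<Row> input, Function<Row, String> key) {
        List<Row> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingInt((Row r) -> r.ordKey));
        Map<String, Integer> counters = new HashMap<>();
        Map<Integer, Integer> result = new HashMap<>();
        for (Row r : sorted) {
            result.put(r.id, counters.merge(key.apply(r), 1, Integer::sum));
        }
        return result;
    }

    /** Returns id -> rn for rows with rk <= k, optionally pruning per-g1 top-k first. */
    static Map<Integer, Integer> rnWhereRkAtMost(List<Row> rows, int k, boolean pruneFirst) {
        List<Row> input = rows;
        if (pruneFirst) {
            Map<Integer, Integer> pruneRank = rowNumber(rows, r -> r.g1);
            input = new ArrayList<>();
            for (Row r : rows) {
                if (pruneRank.get(r.id) <= k) {
                    input.add(r);
                }
            }
        }
        // rk stands in for rank(): ord_key is unique here, so rank == row number.
        Map<Integer, Integer> rk = rowNumber(input, r -> r.g1);
        Map<Integer, Integer> rn = rowNumber(input, r -> r.g1 + "|" + r.g2);
        Map<Integer, Integer> out = new TreeMap<>();
        for (Row r : input) {
            if (rk.get(r.id) <= k) {
                out.put(r.id, rn.get(r.id));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> pruned = rnWhereRkAtMost(sampleRows(), 2, true);
        Map<Integer, Integer> full = rnWhereRkAtMost(sampleRows(), 2, false);
        System.out.println("pruned first: " + pruned);
        System.out.println("full input:   " + full);
        System.out.println("identical:    " + pruned.equals(full)); // true
    }
}
```

On this data both variants yield the same `rn` for every surviving row, which is what the subset rule predicts; a single data set is an illustration, not a proof.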

Reproduce (the wrong-result case)

CREATE TABLE multi_window_cases (
  id      INT,
  g1      VARCHAR(8),
  g2      VARCHAR(8),
  ord_key INT,
  amt     INT
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES ("replication_num" = "1");

INSERT INTO multi_window_cases VALUES
(1,'A','X',1,10),(2,'A','X',2,20),(3,'A','Y',3,30),(4,'B','X',4,40),
(5,'B','Y',5,50),(6,'B','Y',6,60),(7,'C','X',7,70),(8,'C','Z',8,80);

SELECT id, g1, g2, ord_key, rn1, rn2
FROM (
  SELECT id, g1, g2, ord_key,
    row_number() OVER (PARTITION BY g1 ORDER BY ord_key) AS rn1,
    row_number() OVER (PARTITION BY g2 ORDER BY ord_key) AS rn2
  FROM multi_window_cases
) q
WHERE rn1 <= 1
ORDER BY id;

Wrong result (before), rn2 should be 1,3,4:

+----+----+----+---------+-----+-----+
| id | g1 | g2 | ord_key | rn1 | rn2 |
+----+----+----+---------+-----+-----+
|  1 | A  | X  |       1 |   1 |   1 |
|  4 | B  | X  |       4 |   1 |   2 |   <- wrong (should be 3)
|  7 | C  | X  |       7 |   1 |   3 |   <- wrong (should be 4)
+----+----+----+---------+-----+-----+

Correct result (after, matches MySQL 8.4):

+------+------+------+---------+-----+-----+
| id   | g1   | g2   | ord_key | rn1 | rn2 |
+------+------+------+---------+-----+-----+
|    1 | A    | X    |       1 |   1 |   1 |
|    4 | B    | X    |       4 |   1 |   3 |
|    7 | C    | X    |       7 |   1 |   4 |
+------+------+------+---------+-----+-----+
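
The two result tables can be reproduced without a cluster. The following is a hedged sketch, not Doris code: `PartitionTopnRepro` and its methods are hypothetical names, `amt` is omitted because it does not affect the result, and the "pruned" variant simply keeps the per-`g1` top-k rows before both row numbers are computed, which is the effect of the misplaced partitionTopN described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical simulation of the wrong-result case; not Doris code. */
final class PartitionTopnRepro {
    static final class Row {
        final int id; final String g1; final String g2; final int ordKey;
        Row(int id, String g1, String g2, int ordKey) {
            this.id = id; this.g1 = g1; this.g2 = g2; this.ordKey = ordKey;
        }
    }

    /** The eight rows inserted into multi_window_cases, in id order. */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "A", "X", 1), new Row(2, "A", "X", 2), new Row(3, "A", "Y", 3),
            new Row(4, "B", "X", 4), new Row(5, "B", "Y", 5), new Row(6, "B", "Y", 6),
            new Row(7, "C", "X", 7), new Row(8, "C", "Z", 8));
    }

    /** row_number() over (partition by key order by ord_key), as id -> number. */
    static Map<Integer, Integer> rowNumber(List<Row> input, Function<Row, String> key) {
        List<Row> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingInt((Row r) -> r.ordKey));
        Map<String, Integer> counters = new HashMap<>();
        Map<Integer, Integer> result = new HashMap<>();
        for (Row r : sorted) {
            result.put(r.id, counters.merge(key.apply(r), 1, Integer::sum));
        }
        return result;
    }

    /** Returns rn2 (in id order) for rows with rn1 <= k. */
    static List<Integer> rn2ForTopK(List<Row> rows, int k, boolean pruneFirst) {
        List<Row> input = rows;
        if (pruneFirst) {
            // Simulates the partitionTopN(partition by g1) below both windows.
            Map<Integer, Integer> pruneRank = rowNumber(rows, r -> r.g1);
            input = new ArrayList<>();
            for (Row r : rows) {
                if (pruneRank.get(r.id) <= k) {
                    input.add(r);
                }
            }
        }
        Map<Integer, Integer> rn1 = rowNumber(input, r -> r.g1);
        Map<Integer, Integer> rn2 = rowNumber(input, r -> r.g2);
        List<Integer> out = new ArrayList<>();
        for (Row r : input) {
            if (rn1.get(r.id) <= k) {
                out.add(rn2.get(r.id));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        System.out.println("pruned first: " + rn2ForTopK(rows, 1, true));  // [1, 2, 3]
        System.out.println("full input:   " + rn2ForTopK(rows, 1, false)); // [1, 3, 4]
    }
}
```

With pruning, id 2 (g2 = X) is dropped before `rn2` is computed, so ids 4 and 7 each move up by one position within the X partition.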

EXPLAIN before (buggy)

A VPartitionTopN(partition by g1) is inserted below both analytic nodes, so it prunes rows before rn2 is computed:

  8:VSORT              order by: id
  7:VANALYTIC          partition by: g2,  order by: ord_key   <- computes rn2
  |  predicates: (rn1 <= 1)
  6:VSORT              order by: g2, ord_key
  4:VANALYTIC          partition by: g1,  order by: ord_key   <- computes rn1
  3:VSORT              order by: g1, ord_key
  1:VPartitionTopN     partition by: g1,  order by: ord_key   <- prunes input (WRONG)
  0:VOlapScanNode

EXPLAIN after (fixed)

No VPartitionTopN; both window functions are computed over the full input and rn1 <= 1 stays as an ordinary predicate above them:

  7:VSORT              order by: id
  6:VANALYTIC          partition by: g2,  order by: ord_key   <- computes rn2 (full input)
  |  predicates: (rn1 <= 1)
  5:VSORT              order by: g2, ord_key
  3:VANALYTIC          partition by: g1,  order by: ord_key   <- computes rn1
  2:VSORT              order by: g1, ord_key
  0:VOlapScanNode

Release note

Fix wrong results when a query has multiple window functions (row_number/rank/dense_rank) with incompatible partition keys and a top-n filter (e.g. rn <= k) is applied; the partition-topn pushdown is now restricted to the cases where it is provably safe (the chosen function's partition key is a subset of every co-located function's partition key).

Check List (For author)

Test
- Regression test (regression-test/suites/query_p0/partition_topn/check_partitionkey.groovy, .../push_down_filter_through_window/push_down_multi_filter_through_window.groovy)
- Unit test (GeneratePartitionTopnFromWindowTest: testMultipleWindowsWithDifferentPartitions, testMultipleWindowsSubsetPartitionGeneratesTopn)
Behavior changed:
- Function behavior changed (returns correct results for the cases above)

hello-stephen · 2026-06-24T03:23:37Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

CalvinKirs · 2026-06-24T03:28:23Z

run buildall

CalvinKirs · 2026-06-24T03:30:42Z

/review

github-actions

I reviewed the partition-topn/window change and did not find a blocking issue. The new guard in LogicalWindow.getPushDownWindowFuncAndLimit() covers the shared-input pruning hazard for all current callers that insert LogicalPartitionTopN below a LogicalWindow, including filter-derived and limit/topn-derived paths. The added unit test and regression test both target the prior wrong-result shape, and the regression expected rn2 values prove full-window evaluation before filtering.

Critical checkpoint conclusions:

Goal/test proof: the PR fixes the different-partition co-located window wrong-result case and includes both FE unit and regression coverage for that shape.
Scope/focus: the code change is small and localized to the partition-topn eligibility check.
Concurrency/lifecycle/config/compatibility/persistence/write paths: not involved beyond reading the existing enable_partition_topn session variable.
Parallel paths: CreatePartitionTopNFromWindow, PushDownLimit, and PushDownTopNThroughWindow all flow through the updated helper; physical window grouping later separates partition groups, but the logical rewrite can run before that, so the guard is needed.
Tests/results: regression uses a hardcoded table, drop-before-use, deterministic order by, an EXPLAIN notContains("VPartitionTopN") guard, and generated expected output. I did not run tests locally.
Performance/observability: this intentionally disables an unsafe optimization for mixed partition keys; no new observability is needed.

Subagent conclusions: optimizer-rewrite reported no candidates after checking rewrite semantics and partition-topn execution paths. tests-session-config reported no candidates after checking regression/session/config/style risks. No candidates were accepted, dismissed, or deduplicated into inline comments. Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same final ledger/comment set.

User focus: no additional user-provided review focus was supplied.

github-actions · 2026-06-24T03:45:46Z

PR approved by at least one committer and no changes requested.

github-actions · 2026-06-24T03:45:49Z

PR approved by anyone and no changes requested.

hello-stephen · 2026-06-24T04:30:35Z

TPC-H: Total hot run time: 28948 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0a203998c28dc320c84bd9429e331662ce158a36, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17612	3964	3987	3964
q2	2024	319	187	187
q3	10290	1429	816	816
q4	4681	473	341	341
q5	7552	858	562	562
q6	186	171	142	142
q7	771	828	620	620
q8	9316	1619	1660	1619
q9	5518	4527	4467	4467
q10	6756	1776	1515	1515
q11	442	274	243	243
q12	627	419	286	286
q13	18071	3345	2769	2769
q14	269	261	244	244
q15	q16	781	772	717	717
q17	1024	920	975	920
q18	6929	5739	5582	5582
q19	1313	1339	1055	1055
q20	481	429	263	263
q21	5928	2593	2336	2336
q22	431	357	300	300
Total cold run time: 101002 ms
Total hot run time: 28948 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4328	4249	4212	4212
q2	338	352	212	212
q3	4664	4977	4385	4385
q4	2080	2163	1380	1380
q5	4380	4294	4262	4262
q6	253	176	130	130
q7	1726	1645	1540	1540
q8	2788	2220	2156	2156
q9	8254	8267	8153	8153
q10	4838	4742	4348	4348
q11	562	410	387	387
q12	787	795	560	560
q13	3279	3621	2919	2919
q14	312	297	282	282
q15	q16	713	722	656	656
q17	1358	1339	1309	1309
q18	8218	7307	7267	7267
q19	1174	1159	1078	1078
q20	2263	2197	1946	1946
q21	5242	4568	4459	4459
q22	511	444	393	393
Total cold run time: 58068 ms
Total hot run time: 52034 ms

hello-stephen · 2026-06-24T04:41:30Z

TPC-DS: Total hot run time: 172179 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0a203998c28dc320c84bd9429e331662ce158a36, data reload: false

query5	4318	625	481	481
query6	430	208	168	168
query7	4843	513	310	310
query8	363	209	191	191
query9	8747	4054	4016	4016
query10	451	302	256	256
query11	5941	2344	2151	2151
query12	162	106	101	101
query13	1284	602	412	412
query14	6385	5392	5024	5024
query14_1	4376	4365	4400	4365
query15	202	198	178	178
query16	990	448	404	404
query17	940	703	592	592
query18	2441	487	384	384
query19	207	186	143	143
query20	114	112	104	104
query21	219	138	119	119
query22	13600	13614	13469	13469
query23	17333	16600	16133	16133
query23_1	16208	16293	16346	16293
query24	7655	1768	1331	1331
query24_1	1324	1340	1316	1316
query25	566	454	394	394
query26	1299	320	188	188
query27	2653	596	350	350
query28	4410	2032	2047	2032
query29	1107	613	495	495
query30	317	237	203	203
query31	1111	1087	984	984
query32	109	68	58	58
query33	521	334	259	259
query34	1217	1127	651	651
query35	756	780	669	669
query36	1420	1386	1199	1199
query37	154	102	88	88
query38	1873	1717	1657	1657
query39	935	929	903	903
query39_1	873	868	885	868
query40	226	118	97	97
query41	65	65	59	59
query42	87	85	85	85
query43	318	323	281	281
query44	1414	762	776	762
query45	195	182	171	171
query46	1083	1228	757	757
query47	2387	2382	2232	2232
query48	363	409	305	305
query49	603	454	345	345
query50	1003	339	256	256
query51	4422	4369	4327	4327
query52	81	82	69	69
query53	248	265	192	192
query54	270	217	189	189
query55	72	70	65	65
query56	233	223	210	210
query57	1454	1393	1304	1304
query58	242	213	209	209
query59	1549	1633	1405	1405
query60	279	242	227	227
query61	148	145	145	145
query62	689	650	586	586
query63	241	196	190	190
query64	2521	760	598	598
query65	4860	4775	4790	4775
query66	1778	482	333	333
query67	29663	29655	29567	29567
query68	3293	1640	882	882
query69	399	293	269	269
query70	1066	975	970	970
query71	295	227	219	219
query72	2781	2625	2281	2281
query73	863	746	421	421
query74	5077	4932	4768	4768
query75	2630	2592	2224	2224
query76	2307	1196	787	787
query77	356	377	273	273
query78	12365	12451	11807	11807
query79	1429	1166	760	760
query80	583	476	378	378
query81	450	278	241	241
query82	584	160	121	121
query83	353	273	245	245
query84	308	147	117	117
query85	846	486	398	398
query86	351	317	266	266
query87	1826	1833	1786	1786
query88	3676	2794	2764	2764
query89	425	381	343	343
query90	1965	186	179	179
query91	173	161	132	132
query92	63	58	55	55
query93	1518	1517	859	859
query94	525	359	294	294
query95	684	464	353	353
query96	1048	814	361	361
query97	2690	2723	2584	2584
query98	220	206	198	198
query99	1185	1150	1027	1027
Total cold run time: 257030 ms
Total hot run time: 172179 ms

hello-stephen · 2026-06-24T04:46:26Z

ClickBench: Total hot run time: 25.26 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0a203998c28dc320c84bd9429e331662ce158a36, data reload: false

query1	0.00	0.00	0.00
query2	0.10	0.05	0.06
query3	0.25	0.13	0.13
query4	1.61	0.14	0.14
query5	0.25	0.22	0.22
query6	1.23	1.04	1.05
query7	0.04	0.01	0.00
query8	0.06	0.03	0.03
query9	0.38	0.34	0.30
query10	0.58	0.54	0.54
query11	0.19	0.14	0.13
query12	0.18	0.15	0.15
query13	0.46	0.47	0.47
query14	1.02	1.02	1.02
query15	0.62	0.58	0.59
query16	0.33	0.35	0.32
query17	1.06	1.13	1.10
query18	0.22	0.22	0.21
query19	2.06	1.97	1.96
query20	0.02	0.01	0.01
query21	15.43	0.22	0.16
query22	4.91	0.06	0.06
query23	16.13	0.30	0.13
query24	2.98	0.43	0.31
query25	0.11	0.06	0.04
query26	0.71	0.21	0.16
query27	0.05	0.04	0.03
query28	3.47	0.90	0.56
query29	12.50	4.32	3.45
query30	0.28	0.15	0.15
query31	2.77	0.60	0.31
query32	3.23	0.60	0.49
query33	3.27	3.20	3.22
query34	15.59	4.20	3.50
query35	3.56	3.52	3.52
query36	0.56	0.45	0.43
query37	0.09	0.06	0.07
query38	0.06	0.03	0.04
query39	0.04	0.03	0.03
query40	0.18	0.16	0.16
query41	0.09	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 96.75 s
Total hot run time: 25.26 s

hello-stephen · 2026-06-24T06:13:34Z

FE Regression Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

…on key to be a subset of co-located ones

When a single LogicalWindow holds multiple window functions, a filter such as `rn <= k` may be turned into a partitionTopN and pushed below the whole window node. The generated partitionTopN keeps the per-partition top-k of the chosen window function and thus prunes the input rows shared by ALL co-located window functions.

This is only correct when the chosen window function's partition key is a SUBSET of every other co-located window function's partition key (i.e. the chosen one is coarser). Then any row that could change another window's value for a surviving row lies in the same chosen-partition with an order value not greater than the surviving row's, so its chosen-rank is within top-k and it is kept. When this does not hold the pruning drops rows the other windows still need and produces a wrong result, e.g.

row_number() over (partition by g1 order by c) as rn1,  -- chosen
row_number() over (partition by g2 order by c) as rn2   -- independent

row_number() over (partition by c1, c2 order by c) as rn,  -- chosen (finer)
rank()       over (partition by c1 order by c) as rk       -- coarser

getPushDownWindowFuncAndLimit() previously only required the order keys of all co-located window functions to be compatible (apache#56622). It now also requires the partition keys to satisfy the above subset relation, otherwise the optimization is disabled.

CalvinKirs · 2026-06-24T08:08:34Z

run buildall

hello-stephen · 2026-06-24T08:37:27Z

TPC-H: Total hot run time: 28618 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d3f75947072f69cf1194320cc3f8287890fc1cd2, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17696	4015	3979	3979
q2	2007	318	190	190
q3	10308	1374	837	837
q4	4677	471	333	333
q5	7518	876	576	576
q6	187	177	139	139
q7	749	861	623	623
q8	9556	1523	1527	1523
q9	6014	4405	4451	4405
q10	6787	1791	1556	1556
q11	445	281	244	244
q12	656	435	312	312
q13	18127	3424	2785	2785
q14	280	257	248	248
q15	q16	791	781	723	723
q17	1280	1215	748	748
q18	6808	5828	5497	5497
q19	1448	1374	1053	1053
q20	488	388	266	266
q21	6007	2645	2277	2277
q22	448	360	304	304
Total cold run time: 102277 ms
Total hot run time: 28618 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4421	4256	4451	4256
q2	319	344	221	221
q3	4577	4960	4401	4401
q4	2088	2146	1370	1370
q5	4444	4302	4358	4302
q6	230	177	129	129
q7	1688	1695	1828	1695
q8	2491	2160	2175	2160
q9	8223	8137	8073	8073
q10	4788	4793	4331	4331
q11	587	411	395	395
q12	862	812	537	537
q13	3392	3685	3093	3093
q14	311	303	287	287
q15	q16	714	782	692	692
q17	1366	1341	1384	1341
q18	8009	7366	6838	6838
q19	1110	1067	1112	1067
q20	2246	2290	1970	1970
q21	5278	4644	4388	4388
q22	519	452	414	414
Total cold run time: 57663 ms
Total hot run time: 51960 ms

hello-stephen · 2026-06-24T08:48:37Z

TPC-DS: Total hot run time: 173856 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d3f75947072f69cf1194320cc3f8287890fc1cd2, data reload: false

query5	4313	636	507	507
query6	439	184	178	178
query7	4848	543	318	318
query8	377	213	209	209
query9	8793	4064	4072	4064
query10	446	319	262	262
query11	5927	2339	2147	2147
query12	159	106	110	106
query13	1285	638	470	470
query14	6711	5405	5098	5098
query14_1	4446	4431	4400	4400
query15	206	213	174	174
query16	1011	456	434	434
query17	1125	710	594	594
query18	2729	497	363	363
query19	208	188	147	147
query20	115	113	107	107
query21	223	145	119	119
query22	13742	13647	13421	13421
query23	17585	16606	16222	16222
query23_1	16352	16283	16252	16252
query24	7489	1815	1329	1329
query24_1	1380	1325	1337	1325
query25	563	466	406	406
query26	1285	327	181	181
query27	2648	545	366	366
query28	4458	2053	2006	2006
query29	1097	626	496	496
query30	306	246	204	204
query31	1137	1090	973	973
query32	112	62	61	61
query33	542	349	258	258
query34	1178	1168	688	688
query35	747	808	668	668
query36	1400	1418	1241	1241
query37	155	101	90	90
query38	1898	1695	1716	1695
query39	959	946	914	914
query39_1	912	907	924	907
query40	221	121	99	99
query41	66	64	66	64
query42	88	88	88	88
query43	336	337	283	283
query44	1535	799	785	785
query45	204	192	183	183
query46	1098	1265	789	789
query47	2388	2329	2206	2206
query48	418	426	306	306
query49	621	457	336	336
query50	989	352	257	257
query51	4452	4444	4340	4340
query52	83	82	69	69
query53	260	279	195	195
query54	263	211	202	202
query55	72	69	69	69
query56	237	214	209	209
query57	1414	1409	1331	1331
query58	246	213	216	213
query59	1591	1685	1441	1441
query60	280	244	232	232
query61	149	148	154	148
query62	705	646	583	583
query63	232	191	193	191
query64	2458	772	632	632
query65	4899	4767	4829	4767
query66	1740	474	387	387
query67	29858	29834	29747	29747
query68	3369	1572	957	957
query69	410	289	275	275
query70	1075	941	973	941
query71	285	237	213	213
query72	2832	2744	2339	2339
query73	846	779	470	470
query74	5682	4987	4763	4763
query75	2629	2580	2241	2241
query76	2360	1209	817	817
query77	366	391	299	299
query78	12398	12547	11913	11913
query79	1414	1157	800	800
query80	1289	478	434	434
query81	522	282	241	241
query82	630	162	129	129
query83	341	288	274	274
query84	293	147	115	115
query85	902	518	420	420
query86	422	317	279	279
query87	1839	1849	1779	1779
query88	3780	2786	2773	2773
query89	438	387	331	331
query90	1930	191	179	179
query91	172	163	138	138
query92	63	59	55	55
query93	1629	1531	897	897
query94	716	345	338	338
query95	674	456	348	348
query96	1041	777	375	375
query97	2674	2697	2560	2560
query98	216	208	197	197
query99	1184	1182	1009	1009
Total cold run time: 261049 ms
Total hot run time: 173856 ms

hello-stephen · 2026-06-24T08:53:39Z

ClickBench: Total hot run time: 25.37 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d3f75947072f69cf1194320cc3f8287890fc1cd2, data reload: false

query1	0.00	0.01	0.01
query2	0.10	0.06	0.05
query3	0.26	0.13	0.13
query4	1.61	0.14	0.14
query5	0.25	0.22	0.24
query6	1.29	1.08	1.12
query7	0.04	0.00	0.00
query8	0.06	0.04	0.04
query9	0.43	0.32	0.32
query10	0.57	0.57	0.57
query11	0.19	0.15	0.14
query12	0.19	0.15	0.15
query13	0.47	0.49	0.48
query14	1.03	1.02	1.00
query15	0.64	0.60	0.60
query16	0.33	0.34	0.34
query17	1.13	1.14	1.09
query18	0.23	0.23	0.22
query19	2.04	1.96	1.90
query20	0.02	0.01	0.01
query21	15.43	0.22	0.14
query22	4.85	0.05	0.06
query23	16.16	0.31	0.12
query24	2.96	0.41	0.32
query25	0.13	0.06	0.03
query26	0.73	0.22	0.15
query27	0.06	0.04	0.04
query28	3.49	1.00	0.55
query29	12.52	4.31	3.44
query30	0.27	0.16	0.15
query31	2.77	0.61	0.32
query32	3.22	0.61	0.50
query33	3.15	3.22	3.30
query34	15.80	4.29	3.53
query35	3.52	3.52	3.55
query36	0.55	0.43	0.44
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.04	0.03	0.03
query40	0.17	0.16	0.15
query41	0.08	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 97 s
Total hot run time: 25.37 s

hello-stephen · 2026-06-24T11:30:06Z

FE Regression Coverage Report

Increment line coverage 1.72% (2/116) 🎉
Increment coverage report
Complete coverage report

github-actions · 2026-06-25T06:05:54Z

PR approved by at least one committer and no changes requested.

CalvinKirs · 2026-06-25T06:10:56Z

run external

CalvinKirs · 2026-06-25T06:11:12Z

run nonConcurrent

yx-keith

LGTM

hello-stephen · 2026-06-25T08:06:27Z

FE Regression Coverage Report

Increment line coverage 0.69% (2/291) 🎉
Increment coverage report
Complete coverage report

hello-stephen · 2026-06-25T08:14:32Z

FE Regression Coverage Report

Increment line coverage 0.69% (2/291) 🎉
Increment coverage report
Complete coverage report

CalvinKirs · 2026-06-25T09:47:13Z

run external

CalvinKirs · 2026-06-27T01:34:35Z

/review

github-actions

Reviewed PR 64764. I found one small but concrete hygiene issue to fix before merging: the new generated regression .out file leaves an extra blank line at EOF, and git diff --check flags the patch.

Critical checkpoints:

Goal and behavior: the PR adds a partition-key compatibility guard for partition top-n generation from co-located window functions. The changed helper is used by the filter, limit, and top-n pushdown paths, and the intended subset/incompatible partition cases were reviewed.
Scope: the code change is focused in LogicalWindow; the remaining changes are targeted FE unit/regression coverage and expected output.
Concurrency, lifecycle, persistence, FE/BE protocol, and storage compatibility: not involved in this PR.
Configuration/session behavior: no new config is added; the existing enable_partition_topn session variable remains the gate.
Parallel paths: CreatePartitionTopNFromWindow, PushDownLimit, and PushDownTopNThroughWindow all route through the updated helper, so the new guard applies across the relevant generation paths.
Tests/results: the new tests cover incompatible partition keys and subset-safe cases, but the new .out file needs the EOF whitespace fix below.
Existing comments and user focus: there were no existing inline review threads and no additional user-provided focus points.

Subagent conclusions: optimizer-rewrite found no optimizer correctness candidate. tests-session-config proposed TSC-001, which was merged as MAIN-001 and submitted below. Java test line-length was dismissed because test LineLength is suppressed by the FE checkstyle suppressions. Convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same final ledger/comment set.

no valid

CalvinKirs requested review from 924060929, englefly, morrySnow and starocean999 as code owners June 24, 2026 03:23

github-actions Bot reviewed Jun 24, 2026

morrySnow previously approved these changes Jun 24, 2026

morrySnow added dev/3.1.x dev/4.0.x dev/4.1.x labels Jun 24, 2026

github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 24, 2026

github-actions Bot added the reviewed label Jun 24, 2026

CalvinKirs dismissed morrySnow’s stale review via d3f7594 June 24, 2026 08:06

CalvinKirs force-pushed the topn-partition branch from 0a20399 to d3f7594 Compare June 24, 2026 08:06

github-actions Bot removed the approved Indicates a PR has been approved by one committer. label Jun 24, 2026

CalvinKirs changed the title ~~[fix](nereids) partition topn opt requires all window funcs share the same partition key~~ [fix](nereids) partition topn opt requires chosen window func partition key to be a subset of co-located ones Jun 24, 2026

englefly approved these changes Jun 25, 2026

github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 25, 2026

yx-keith approved these changes Jun 25, 2026

github-actions Bot previously requested changes Jun 27, 2026

Comment thread regression-test/data/query_p0/partition_topn/check_partitionkey.out

CalvinKirs merged commit 7bc98f6 into apache:master Jun 27, 2026
35 of 36 checks passed

github-actions Bot added dev/4.0.x-conflict dev/4.1.x-conflict labels Jun 27, 2026

5 participants
