[opt](function) optimize array aggregate execution #64913

Draft
Mryange wants to merge 1 commit into apache:master from Mryange:optimize-array-agg-fast-path

@Mryange Mryange commented Jun 27, 2026

Contributor

What problem does this PR solve?

Array aggregate functions such as array_avg, array_sum, array_min, array_max, and array_product previously evaluated each array by creating and driving the generic aggregate-function path. This added unnecessary per-row aggregate state, arena, nullable wrapper, and virtual-call overhead for simple array-local reductions.

Root cause: array aggregation reused the general aggregate-function execution machinery even though these functions only need a small per-row reduction state over the nested array data.

This change adds a direct ColumnArrayView-based fast path with lightweight ArrayAggState reducers for sum, avg, min, max, and product. DecimalV3 sum/avg/product keep using the FE-planned result type and scale, while ordinary array-mapped functions now trust the FE return type instead of reconstructing a BE-side aggregate function only for return-type checking.
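To illustrate the idea of a per-row reduction over nested array data, here is a minimal, self-contained sketch. It is not the code from this PR: `ArrayView`, `SumState`, `AvgState`, `MinState`, `MaxState`, and `reduce_rows` are hypothetical names, the element type is fixed to `double`, null elements are not modeled, and returning an empty `std::optional` for an empty array is an assumption made for the example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical stand-in for a column-array view: every element of every row
// lives in one flat buffer, and offsets[i] is the end position of row i.
struct ArrayView {
    const std::vector<double>& values;
    const std::vector<std::size_t>& offsets;

    std::size_t rows() const { return offsets.size(); }
    std::size_t begin(std::size_t row) const { return row == 0 ? 0 : offsets[row - 1]; }
    std::size_t end(std::size_t row) const { return offsets[row]; }
};

// Small reducer states: plain structs, no arena and no virtual calls.
struct SumState {
    double acc = 0.0;
    void add(double v) { acc += v; }
    double result(std::size_t /*count*/) const { return acc; }
};

struct AvgState {
    double acc = 0.0;
    void add(double v) { acc += v; }
    double result(std::size_t count) const { return acc / static_cast<double>(count); }
};

struct MinState {
    double acc = std::numeric_limits<double>::infinity();
    void add(double v) { acc = std::min(acc, v); }
    double result(std::size_t /*count*/) const { return acc; }
};

struct MaxState {
    double acc = -std::numeric_limits<double>::infinity();
    void add(double v) { acc = std::max(acc, v); }
    double result(std::size_t /*count*/) const { return acc; }
};

// Reduce each row directly over the flat buffer. The reducer is a template
// parameter, so the inner loop is resolved at compile time.
// Assumption for this sketch: an empty array produces an empty optional.
template <typename State>
std::vector<std::optional<double>> reduce_rows(const ArrayView& view) {
    std::vector<std::optional<double>> out;
    out.reserve(view.rows());
    for (std::size_t row = 0; row < view.rows(); ++row) {
        const std::size_t first = view.begin(row);
        const std::size_t last = view.end(row);
        assert(first <= last && last <= view.values.size());
        if (first == last) {
            out.push_back(std::nullopt);
            continue;
        }
        State state;  // one stack-allocated state per row
        for (std::size_t i = first; i < last; ++i) {
            state.add(view.values[i]);
        }
        out.push_back(state.result(last - first));
    }
    return out;
}
```

With `values = {1, 2, 3, 4, 5}` and `offsets = {2, 2, 5}`, the view describes the rows `[1, 2]`, `[]`, and `[3, 4, 5]`, and `reduce_rows<SumState>(view)` visits each element exactly once without allocating any per-row aggregate state on the heap.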

Local optest result for array/array_avg_arr improved from about 1.685 s to 0.85-0.91 s in warm runs, a speedup of roughly 1.9x.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange Mryange marked this pull request as draft June 27, 2026 13:59

Mryange commented Jun 27, 2026

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 29083 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 55fac1431fb5189ebe8b728c3bc2218d9fb6c8a7, data reload: false

------ Round 1 ----------------------------------
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17590	3980	3958	3958
q2	2032	297	186	186
q3	10342	1413	838	838
q4	4684	471	350	350
q5	7534	865	581	581
q6	188	168	137	137
q7	799	863	623	623
q8	9512	1562	1717	1562
q9	6113	4477	4507	4477
q10	6806	1822	1518	1518
q11	445	268	247	247
q12	666	424	299	299
q13	18129	3348	2749	2749
q14	262	261	244	244
q15	q16	790	788	700	700
q17	1024	948	1045	948
q18	6942	5607	5745	5607
q19	1578	1369	981	981
q20	485	402	270	270
q21	5812	2596	2513	2513
q22	436	365	295	295
Total cold run time: 102169 ms
Total hot run time: 29083 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4377	4330	4362	4330
q2	322	337	223	223
q3	4591	4938	4386	4386
q4	2047	2147	1383	1383
q5	4408	4440	4278	4278
q6	232	180	132	132
q7	1726	1884	1644	1644
q8	2477	2267	2133	2133
q9	8185	8110	8046	8046
q10	4852	4759	4264	4264
q11	579	397	366	366
q12	731	855	637	637
q13	3266	3598	2989	2989
q14	295	308	282	282
q15	q16	719	724	648	648
q17	1368	1349	1339	1339
q18	7890	7398	6727	6727
q19	1118	1105	1131	1105
q20	2187	2240	1951	1951
q21	5251	4504	4378	4378
q22	507	452	405	405
Total cold run time: 57128 ms
Total hot run time: 51646 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 171078 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 55fac1431fb5189ebe8b728c3bc2218d9fb6c8a7, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4345	646	481	481
query6	458	185	162	162
query7	4828	570	274	274
query8	337	184	162	162
query9	8763	4017	4006	4006
query10	436	334	252	252
query11	5961	2304	2157	2157
query12	157	101	99	99
query13	1289	611	436	436
query14	6284	5316	4934	4934
query14_1	4259	4253	4261	4253
query15	218	205	176	176
query16	1001	433	419	419
query17	1071	678	585	585
query18	2444	461	336	336
query19	192	183	139	139
query20	110	104	103	103
query21	212	137	112	112
query22	13667	13597	13374	13374
query23	17291	16408	15992	15992
query23_1	16158	16223	16213	16213
query24	7580	1758	1317	1317
query24_1	1326	1308	1305	1305
query25	548	466	396	396
query26	1303	326	168	168
query27	2650	585	342	342
query28	4484	2025	2027	2025
query29	1197	645	484	484
query30	308	227	198	198
query31	1117	1111	960	960
query32	112	61	61	61
query33	540	325	283	283
query34	1225	1148	657	657
query35	767	796	678	678
query36	1418	1362	1238	1238
query37	152	108	95	95
query38	1886	1712	1683	1683
query39	931	920	887	887
query39_1	871	863	895	863
query40	237	130	106	106
query41	75	70	68	68
query42	91	89	89	89
query43	325	321	289	289
query44	1426	767	767	767
query45	225	194	184	184
query46	1085	1226	748	748
query47	2330	2401	2169	2169
query48	410	425	296	296
query49	591	434	328	328
query50	979	350	259	259
query51	4414	4434	4308	4308
query52	82	81	71	71
query53	250	275	194	194
query54	285	232	214	214
query55	73	73	67	67
query56	243	245	249	245
query57	1428	1378	1306	1306
query58	252	219	222	219
query59	1542	1628	1405	1405
query60	309	259	236	236
query61	174	174	175	174
query62	703	648	582	582
query63	235	192	196	192
query64	2592	770	590	590
query65	4874	4799	4826	4799
query66	1828	468	340	340
query67	28729	28723	28661	28661
query68	2990	1665	947	947
query69	409	294	257	257
query70	1051	963	940	940
query71	302	265	217	217
query72	2864	2624	2335	2335
query73	824	806	462	462
query74	5100	4962	4787	4787
query75	2592	2534	2171	2171
query76	2362	1174	819	819
query77	340	378	287	287
query78	12550	12455	11962	11962
query79	1522	1160	774	774
query80	1270	479	396	396
query81	522	272	240	240
query82	599	149	120	120
query83	319	273	239	239
query84	283	144	117	117
query85	903	494	411	411
query86	415	308	299	299
query87	1881	1820	1768	1768
query88	3715	2783	2743	2743
query89	427	380	323	323
query90	1859	180	176	176
query91	172	161	127	127
query92	63	57	53	53
query93	1676	1405	891	891
query94	726	351	294	294
query95	673	479	353	353
query96	1122	787	348	348
query97	2679	2693	2573	2573
query98	219	202	203	202
query99	1172	1148	1024	1024
Total cold run time: 257642 ms
Total hot run time: 171078 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 55fac1431fb5189ebe8b728c3bc2218d9fb6c8a7, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.00	0.00	0.01
query2	0.09	0.06	0.05
query3	0.26	0.14	0.14
query4	1.61	0.14	0.14
query5	0.24	0.22	0.22
query6	1.26	1.11	1.03
query7	0.04	0.01	0.01
query8	0.05	0.04	0.03
query9	0.37	0.35	0.31
query10	0.55	0.54	0.54
query11	0.19	0.14	0.14
query12	0.17	0.15	0.16
query13	0.46	0.46	0.49
query14	1.00	1.01	1.01
query15	0.61	0.59	0.60
query16	0.31	0.33	0.31
query17	1.12	1.13	1.13
query18	0.22	0.21	0.21
query19	1.97	1.96	1.96
query20	0.01	0.01	0.02
query21	15.43	0.20	0.13
query22	4.99	0.05	0.06
query23	16.13	0.32	0.12
query24	2.93	0.40	0.32
query25	0.11	0.04	0.05
query26	0.74	0.21	0.14
query27	0.04	0.03	0.03
query28	3.50	0.92	0.54
query29	12.50	4.32	3.44
query30	0.28	0.16	0.15
query31	2.77	0.59	0.32
query32	3.23	0.59	0.50
query33	3.20	3.32	3.35
query34	15.65	4.21	3.52
query35	3.55	3.53	3.53
query36	0.55	0.43	0.44
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.17	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 96.66 s
Total hot run time: 25.39 s

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 95.10% (233/245) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.33% (27781/38411)
Line Coverage 55.62% (297267/534439)
Region Coverage 52.24% (247275/473316)
Branch Coverage 53.38% (107150/200734)
