Skip to content

[fix](join): avoid join amplification in scalar subquery window rewrite#63763

Draft
starocean999 wants to merge 2 commits into
apache:masterfrom
starocean999:master_0527
Draft

[fix](join): avoid join amplification in scalar subquery window rewrite#63763
starocean999 wants to merge 2 commits into
apache:masterfrom
starocean999:master_0527

Conversation

@starocean999

Copy link
Copy Markdown
Contributor

Problem Summary:
The rewrite in AggScalarSubQueryToWindowFunction.java partitioned the generated window function only by the correlated slots. That assumption is not sufficient when the outer query contains an additional relation whose rows are duplicated under the same correlated key. In that situation, multiple distinct outer rows are merged into the same window partition, so the rewritten window aggregate is evaluated over join-amplified rows rather than over the logical row set seen by the original correlated scalar subquery. The rule also kept mutable matching state on the rewriter instance, which needed to be reset per rewrite attempt.

Fix:
The fix changes the rewrite to partition the generated window function by all visible slots from the outer-only relation, so different outer rows that share the same correlated key remain isolated in separate window partitions. This preserves the original scalar-subquery semantics even when the outer plan has join-expanded rows.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@starocean999

Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a remaining correctness hole in the scalar-subquery window rewrite. The new partitioning fixes the case where an outer-only relation has a visible distinguishing column, but the rule can still fire when the distinguishing columns have been pruned from the Apply left output, leaving only the correlated key and still allowing join amplification.

Critical checkpoint conclusions:

  • Goal/test proof: The goal is to avoid join amplification in the scalar-subquery-to-window rewrite. The added test proves one case with a visible dim.did, but does not prove cases where duplicate outer-only rows are not distinguished by the visible output.
  • Scope/focus: The change is focused, but the implementation is incomplete for pruned duplicate-distinguishing columns.
  • Concurrency/lifecycle: No new concurrency, locking, or lifecycle concerns identified.
  • Configuration/compatibility: No new configs or incompatible serialization/storage changes identified in the actual PR diff.
  • Parallel paths: The relevant parallel path is the same rewrite when column pruning leaves only correlated slots from the outer-only relation; it still has the original amplification problem.
  • Tests/results: Existing new regression output is deterministic, but coverage should add the non-visible duplicate-column case or the rule should bail out when it cannot partition by the full outer-only relation identity.
  • Observability/transactions/persistence: Not applicable.
  • Performance: Partitioning by extra slots can cost more, but no blocking performance issue beyond the correctness concern.

User focus points: No additional user-provided review focus was present.

@starocean999

Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review summary: no additional blocking issues found in the current PR diff beyond the existing inline thread already recorded for the scalar-subquery window rewrite.

Critical checkpoint conclusions:

  • Goal and proof: The PR narrows the scalar-subquery-to-window rewrite by partitioning on the full visible outer-only relation row image and adds FE/regression coverage for the safe and pruned-output cases. The change appears to accomplish that goal in the reviewed diff.
  • Scope: The current diff is small and focused on the rewrite and its tests.
  • Concurrency: No new concurrency or shared mutable runtime state is introduced beyond rule-instance analysis state that is reset per check.
  • Lifecycle/static initialization: No new static initialization or special lifecycle concern found.
  • Configuration/compatibility/protocol: No new configs, storage format, function symbol, or FE-BE protocol compatibility changes in the current PR diff.
  • Parallel paths: The relevant scalar-subquery rewrite path and regression coverage were reviewed; no additional parallel rule path requiring the same change was identified.
  • Conditional checks: The new guard that disables the rewrite when the outer-only relation output is pruned is consistent with preserving row identity.
  • Tests: Added unit and regression tests cover the intended safe rewrite and no-rewrite fallback; no additional user-provided review focus was specified.
  • Observability/transactions/data writes/memory/performance: Not materially applicable to this optimizer rewrite; no new observability, transaction, write-path, memory-accounting, or obvious performance issue found.

Focus response: review_focus.txt contained no additional focus points.

@starocean999

Copy link
Copy Markdown
Contributor Author

run buildall

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-H: Total hot run time: 31585 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 98db5635b25115bdb94a35e899516abf1b4718b4, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	18112	4161	4037	4037
q2	q3	10796	1399	795	795
q4	4717	472	340	340
q5	8251	2236	2098	2098
q6	358	177	141	141
q7	954	782	641	641
q8	9389	1664	1585	1585
q9	5476	4953	4992	4953
q10	6447	2197	1896	1896
q11	437	266	241	241
q12	691	416	299	299
q13	18218	3262	2757	2757
q14	266	260	236	236
q15	q16	819	766	703	703
q17	1528	926	971	926
q18	6937	5667	6111	5667
q19	1325	1264	1131	1131
q20	518	413	259	259
q21	6022	2750	2575	2575
q22	443	368	305	305
Total cold run time: 101704 ms
Total hot run time: 31585 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5019	4778	4841	4778
q2	q3	4848	5288	4777	4777
q4	2104	2181	1414	1414
q5	4790	4732	4595	4595
q6	237	174	125	125
q7	1735	1589	1421	1421
q8	2244	1941	1916	1916
q9	7276	7276	7322	7276
q10	4710	4697	4197	4197
q11	565	405	390	390
q12	724	736	535	535
q13	3054	3350	2804	2804
q14	276	274	252	252
q15	q16	674	705	611	611
q17	1475	1487	1479	1479
q18	7268	6720	6709	6709
q19	1088	1100	1097	1097
q20	2219	2206	1927	1927
q21	5268	4568	4355	4355
q22	512	449	420	420
Total cold run time: 56086 ms
Total hot run time: 51078 ms

@hello-stephen

Copy link
Copy Markdown
Contributor
TPC-DS: Total hot run time: 171454 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 98db5635b25115bdb94a35e899516abf1b4718b4, data reload: false

query5	4317	655	509	509
query6	323	223	195	195
query7	4277	573	299	299
query8	329	233	218	218
query9	8812	4085	4074	4074
query10	460	336	307	307
query11	5783	2404	2244	2244
query12	190	128	126	126
query13	1273	600	446	446
query14	6125	5460	5201	5201
query14_1	4497	4475	4454	4454
query15	233	210	185	185
query16	1031	456	434	434
query17	1164	699	570	570
query18	2710	478	356	356
query19	215	194	161	161
query20	135	130	128	128
query21	208	143	126	126
query22	13664	13456	13356	13356
query23	17298	16515	16164	16164
query23_1	16396	16226	16481	16226
query24	7428	1755	1301	1301
query24_1	1343	1305	1311	1305
query25	540	461	409	409
query26	1317	302	173	173
query27	2710	559	340	340
query28	4457	1989	2010	1989
query29	1002	613	494	494
query30	312	238	191	191
query31	1148	1076	956	956
query32	103	87	80	80
query33	534	347	307	307
query34	1178	1151	668	668
query35	768	811	707	707
query36	1374	1412	1173	1173
query37	154	101	95	95
query38	3214	3266	3079	3079
query39	920	915	882	882
query39_1	869	873	871	871
query40	234	143	128	128
query41	71	70	68	68
query42	111	107	109	107
query43	328	332	292	292
query44	
query45	213	208	208	208
query46	1072	1231	764	764
query47	2373	2368	2266	2266
query48	410	420	305	305
query49	654	510	396	396
query50	995	366	280	280
query51	4386	4321	4229	4229
query52	109	107	99	99
query53	265	275	207	207
query54	328	293	287	287
query55	102	94	87	87
query56	326	328	314	314
query57	1432	1404	1316	1316
query58	316	281	283	281
query59	1580	1684	1454	1454
query60	325	335	310	310
query61	186	185	181	181
query62	744	653	599	599
query63	249	204	214	204
query64	2424	861	719	719
query65	
query66	1741	487	356	356
query67	29781	28951	29538	28951
query68	
query69	460	329	305	305
query70	1010	946	1022	946
query71	303	275	275	275
query72	2951	2659	2616	2616
query73	865	762	419	419
query74	5084	4923	4755	4755
query75	2709	2604	2271	2271
query76	2280	1154	754	754
query77	393	407	343	343
query78	12482	12426	11850	11850
query79	1474	1077	752	752
query80	674	548	447	447
query81	453	280	241	241
query82	1394	154	122	122
query83	342	267	245	245
query84	258	133	113	113
query85	890	571	438	438
query86	385	338	329	329
query87	3435	3399	3241	3241
query88	3599	2711	2718	2711
query89	452	389	338	338
query90	1943	179	182	179
query91	178	169	141	141
query92	84	92	78	78
query93	1474	1490	943	943
query94	538	359	313	313
query95	702	453	346	346
query96	1138	740	344	344
query97	2704	2711	2581	2581
query98	239	233	230	230
query99	1192	1151	1031	1031
Total cold run time: 254653 ms
Total hot run time: 171454 ms

@hello-stephen

Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 32.00% (16/50) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants