Skip to content

Kafka-14748: Relax non-null FK left-join requirement#14107

Merged
wcarlson5 merged 8 commits intoapache:trunkfrom
florin-akermann:KAFKA-14748
Dec 6, 2023
Merged

Kafka-14748: Relax non-null FK left-join requirement#14107
wcarlson5 merged 8 commits intoapache:trunkfrom
florin-akermann:KAFKA-14748

Conversation

@florin-akermann
Copy link
Copy Markdown
Contributor

Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@florin-akermann florin-akermann marked this pull request as draft July 26, 2023 21:36
@florin-akermann florin-akermann marked this pull request as ready for review July 26, 2023 21:36
@florin-akermann florin-akermann marked this pull request as draft July 26, 2023 21:40
@florin-akermann florin-akermann marked this pull request as ready for review July 26, 2023 21:44
@mjsax mjsax added the streams label Jul 26, 2023
@florin-akermann florin-akermann marked this pull request as draft August 11, 2023 19:05
@florin-akermann florin-akermann force-pushed the KAFKA-14748 branch 3 times, most recently from 19dfb4b to fdf411c Compare August 17, 2023 21:13
@florin-akermann florin-akermann force-pushed the KAFKA-14748 branch 3 times, most recently from e51c5cc to 47a84fc Compare November 11, 2023 22:56
@florin-akermann florin-akermann marked this pull request as ready for review November 11, 2023 23:00
@wcarlson5
Copy link
Copy Markdown
Contributor

I'm going to start my review of this today. Hopefully can be actionable sometime tomorrow.

Copy link
Copy Markdown
Contributor

@wcarlson5 wcarlson5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments

</code>
</pre>
</p>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for remembering the docs update!


private void forward(final Record<K, Change<V>> record, final KO foreignKey, final Instruction deleteKeyNoPropagate) {
final SubscriptionWrapper<K> wrapper = new SubscriptionWrapper<>(
hash(record),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many times is this record rehased?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are branches where it is hashed twice. I puhsed a change to cache the hash value.

instruction = PROPAGATE_ONLY_IF_FK_VAL_AVAILABLE;
}
final KO newForeignKey = foreignKeyExtractor.apply(record.value().newValue);
final KO newForeignKey = record.value().newValue == null ? null : foreignKeyExtractor.apply(record.value().newValue);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems redundant with line 165. That check the newValue is not null

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed!, thanks, removed.

@florin-akermann
Copy link
Copy Markdown
Contributor Author

Just a couple of comments

Thank you @wcarlson5 for the comments.
I pushed according changes.

While I have your attention could we revisit the adjutments done to the optimization 'drop null key records on repartitioning' as part of #14174 (comment).
To me it seems odd to apply the optimization in some cases and not in others.
E.g. as described in the linked comment, there could be this confusing case where null-key records don't get propagated because of a explicit repartition call at the end of the topology.

I was wondering whether it would make sense to adjust the optimization from 'filter null-key records in repartition nodes if no left or outer-join is downstream' to 'filter null-key records in repartition nodes if no left or outer-join node is part of this branch of the topology (aka no left or outer-join is up- or downstream of this repartition node)'

@wcarlson5
Copy link
Copy Markdown
Contributor

I'm not sure, either option for the optimization is fine with me as long as it's well documented.

I'm good with how the PR is for now. I'm going to merge it to get it in before feature freeze for 3.7

@wcarlson5 wcarlson5 merged commit 4a958c6 into apache:trunk Dec 6, 2023
gaurav-narula pushed a commit to gaurav-narula/kafka that referenced this pull request Jan 24, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <wcarlson@apace.org>
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <wcarlson@apace.org>
AnatolyPopov pushed a commit to aiven/kafka that referenced this pull request Feb 16, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <wcarlson@apace.org>
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <wcarlson@apace.org>
if (oldForeignKey != null && !Arrays.equals(serialize(newForeignKey), serialize(oldForeignKey))) {
forward(record, oldForeignKey, DELETE_KEY_AND_PROPAGATE);
}
forward(record, newForeignKey, PROPAGATE_NULL_IF_NO_FK_VAL_AVAILABLE);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

@florin-akermann florin-akermann May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjsax omg, thanks for the flag! Looks like @AyoubOm is addressing it in https://issues.apache.org/jira/browse/KAFKA-16394 already? Else i'll adress it.

@mjsax mjsax added the kip Requires or implements a KIP label Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kip Requires or implements a KIP streams

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants