Skip to content

MINOR: Improving log for outstanding requests on close and cleanup#16304

Merged
chia7712 merged 1 commit intoapache:trunkfrom
lianetm:close-logs
Jun 13, 2024
Merged

MINOR: Improving log for outstanding requests on close and cleanup#16304
chia7712 merged 1 commit intoapache:trunkfrom
lianetm:close-logs

Conversation

@lianetm
Copy link
Copy Markdown
Member

@lianetm lianetm commented Jun 12, 2024

Adding log to identify cases where the consumer network thread close may leave pending requests due to close timeout expired.
Clean up renaming for clarity and removing unused params.
No changes in logic.

Copy link
Copy Markdown
Member

@AndrewJSchofield AndrewJSchofield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tidy-up. The new "commitSyncAllConsumed" method name makes much more sense.

Copy link
Copy Markdown
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lianetm thanks for this nice cleanup. one small question is left. PTAL

// Visible for testing
static void runAtClose(final Collection<Optional<? extends RequestManager>> requestManagers,
final NetworkClientDelegate networkClientDelegate,
final Timer timer) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this timer was expected to be used by pollOnClose. It seems to me pollOnClose should take a timeout, so the request managers has the chance to return on time in the future.

For another, the timer should call timer.update() at the end. Otherwise, the later operations (sendUnsentRequests) will get incorrect remaining timeout.

Copy link
Copy Markdown
Member Author

@lianetm lianetm Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that was probably the thinking behind the timer here, but it ended up unused and that actually seemed sensible to me here (because it's used somewhere else). This is how I see it: pollOnClose is only building requests without blocking on any activity that would need a time boundary (just like poll() doesn't take a timer to limit its execution). It's then the sendUnsentRequests the one that takes the requests created by pollOnClose and performs the operation that needs the timer (and it has it already).

Taking the pollOnClose on the FetchRequestManager as example (actually the only manager where it's used), it only generates a fetch request to close the fetch sessions (no timer needed). Makes sense?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for sharing, and I agree that the usage of pollOnClose you described, so +1 to remove timer :)

Copy link
Copy Markdown
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lianetm
Copy link
Copy Markdown
Member Author

lianetm commented Jun 12, 2024

Thanks @chia7712 and @AndrewJSchofield for the reviews!

@chia7712 chia7712 merged commit fe98888 into apache:trunk Jun 13, 2024
apourchet added a commit to apourchet/kafka that referenced this pull request Jun 13, 2024
commit f380cd1
Author: Edoardo Comar <ecomar@uk.ibm.com>
Date:   Thu Jun 13 15:01:08 2024 +0100

    MINOR: Add integration tag to AdminFenceProducersIntegrationTest (apache#16326)

    Add @tag("integration") to AdminFenceProducersIntegrationTest

    Reviewers: Chris Egerton <chrise@aiven.io>

commit 11c85a9
Author: Dongnuo Lyu <139248811+dongnuo123@users.noreply.github.com>
Date:   Thu Jun 13 05:11:01 2024 -0400

    MINOR: Make online downgrade failure logs less noisy and update the timeouts scheduled in `convertToConsumerGroup` (apache#16290)

    This patch:
    - changes the order of the checks in `validateOnlineDowngrade`, so that only when the last member using the consumer protocol leave and the group still has classic member(s), `online downgrade is disabled` is logged if the policy doesn't allow downgrade.
    - changes the session timeout in `convertToConsumerGroup` from `consumerGroupSessionTimeoutMs` to `member.classicProtocolSessionTimeout().get()`.

    Reviewers: David Jacot <djacot@confluent.io>

commit ea60666
Author: Ken Huang <100591800+m1a2st@users.noreply.github.com>
Date:   Thu Jun 13 17:11:37 2024 +0900

    KAFKA-16921 [1/N] Migrate all junit 4 code to junit 5 for connect module (apache#16253)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit 596b945
Author: gongxuanzhang <gongxuanzhang@foxmail.com>
Date:   Thu Jun 13 15:39:32 2024 +0800

    KAFKA-16643 Add ModifierOrder checkstyle rule (apache#15890)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit 103ff5c
Author: Antoine Pourchet <antoine@responsive.dev>
Date:   Thu Jun 13 01:32:39 2024 -0600

    KAFKA-15045: (KIP-924 pt. 24) internal TaskAssignor rename to LegacyTaskAssignor (apache#16318)

    Since the new public API for TaskAssignor shared a name, this rename will prevent users from confusing the internal definition with the public one.

    Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>

commit e59c887
Author: brenden20 <118419078+brenden20@users.noreply.github.com>
Date:   Thu Jun 13 02:30:05 2024 -0500

    KAFKA-16557 Implemented OffsetFetchRequestState toStringBase and added a test for it (apache#16291)

    Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>

commit dd6fcc6
Author: TingIāu "Ting" Kì <kitingiao@gmail.com>
Date:   Thu Jun 13 14:35:33 2024 +0800

    KAFKA-16901 Add unit tests for ConsumerRecords#records(String) (apache#16227)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit fe98888
Author: Lianet Magrans <98415067+lianetm@users.noreply.github.com>
Date:   Thu Jun 13 08:31:16 2024 +0200

    MINOR: Improving log for outstanding requests on close and cleanup (apache#16304)

    Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>

commit 9ddd58b
Author: Chris Egerton <chrise@aiven.io>
Date:   Thu Jun 13 05:43:33 2024 +0200

    MINOR: Add readiness check for connector and separate Kafka cluster in ExactlyOnceSourceIntegrationTest::testSeparateOffsetsTopic (apache#16306)

    Reviewers: Greg Harris <gharris1727@gmail.com>

commit 0a203a9
Author: TingIāu "Ting" Kì <kitingiao@gmail.com>
Date:   Thu Jun 13 09:47:51 2024 +0800

    KAFKA-16938 non-dynamic props gets corrupted due to circular reference between DynamicBrokerConfig and DynamicConfig. (apache#16302)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit 6d1f8f8
Author: Gantigmaa Selenge <39860586+tinaselenge@users.noreply.github.com>
Date:   Thu Jun 13 02:42:39 2024 +0100

    MINOR: Clean up for KafkaAdminClientTest (apache#16285)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit e76e1da
Author: Chris Egerton <chrise@aiven.io>
Date:   Thu Jun 13 02:18:23 2024 +0200

    KAFKA-16935: Automatically wait for cluster startup in embedded Connect integration tests (apache#16288)

    Reviewers: Greg Harris <gharris1727@gmail.com>
apourchet added a commit to apourchet/kafka that referenced this pull request Jun 13, 2024
commit 4333af5
Author: A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Date:   Thu Jun 13 11:27:50 2024 -0700

    KAFKA-15045: (KIP-924 pt. 25) Rename old internal StickyTaskAssignor to LegacyStickyTaskAssignor (apache#16322)

    To avoid confusion in 3.8/until we fully remove all the old task assignors and internal config, we should rename the old internal assignor classes like the StickyTaskAssignor so that they won't be mixed up with the new version of the assignor (which is also named StickyTaskAssignor)

    Reviewers: Bruno Cadonna <cadonna@apache.org>, Josep Prat <josep.prat@aiven.io>

commit f380cd1
Author: Edoardo Comar <ecomar@uk.ibm.com>
Date:   Thu Jun 13 15:01:08 2024 +0100

    MINOR: Add integration tag to AdminFenceProducersIntegrationTest (apache#16326)

    Add @tag("integration") to AdminFenceProducersIntegrationTest

    Reviewers: Chris Egerton <chrise@aiven.io>

commit 11c85a9
Author: Dongnuo Lyu <139248811+dongnuo123@users.noreply.github.com>
Date:   Thu Jun 13 05:11:01 2024 -0400

    MINOR: Make online downgrade failure logs less noisy and update the timeouts scheduled in `convertToConsumerGroup` (apache#16290)

    This patch:
    - changes the order of the checks in `validateOnlineDowngrade`, so that only when the last member using the consumer protocol leave and the group still has classic member(s), `online downgrade is disabled` is logged if the policy doesn't allow downgrade.
    - changes the session timeout in `convertToConsumerGroup` from `consumerGroupSessionTimeoutMs` to `member.classicProtocolSessionTimeout().get()`.

    Reviewers: David Jacot <djacot@confluent.io>

commit ea60666
Author: Ken Huang <100591800+m1a2st@users.noreply.github.com>
Date:   Thu Jun 13 17:11:37 2024 +0900

    KAFKA-16921 [1/N] Migrate all junit 4 code to junit 5 for connect module (apache#16253)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit 596b945
Author: gongxuanzhang <gongxuanzhang@foxmail.com>
Date:   Thu Jun 13 15:39:32 2024 +0800

    KAFKA-16643 Add ModifierOrder checkstyle rule (apache#15890)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit 103ff5c
Author: Antoine Pourchet <antoine@responsive.dev>
Date:   Thu Jun 13 01:32:39 2024 -0600

    KAFKA-15045: (KIP-924 pt. 24) internal TaskAssignor rename to LegacyTaskAssignor (apache#16318)

    Since the new public API for TaskAssignor shared a name, this rename will prevent users from confusing the internal definition with the public one.

    Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>

commit e59c887
Author: brenden20 <118419078+brenden20@users.noreply.github.com>
Date:   Thu Jun 13 02:30:05 2024 -0500

    KAFKA-16557 Implemented OffsetFetchRequestState toStringBase and added a test for it (apache#16291)

    Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>

commit dd6fcc6
Author: TingIāu "Ting" Kì <kitingiao@gmail.com>
Date:   Thu Jun 13 14:35:33 2024 +0800

    KAFKA-16901 Add unit tests for ConsumerRecords#records(String) (apache#16227)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit fe98888
Author: Lianet Magrans <98415067+lianetm@users.noreply.github.com>
Date:   Thu Jun 13 08:31:16 2024 +0200

    MINOR: Improving log for outstanding requests on close and cleanup (apache#16304)

    Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>

commit 9ddd58b
Author: Chris Egerton <chrise@aiven.io>
Date:   Thu Jun 13 05:43:33 2024 +0200

    MINOR: Add readiness check for connector and separate Kafka cluster in ExactlyOnceSourceIntegrationTest::testSeparateOffsetsTopic (apache#16306)

    Reviewers: Greg Harris <gharris1727@gmail.com>

commit 0a203a9
Author: TingIāu "Ting" Kì <kitingiao@gmail.com>
Date:   Thu Jun 13 09:47:51 2024 +0800

    KAFKA-16938 non-dynamic props gets corrupted due to circular reference between DynamicBrokerConfig and DynamicConfig. (apache#16302)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit 6d1f8f8
Author: Gantigmaa Selenge <39860586+tinaselenge@users.noreply.github.com>
Date:   Thu Jun 13 02:42:39 2024 +0100

    MINOR: Clean up for KafkaAdminClientTest (apache#16285)

    Reviewers: Chia-Ping Tsai <chia7712@gmail.com>

commit e76e1da
Author: Chris Egerton <chrise@aiven.io>
Date:   Thu Jun 13 02:18:23 2024 +0200

    KAFKA-16935: Automatically wait for cluster startup in embedded Connect integration tests (apache#16288)

    Reviewers: Greg Harris <gharris1727@gmail.com>
lianetm added a commit to lianetm/kafka that referenced this pull request Jun 17, 2024
…pache#16304)

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
lianetm added a commit to lianetm/kafka that referenced this pull request Jun 17, 2024
…pache#16304)

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
lucasbru pushed a commit that referenced this pull request Jun 18, 2024
…16376)

* MINOR: Improving log for outstanding requests on close and cleanup (#16304)

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>

* KAFKA-16954: fix consumer close to release assignment in background (#16343)

This PR fixes consumer close to avoid updating the subscription state object in the app thread. Now the close simply triggers an UnsubscribeEvent that is handled in the background to trigger callbacks, clear assignment, and send leave heartbeat. Note that after triggering the event, the unsubscribe will continuously process background events until the event completes, to ensure that it allows for callbacks to run in the app thread.
The logic around what happens if the unsubscribe fails remain unchanged: close will log, keep the first exception and carry on.

It also removes the redundant LeaveOnClose event (it used to do the the exact same thing as the UnsubscribeEvent, both calling membershipMgr.leaveGroup).

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants