CAMEL-23459: Add TTL cleanup for pending async tasks to prevent memory leak #23125

Merged
oscerd merged 2 commits into main from fix/CAMEL-23459 on May 12, 2026
Contributor

@oscerd oscerd commented May 11, 2026

Summary

Fixes memory leak in camel-docling component where pending async conversion tasks accumulate indefinitely in the shared pendingAsyncTasks map.

Changes

  • AsyncTaskEntry: New wrapper class that tracks CompletableFuture with creation timestamp
  • asyncTaskTtl: New configuration option (default 86400000ms / 24 hours)
  • Scheduled cleanup: Background task runs every 10% of TTL (minimum 1 second) to evict expired entries
  • Logging: DEBUG-level logging for lifecycle, evictions, and cleanup summary
  • Documentation: Updated docling-component.adoc with new configuration option
  • Tests: Added DoclingAsyncTaskTtlTest with three test scenarios (Awaitility-based, no Thread.sleep)
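The wrapper-plus-scheduled-cleanup approach listed above can be sketched in plain Java. This is a minimal illustration, not the component's code: TaskEntry, PendingTaskRegistry, the injected clock, and the thread name are all hypothetical, and the real component uses Camel's executor services rather than a raw JDK scheduler. Only the 10%-of-TTL interval and the 1-second floor come from the PR description.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the PR's AsyncTaskEntry: a future plus its creation time. */
final class TaskEntry<T> {
    final CompletableFuture<T> future;
    final long createdAtMillis;

    TaskEntry(CompletableFuture<T> future, long createdAtMillis) {
        this.future = future;
        this.createdAtMillis = createdAtMillis;
    }

    boolean isExpired(long nowMillis, long ttlMillis) {
        return nowMillis - createdAtMillis > ttlMillis;
    }
}

/** Illustrative registry (not the component's real class) that evicts entries older than a TTL. */
final class PendingTaskRegistry<T> implements AutoCloseable {
    private final Map<String, TaskEntry<T>> pending = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler;
    private final LongSupplier clock;
    private final long ttlMillis;

    PendingTaskRegistry(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        // Daemon thread so this sketch never keeps the JVM alive.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "pending-task-cleanup");
            t.setDaemon(true);
            return t;
        });
        // Run every 10% of the TTL, but never more often than once per second.
        long intervalMillis = Math.max(ttlMillis / 10, 1000L);
        scheduler.scheduleAtFixedRate(this::evictExpired, intervalMillis, intervalMillis,
                TimeUnit.MILLISECONDS);
    }

    void register(String taskId, CompletableFuture<T> future) {
        pending.put(taskId, new TaskEntry<>(future, clock.getAsLong()));
    }

    /** Normal path: the caller collects the future and the entry leaves the map. */
    CompletableFuture<T> consume(String taskId) {
        TaskEntry<T> entry = pending.remove(taskId);
        return entry == null ? null : entry.future;
    }

    /** Removes every entry older than the TTL and returns how many were evicted. */
    int evictExpired() {
        long now = clock.getAsLong();
        int evicted = 0;
        for (Map.Entry<String, TaskEntry<T>> e : pending.entrySet()) {
            // remove(key, value) only succeeds if the entry was not replaced or consumed meanwhile.
            if (e.getValue().isExpired(now, ttlMillis) && pending.remove(e.getKey(), e.getValue())) {
                evicted++;
            }
        }
        return evicted;
    }

    int size() {
        return pending.size();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Injecting the clock as a LongSupplier is a choice made for this sketch so that expiry can be exercised without waiting; production code would typically pass System::currentTimeMillis.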

Risk Assessment

Low risk - The change is additive and backward compatible:

  • Existing behavior unchanged (tasks still tracked and consumed normally)
  • New cleanup only removes entries that would never be consumed
  • Default TTL (24 hours) is conservative for typical use cases
  • Cleanup runs in background thread, no impact on processing

Testing

  • Unit tests verify TTL eviction after configured timeout
  • Unit tests verify tasks not evicted before TTL expires
  • Unit tests verify multiple tasks evicted correctly
  • camel-docling module: 49 tests, 0 failures, 0 errors
  • Full reactor build (mvn clean install -DskipTests): SUCCESS, no uncommitted regen artifacts

Review feedback addressed (7c1479f)

  • Replaced Thread.sleep in tests with Awaitility during/atMost pattern (@gnodet)
  • Lowered cleanup-interval floor from 60s → 1s so short TTLs actually fire (@gnodet)
  • Corrected misleading test comment and added scheduler-jitter headroom (@gnodet)
  • Lowered routine lifecycle logs from INFO → DEBUG (@gnodet)
  • Regenerated stale top-level catalog/DSL artifacts (docling.json, DoclingComponentBuilderFactory, DoclingEndpointBuilderFactory) (@davsclaus, @gnodet)
  • Updated pre-existing DoclingAsyncConversionTest to use AsyncTaskEntry (drive-by — the old test was still inserting raw CompletableFuture and throwing ClassCastException)

Related


Bob on behalf of Andrea Cosentino

…y leak

DoclingComponent now tracks pending async-conversion task IDs with timestamps
in AsyncTaskEntry wrapper objects. A scheduled cleanup task runs periodically
(every 10% of TTL, minimum 1 minute) to evict entries older than the configured
asyncTaskTtl (default 24 hours).

- Added AsyncTaskEntry class to wrap CompletableFuture with creation timestamp
- Added asyncTaskTtl configuration option (default 86400000ms / 24 hours)
- Implemented scheduled cleanup using Camel's ScheduledExecutorService
- Added DEBUG logging for evictions with task ID and age
- Updated documentation in docling-component.adoc
- Added unit tests for TTL eviction behavior

Signed-off-by: Andrea Cosentino <ancosen@gmail.com>
@oscerd oscerd force-pushed the fix/CAMEL-23459 branch from e322ab6 to 608c8cf Compare May 11, 2026 13:38
@github-actions
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run
  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
  • You can label PRs using skip-tests and test-dependents to fine-tune the checks executed by this PR.
  • Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@davsclaus
Contributor

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: catalog/camel-catalog/src/generated/resources/org/apache/camel/catalog/components/docling.json
modified: dsl/camel-componentdsl/src/generated/java/org/apache/camel/builder/component/dsl/DoclingComponentBuilderFactory.java
modified: dsl/camel-endpointdsl/src/generated/java/org/apache/camel/builder/endpoint/dsl/DoclingEndpointBuilderFactory.java

Contributor

@gnodet gnodet left a comment

Nice work addressing the memory leak, Andrea — the TTL-based eviction approach is clean and backward-compatible.

A few items to address before this can merge:

Blocking

  1. Missing regenerated files — CI is failing on the "uncommitted changes" check. As Claus already noted, catalog/camel-catalog/.../docling.json, dsl/camel-componentdsl/.../DoclingComponentBuilderFactory.java, and dsl/camel-endpointdsl/.../DoclingEndpointBuilderFactory.java need to be regenerated and committed.

  2. Thread.sleep() in test code: testAsyncTaskNotEvictedBeforeTtl() uses Thread.sleep(1000). The project convention requires Awaitility instead (see CLAUDE.md "Asynchronous Testing" section). See inline comment for a suggested replacement.

Non-blocking suggestions

  3. Test comment vs actual behavior: The comment in testAsyncTaskTtlEviction says "for 2s TTL that's 200ms", but the code enforces a 1-minute minimum (Math.max(ttl / 10, 60000)). If the cleanup interval is actually 60s, the await().atMost(5, SECONDS) assertion may be fragile. Worth double-checking that this passes reliably.

  4. Logging level: doStart(), doStop(), and periodic cleanup all log at INFO. For a component library, routine lifecycle and housekeeping messages are typically logged at DEBUG to avoid noisy output for end users. See inline comment.

Claude Code on behalf of Guillaume Nodet
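To make the interval concern concrete, here is a small sketch of the expression quoted in the review, with the floor turned into a parameter. CleanupInterval and compute are hypothetical names used only for illustration; the source confirms only the Math.max(ttl / 10, floor) shape and the two floor values discussed in this PR (60000 ms and 1000 ms).

```java
/** Hypothetical helper mirroring the expression quoted in the review. */
final class CleanupInterval {
    private CleanupInterval() {
    }

    /** Returns 10% of the TTL, raised to the given floor when that is larger. */
    static long compute(long ttlMillis, long floorMillis) {
        return Math.max(ttlMillis / 10, floorMillis);
    }
}
```

With a 2000 ms test TTL, compute(2000, 60000) yields 60000, so the first cleanup would run well after a 5-second assertion window has closed. With a 1000 ms floor, compute(2000, 1000) yields 1000. For the default TTL, compute(86400000, 1000) yields 8640000 ms (2.4 hours), so the floor only matters for short TTLs.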

- Replace Thread.sleep with Awaitility during/atMost pattern in DoclingAsyncTaskTtlTest
- Lower cleanup-interval floor from 60s to 1s so short TTLs (e.g. test TTLs) actually fire;
  cleanup is cheap when no tasks are expired so the lower minimum is safe
- Override isUseAdviceWith() in DoclingAsyncTaskTtlTest so the component's doStart sees the
  test-configured TTL instead of the default
- Update DoclingAsyncConversionTest to use AsyncTaskEntry (was still inserting
  CompletableFuture into the map, causing ClassCastException)
- Lower lifecycle/housekeeping LOG.info calls to LOG.debug
- Regenerate top-level catalog/DSL artifacts (docling.json, DoclingComponentBuilderFactory,
  DoclingEndpointBuilderFactory) so the asyncTaskTtl option is exposed there too

Bob on behalf of Andrea Cosentino
@oscerd
Contributor Author

oscerd commented May 12, 2026

Pushed 7c1479f addressing all of @gnodet's and @davsclaus's review feedback:

Blocking items

  • Stale generated files (@davsclaus / @gnodet "Blocking 1"): regenerated and committed catalog/.../docling.json, DoclingComponentBuilderFactory.java, and DoclingEndpointBuilderFactory.java. Verified that a clean mvn clean install -DskipTests from the repo root now produces no further uncommitted regen artifacts.
  • Thread.sleep in tests (@gnodet "Blocking 2"): replaced with the suggested Awaitility during/atMost pattern.

Non-blocking suggestions (both applied)

  • Test comment vs actual behavior (@gnodet #3): this turned out to be more than a comment error. With the 60s minimum, the cleanup interval was actually 60s for the 2s test TTL, so the eviction could never happen within the existing atMost(5, SECONDS) window. Lowered the cleanup-interval floor from 60s → 1s (cleanup is cheap when the map is empty, so the lower minimum is safe and makes short TTLs actually meaningful), corrected the test comment, bumped atMost to 8s for headroom, and added isUseAdviceWith() = true so the test-configured TTL is in effect by the time doStart runs.
  • Logging level (@gnodet #4): lowered doStart/doStop/cleanup-summary lines from LOG.info to LOG.debug.

Drive-by
While running the test suite I also found that the pre-existing DoclingAsyncConversionTest wasn't updated when the PR changed the map's value type to AsyncTaskEntry — it was still inserting raw CompletableFuture and producing ClassCastException on 3 tests. Updated the test to wrap in AsyncTaskEntry.
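The failure mode described here can be reproduced in isolation. The sketch below is an assumption about the mechanism: the conversation does not show how the old test inserted its values, so this uses a raw map reference as one way a CompletableFuture can end up in a map whose declared value type is a wrapper. StampedEntry and HeapPollutionDemo are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical wrapper playing the role of AsyncTaskEntry. */
final class StampedEntry {
    final CompletableFuture<String> future;
    final long createdAtMillis;

    StampedEntry(CompletableFuture<String> future, long createdAtMillis) {
        this.future = future;
        this.createdAtMillis = createdAtMillis;
    }
}

final class HeapPollutionDemo {
    private HeapPollutionDemo() {
    }

    /** Inserts a bare future through a raw reference, then reads it back as the wrapper type. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean readThrowsClassCast() {
        Map<String, StampedEntry> pending = new ConcurrentHashMap<>();
        Map raw = pending; // assumption: the stale test bypassed the generic type in some way
        raw.put("task-1", new CompletableFuture<String>());
        try {
            // The compiler inserts a cast to StampedEntry on this assignment, and it fails.
            StampedEntry entry = pending.get("task-1");
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    /** The fix described above: wrap the future before inserting it. */
    static boolean readSucceedsWhenWrapped() {
        Map<String, StampedEntry> pending = new ConcurrentHashMap<>();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put("task-1", new StampedEntry(future, System.currentTimeMillis()));
        return pending.get("task-1").future == future;
    }
}
```

The insertion itself succeeds because generics are erased at runtime; the exception only appears at the read site, which is why the error surfaced in the tests rather than where the value was stored.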

Verification

  • camel-docling: 49 tests, 0 failures, 0 errors (was 3 ClassCastException errors on DoclingAsyncConversionTest + flaky DoclingAsyncTaskTtlTest).
  • Full reactor mvn clean install -DskipTests from root: SUCCESS in 8:42.
  • git status clean after the full reactor build — no further regen artifacts.

cc @Croway @davsclaus @gnodet for re-review when convenient.

Claude Code on behalf of Andrea Cosentino

@github-actions
Contributor

🧪 CI tested the following changed modules:

  • catalog/camel-catalog
  • components/camel-ai/camel-docling
  • dsl/camel-componentdsl
  • dsl/camel-endpointdsl

⚠️ Some tests are disabled on GitHub Actions (@DisabledIfSystemProperty(named = "ci.env.name")) and require manual verification:

  • components/camel-ai/camel-docling: 6 test(s) disabled on GitHub Actions
All tested modules (11 modules)
  • Camel :: AI :: Docling
  • Camel :: Catalog :: Camel Catalog
  • Camel :: Component DSL
  • Camel :: Endpoint DSL
  • Camel :: JBang :: MCP
  • Camel :: JBang :: Plugin :: Route Parser
  • Camel :: JBang :: Plugin :: TUI
  • Camel :: JBang :: Plugin :: Validate
  • Camel :: Launcher :: Container
  • Camel :: YAML DSL :: Validator
  • Camel :: YAML DSL :: Validator Maven Plugin


Contributor

@gnodet gnodet left a comment

All feedback addressed — Thread.sleep replaced with Awaitility, misleading comment fixed, LOG.info demoted to DEBUG, and generated files regenerated. CI is green. LGTM.

Claude Code on behalf of Guillaume Nodet

@oscerd oscerd merged commit 22a687b into main May 12, 2026
7 checks passed
@oscerd oscerd deleted the fix/CAMEL-23459 branch May 12, 2026 12:01
4 participants