CAMEL-23273: Camel-Jbang-mcp: Sanitize sensitive data in POM content passed to migration tools by oscerd · Pull Request #22344 · apache/camel

oscerd · 2026-03-30T16:15:45Z

Add PomSanitizer utility to detect and mask sensitive data (passwords, tokens, API keys, secrets) in POM content before processing. Add sanitizePom boolean parameter (default: true) to camel_migration_analyze, camel_dependency_check, and camel_migration_wildfly_karaf tools. Update tool descriptions with sanitization guidance.

Changes

PomSanitizer: Regex-based detection of sensitive XML element values with masking. Preserves property placeholders (${...}). Javadoc documents known limitations (false positives/negatives of tag-name-based heuristic).
PomSanitizer.process(): Shared helper used by all three tool methods to avoid code duplication. Returns processed content and a single summary warning when sensitive data is found.
sanitizePom parameter: Added to camel_migration_analyze, camel_dependency_check, and camel_migration_wildfly_karaf. Defaults to true.
Tests: 16 unit tests for PomSanitizer (detection, masking, placeholders, process helper). Integration tests in MigrationToolsTest, MigrationWildflyKarafToolsTest, and DependencyCheckToolsTest verifying sanitization, bypass, and post-sanitization correctness.
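
For readers who want a feel for the masking described above, here is a minimal sketch of tag-name-based masking that leaves `${...}` placeholders alone. It is not the PomSanitizer from this PR: the class name, the keyword list, and the `***MASKED***` marker are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; names and keyword list are assumptions, not the PR's code. */
final class TagNameMaskingSketch {

    // Assumed keyword list: any element whose tag name contains one of these is treated as sensitive.
    private static final Pattern SENSITIVE = Pattern.compile(
            "<([A-Za-z0-9_.-]*(?:password|passphrase|secret|token|apikey|credentials)[A-Za-z0-9_.-]*)>"
                    + "([^<]*)</\\1>",
            Pattern.CASE_INSENSITIVE);

    // A value that is entirely a property placeholder, e.g. ${env.TOKEN}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{[^}]+\\}");

    record Result(String content, List<String> detected) {}

    static Result sanitize(String pom) {
        List<String> detected = new ArrayList<>();
        Matcher m = SENSITIVE.matcher(pom);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String tag = m.group(1);
            String value = m.group(2).trim();
            String replacement;
            if (value.isEmpty() || PLACEHOLDER.matcher(value).matches()) {
                // Nothing literal to hide: keep the element exactly as written.
                replacement = m.group();
            } else {
                detected.add(tag);
                replacement = "<" + tag + ">***MASKED***</" + tag + ">";
            }
            // quoteReplacement stops '$' in the kept text from being read as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return new Result(out.toString(), detected);
    }
}
```

In this sketch a literal `<db.password>` value is replaced, while `<api.token>${env.TOKEN}</api.token>` passes through unchanged because its value is only a placeholder.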

Target

I checked that the commit is targeting the correct branch (Camel 4 uses the main branch)

Tracking

If this is a large change, bug fix, or code improvement, I checked there is a JIRA issue filed for the change (usually before you start working on it).

Apache Camel coding standards and style

I checked that each commit in the pull request has a meaningful subject line and body.
I have run mvn clean install -DskipTests locally from root folder and I have committed all auto-generated changes.

github-actions · 2026-03-30T16:16:27Z

🌟 Thank you for your contribution to the Apache Camel project! 🌟
🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

First-time contributors require MANUAL approval for the GitHub Actions to run
You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot although they are normally detected and executed by CI.
You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.
Build and test logs are available in the summary page. Only Apache Camel committers have access to the summary.

⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

github-actions · 2026-03-30T16:32:30Z

🧪 CI tested the following changed modules:

dsl/camel-jbang/camel-jbang-mcp

gnodet

Review Summary

Claude Code on behalf of Guillaume Nodet

Overview: This PR adds a PomSanitizer utility to detect and mask sensitive data (passwords, tokens, API keys) in POM content before processing by MCP migration tools. It adds a sanitizePom boolean parameter (default: true) to camel_migration_analyze, camel_dependency_check, and camel_migration_wildfly_karaf tools. Includes 21 unit tests for the sanitizer and 3 integration tests.

Verdict: Request changes

Blocking

Rebase needed against current main — This PR was branched before dba5a0f7194e (CAMEL-23270), which added @Tool.Annotations(readOnlyHint, destructiveHint, openWorldHint) to all MCP tools. The PR's versions of MigrationTools.java, DependencyCheckTools.java, and MigrationWildflyKarafTools.java do not include the annotations parameter on @Tool. Merging as-is will either cause conflicts or silently drop the annotations. Please rebase onto current main.

Major

Code duplication — The 13-line sanitization block is copy-pasted identically across all three tool methods:

String processedPom = pomContent;
List<String> sanitizationWarnings = new ArrayList<>();
if (sanitizePom == null || sanitizePom) {
    PomSanitizer.SanitizationResult sr = PomSanitizer.sanitize(pomContent);
    processedPom = sr.pomContent();
    for (String pattern : sr.detectedPatterns()) {
        sanitizationWarnings.add("Sensitive data detected and masked: " + pattern);
    }
}

Consider extracting a helper into PomSanitizer, e.g.:

record ProcessedPom(String content, List<String> warnings) {}
static ProcessedPom process(String pomContent, Boolean sanitize) { ... }

This keeps each tool method clean and ensures consistent behavior if the sanitization logic evolves.
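
A rough sketch of how such a helper might be filled in, keeping the `sanitizePom == null || sanitizePom` default from the duplicated block. The `maskSecrets` method is a hypothetical stand-in for `PomSanitizer.sanitize` so that the example compiles on its own, and the single joined warning is one possible choice rather than the PR's behavior at this point in the review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only; the real helper would live in PomSanitizer and call its sanitize method. */
final class ProcessHelperSketch {

    record SanitizationResult(String pomContent, List<String> detectedPatterns) {}

    record ProcessedPom(String content, List<String> warnings) {}

    // Stand-in detection (assumption): only "password"/"token" tags, and values containing '$' are skipped.
    private static final Pattern SENSITIVE = Pattern.compile(
            "<([\\w.-]*(?:password|token)[\\w.-]*)>([^<$]+)</\\1>", Pattern.CASE_INSENSITIVE);

    static SanitizationResult maskSecrets(String pom) {
        List<String> detected = new ArrayList<>();
        Matcher m = SENSITIVE.matcher(pom);
        while (m.find()) {
            detected.add(m.group(1));
        }
        // replaceAll resets the matcher before rewriting every match.
        String masked = m.replaceAll("<$1>***MASKED***</$1>");
        return new SanitizationResult(masked, detected);
    }

    /** A null flag is treated as true, matching the tools' default. */
    static ProcessedPom process(String pomContent, Boolean sanitize) {
        if (sanitize != null && !sanitize) {
            return new ProcessedPom(pomContent, List.of());
        }
        SanitizationResult sr = maskSecrets(pomContent);
        if (sr.detectedPatterns().isEmpty()) {
            return new ProcessedPom(sr.pomContent(), List.of());
        }
        String summary = "Sensitive data detected and masked: "
                + String.join(", ", sr.detectedPatterns());
        return new ProcessedPom(sr.pomContent(), List.of(summary));
    }
}
```

Each tool method would then reduce to one call, for example `ProcessedPom p = process(pomContent, sanitizePom);`, followed by the existing analysis on `p.content()`.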

Missing integration tests for MigrationTools and MigrationWildflyKarafTools — Sanitization was added to all three tools, but integration tests were only added to DependencyCheckToolsTest. The other two tool test classes should also verify that:
- sanitization masks sensitive data and produces warnings
- sanitizePom=false bypasses sanitization
- analysis still works correctly after sanitization

Minor

<servers> is a settings.xml element, not a pom.xml element — The <servers> section belongs to Maven's settings.xml, not pom.xml. A valid POM should never contain <servers>. While stripping it as a safety net for accidental pastes is harmless, the Javadoc should clarify this (e.g., "Strips <servers> sections which belong to settings.xml and may be accidentally included").

Nit

wasSanitized field is computed but never consumed — SanitizationResult.wasSanitized() is only used in tests, never by the tool methods themselves (they check detectedPatterns instead). Consider removing it or documenting it's for testing/logging only.
Per-pattern warning messages are verbose — The loop prefixes each detected pattern with "Sensitive data detected and masked: ". A single summary warning (e.g., "Sensitive data detected and masked: db.password, api.token") would be more concise in the tool response.

Overall the approach is sound — POM sanitization is a sensible security measure for MCP tools that accept user-provided POM content. The PomSanitizer regex patterns are well-crafted and the property placeholder preservation (${...}) is a nice touch. The test coverage for the sanitizer itself is thorough. The main action items are rebasing onto current main and reducing the code duplication.

gnodet · 2026-03-31T05:09:30Z

Claude Code on behalf of Guillaume Nodet

Follow-up on point 4 from my review — on further thought, stripping <distributionManagement> is questionable and should probably be reconsidered:

<distributionManagement> contains repository URLs and IDs, not credentials. The actual credentials for those repositories live in settings.xml under <servers> (matched by <id>). At most, the URLs reveal internal hostnames, but that's not in the same category as passwords/tokens/API keys.

Stripping it removes legitimate POM information with no real security benefit. I'd suggest removing <distributionManagement> from the sanitizer's scope — or at minimum, making it opt-in rather than default behavior.

gnodet · 2026-03-31T05:11:17Z

Claude Code on behalf of Guillaume Nodet

Another follow-up on the section stripping — the <servers> stripping has the same issue:

<servers> is a settings.xml element, not a pom.xml element. It should never appear in a valid POM file. Since all three tools accept "pom.xml file content" as input, stripping <servers> is guarding against something that can't legitimately exist in the input.

In summary, both section-stripping patterns are questionable:

<servers> — belongs to settings.xml, not pom.xml
<distributionManagement> — valid POM element but contains URLs/IDs, not credentials

I'd suggest removing both section-stripping patterns and keeping only the sensitive element value masking (passwords, tokens, API keys in properties, etc.), which is the part that genuinely adds security value.

gnodet · 2026-03-31T05:12:53Z

Claude Code on behalf of Guillaume Nodet

One more observation on the sensitive data detection approach — the regex is purely tag-name-based, matching any XML element whose name contains keywords like "password", "token", "secret", etc. This has some limitations worth considering:

False positives — non-secret values in elements that happen to contain a keyword:

<password-policy>strict</password-policy> — config value, not a secret
<token-refresh-interval>300</token-refresh-interval> — numeric setting
<secret-sharing-enabled>true</secret-sharing-enabled> — boolean flag

False negatives — actual secrets in elements with non-obvious names:

<db.connection>jdbc:mysql://user:s3cret@host/db</db.connection> — credential embedded in a URL
<my.credential>actual-secret</my.credential> — note: "credential" (singular) is not in the keyword list, only "credentials" (plural)

The heuristic is reasonable as a best-effort safety net, but worth documenting these limitations — especially since false positives could mask legitimate configuration values that the migration analysis might need.
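
To make these cases concrete, here is a small sketch of a substring check on the tag name alone. The keyword list is an assumption (the real list in PomSanitizer may differ); it only mirrors the point that "credentials" is listed in the plural.

```java
import java.util.List;
import java.util.Locale;

/** Sketch of a tag-name-only check; the keyword list is assumed for illustration. */
final class HeuristicLimitsSketch {

    static final List<String> KEYWORDS = List.of("password", "token", "secret", "credentials");

    /** True when the tag name alone would cause the element's value to be masked. */
    static boolean flagged(String tagName) {
        String lower = tagName.toLowerCase(Locale.ROOT);
        return KEYWORDS.stream().anyMatch(lower::contains);
    }

    public static void main(String[] args) {
        List<String> tags = List.of(
                "password-policy",        // flagged, although the value is a policy name
                "token-refresh-interval", // flagged, although the value is a number
                "secret-sharing-enabled", // flagged, although the value is a boolean
                "db.connection",          // not flagged, even if the URL embeds a password
                "my.credential");         // not flagged, only the plural form is listed
        for (String tag : tags) {
            System.out.println(tag + " -> " + (flagged(tag) ? "masked" : "left as-is"));
        }
    }
}
```

Because the check never looks at the value, it cannot tell `strict` from a real password, and it cannot see a credential embedded in a JDBC URL.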

oscerd · 2026-03-31T07:27:01Z

Claude Code on behalf of Andrea Cosentino

Thank you for the thorough review! All feedback has been addressed in the latest commit:

Blocking #1 (Rebase): Checked — no conflicts exist against current main (no changes to the MCP module since the branch point). CAMEL-23270 has not landed on main yet, so no rebase is needed at this time.

Major #2 (Code duplication): Extracted PomSanitizer.process(pomContent, sanitize) helper that returns a ProcessedPom(content, warnings) record. All three tool methods now use this single entry point instead of the duplicated 13-line block.

Major #3 (Missing integration tests): Added MigrationToolsTest and MigrationWildflyKarafToolsTest with sanitization tests covering: sensitive data masking with warnings, sanitizePom=false bypass, analysis correctness after sanitization, and clean POM baseline.

Follow-up: Remove section stripping: Removed both <servers> and <distributionManagement> stripping as suggested. <servers> belongs to settings.xml, and <distributionManagement> contains URLs/IDs, not credentials. Only sensitive element value masking remains.

Follow-up: Document regex limitations: Added comprehensive Javadoc to PomSanitizer documenting false positives (e.g., <password-policy>strict</password-policy>) and false negatives (e.g., credentials in JDBC URLs, singular <my.credential>).

Nit #5 (wasSanitized unused): Removed the field from SanitizationResult.

Nit #6 (Verbose warnings): Consolidated per-pattern warnings into a single summary: "Sensitive data detected and masked: db.password, api.token".

All 189 tests pass, code is formatted.

Croway · 2026-04-01T08:07:48Z

Why should we care about this? We're adding more to the LLM context just to work around a bad practice. Do we really want to be opinionated about that?

oscerd · 2026-04-01T08:14:25Z

> Why should we care about this? We're adding more to the LLM context just to work around a bad practice. Do we really want to be opinionated about that?

Considering the amount of slop we are going to see and face, maybe it will be good to be opinionated.

gnodet · 2026-04-01T08:27:51Z

Claude Code on behalf of Guillaume Nodet

Suggestion: plugin-aware sanitization instead of tag-name heuristics

The current regex approach matches any XML element whose tag name contains keywords like "password", "token", "secret", etc. This has inherent limitations:

False positives — non-secret values get masked:

<password-policy>strict</password-policy> — a config value, not a credential
<token-refresh-interval>300</token-refresh-interval> — a numeric setting
These masked values might be useful for the migration analysis

False negatives — actual secrets are missed:

Credentials stored in elements with non-obvious names
<my.credential>secret</my.credential> — "credential" (singular) isn't in the keyword list

A more accurate approach would be plugin/mojo-aware:

Maintain a catalog of known plugins and their sensitive configuration parameters:

maven-deploy-plugin       → [password]
maven-jarsigner-plugin    → [storepass, keypass]
docker-maven-plugin       → [password, authConfig/password]
sql-maven-plugin          → [password]
maven-scm-plugin          → [password, passphrase]
...

Parse the POM as XML, identify each <plugin> by its <artifactId>, look up its sensitive config params in the catalog.
Trace property references: if a sensitive param uses ${prop.name}, resolve it back to <properties> and mask the property value there.

This is more work than the current regex, but it's precise — no false positives on config values, no false negatives on credentials in non-obviously-named elements, and no need to strip entire POM sections (<servers> belongs to settings.xml not pom.xml, and <distributionManagement> contains URLs not credentials).
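
The three steps above could be sketched roughly as follows, using the JDK's DOM parser. This is an illustration under stated assumptions, not a proposed implementation: the class and method names are hypothetical, the catalog holds only two of the entries listed above, and only plugin-level `<configuration>` is inspected (per-execution configuration is ignored).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch of catalog-driven masking; not code from the PR. */
final class PluginAwareSketch {

    static final String MASK = "***MASKED***";

    // Illustrative subset of the catalog: artifactId -> sensitive configuration parameters.
    static final Map<String, Set<String>> CATALOG = Map.of(
            "maven-jarsigner-plugin", Set.of("storepass", "keypass"),
            "sql-maven-plugin", Set.of("password"));

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            // User-supplied XML: refuse DOCTYPE declarations to avoid entity expansion.
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse POM", e);
        }
    }

    /** Masks catalogued values in place and returns the parameter or property names masked. */
    static List<String> mask(Document pom) {
        List<String> masked = new ArrayList<>();
        NodeList plugins = pom.getElementsByTagName("plugin");
        for (int i = 0; i < plugins.getLength(); i++) {
            Element plugin = (Element) plugins.item(i);
            Element artifactId = child(plugin, "artifactId");
            String id = artifactId == null ? "" : artifactId.getTextContent().trim();
            Set<String> sensitive = CATALOG.get(id);
            Element config = child(plugin, "configuration");
            if (sensitive == null || config == null) {
                continue;
            }
            for (String param : sensitive) {
                Element e = child(config, param);
                if (e == null) {
                    continue;
                }
                String value = e.getTextContent().trim();
                if (value.startsWith("${") && value.endsWith("}")) {
                    // Trace the reference back to <properties> and mask the value there.
                    String prop = value.substring(2, value.length() - 1);
                    Element props = child(pom.getDocumentElement(), "properties");
                    Element target = props == null ? null : child(props, prop);
                    if (target != null) {
                        target.setTextContent(MASK);
                        masked.add(prop);
                    }
                } else if (!value.isEmpty()) {
                    e.setTextContent(MASK);
                    masked.add(param);
                }
            }
        }
        return masked;
    }

    /** First direct child element with the given name, or null. */
    static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element e && e.getNodeName().equals(name)) {
                return e;
            }
        }
        return null;
    }
}
```

With this shape, an element such as `<password-policy>` in an unrelated plugin is never touched, and a secret referenced as `${sign.key}` is masked where it is defined rather than where it is used.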

…nt passed to migration tools

Add PomSanitizer utility to detect and mask sensitive data (passwords, tokens, API keys, secrets) in POM content before processing. Strips <servers> and <distributionManagement> sections. Add sanitizePom boolean parameter (default: true) to camel_migration_analyze, camel_dependency_check, and camel_migration_wildfly_karaf tools. Update tool descriptions with sanitization guidance. Add 21 tests covering detection, masking, placeholder preservation, and tool integration.

Signed-off-by: Andrea Cosentino <ancosen@gmail.com>

- Remove <servers> and <distributionManagement> section stripping (servers belongs to settings.xml, distributionManagement contains URLs not credentials)
- Extract PomSanitizer.process() helper to eliminate code duplication across MigrationTools, DependencyCheckTools, and MigrationWildflyKarafTools
- Consolidate per-pattern warnings into a single summary warning
- Remove unused wasSanitized field from SanitizationResult
- Document regex-based detection limitations (false positives/negatives)
- Add sanitization integration tests for MigrationTools and MigrationWildflyKarafTools

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

oscerd · 2026-04-01T09:25:12Z

Claude Code on behalf of Andrea Cosentino

Rebased onto current main — the branch now includes CAMEL-23270 (@Tool.Annotations). All 189 tests pass.

Regarding the plugin-aware sanitization suggestion: I agree this would be more precise long-term, but it's a significantly larger effort (maintaining a plugin catalog, XML parsing, property tracing). The current tag-name heuristic is documented with its limitations (false positives/negatives in the Javadoc) and serves as a reasonable best-effort safety net for the initial implementation. We could evolve toward plugin-aware sanitization in a follow-up if the heuristic proves insufficient in practice.

github-actions Bot added the dsl label Mar 30, 2026

oscerd requested review from Croway and luigidemasi March 30, 2026 16:44

gnodet reviewed Mar 30, 2026

oscerd changed the title ~~CAMEL-23273 - Camel-Jbang-mcp: Warn about sensitive data in POM conte…~~ CAMEL-23273: Camel-Jbang-mcp: Sanitize sensitive data in POM content passed to migration tools Mar 31, 2026

oscerd requested a review from gnodet March 31, 2026 07:47

Croway approved these changes Apr 1, 2026

oscerd and others added 2 commits April 1, 2026 11:23

oscerd force-pushed the CAMEL-23273 branch from e70d594 to 1cc1f9d Compare April 1, 2026 09:25

oscerd merged commit 95ca2a8 into main Apr 1, 2026
5 checks passed

oscerd deleted the CAMEL-23273 branch April 1, 2026 11:32
