fix(InlineModelResolver): prevent numbered duplicate models from multi-file OAS 3.1 specs #23856

Open
Shaun-3adesign wants to merge 2 commits into OpenAPITools:master from Shaun-3adesign:fix/inline-model-resolver-deduplicate-existing-components

Shaun-3adesign commented May 23, 2026

fix #23854

Problem

Multi-file OAS 3.1 specs produce numbered duplicate models (e.g.
DeletionRequest1, FlowSegmentPost1, ContainerMapping1) due to three
related bugs in InlineModelResolver:

  1. Pre-existing components/schemas entries were not seeded into the
    deduplication map before flattening, so identical inline schemas were
    re-registered under numbered names.
  2. OAS 3.1 allows $ref + sibling description (per JSON Schema 2020-12);
    the parser produces a Schema with an overridden description, making the
    content hash differ from the canonical registered schema.
  3. The Swagger Parser shares a single resolved Schema object across all
    usages of the same external file (e.g. uuid.json). Properties that
    carry a sibling description overwrite the shared object's description
    in-place, so two Schema instances from the same source file end up with
    entirely different serialised content depending on processing order.
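
Bug 3 can be shown in miniature without the parser. The sketch below is an illustration only: `SharedSchemaMutationDemo`, `applySiblingDescription`, `signatureOf`, and the map-based schema are hypothetical stand-ins, not Swagger Parser or InlineModelResolver code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SharedSchemaMutationDemo {

    /** Hypothetical content signature: a key-sorted dump of the schema's fields. */
    static String signatureOf(Map<String, Object> schema) {
        return new TreeMap<>(schema).toString();
    }

    /**
     * Models the in-place override described in bug 3: the sibling description
     * is written onto the shared instance instead of onto a copy.
     */
    static Map<String, Object> applySiblingDescription(Map<String, Object> shared, String description) {
        shared.put("description", description);
        return shared;
    }

    public static void main(String[] args) {
        Map<String, Object> uuid = new HashMap<>();
        uuid.put("type", "string");
        uuid.put("format", "uuid");
        uuid.put("description", "A UUID");

        // Signature as it would be stored when the schema is first registered.
        String registered = signatureOf(uuid);

        // A later usage with a sibling description mutates the same object.
        Map<String, Object> usage = applySiblingDescription(uuid, "Identifier of the owning flow");

        System.out.println(usage == uuid);                         // prints true
        System.out.println(registered.equals(signatureOf(usage))); // prints false
    }
}
```

Because the second usage is the same object as the first, whichever signature was computed earlier no longer matches the object's current content, which is why an exact content hash alone cannot recognise the two usages as one schema.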

Fix

  • Seed generatedSignature from components/schemas at the start of flatten().
  • Add a structural deduplication map (generatedStructuralSignature) keyed on
    a description-free serialisation. Uses a Jackson ObjectMapper with a
    @JsonIgnoreProperties({"description"}) MixIn registered on Schema.class,
    which suppresses description recursively across the entire schema graph.
    The exact-hash path is tried first to avoid false-positive merges.
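
The lookup order described above (exact hash first, description-free signature as a fallback) can be sketched roughly as follows. This is a simplified model, not the code in this PR: schemas are plain nested maps instead of `Schema` objects, the signature is a key-sorted string instead of Jackson output, and `TwoLevelDedupSketch`, `seed`, and `match` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class TwoLevelDedupSketch {
    // signature -> component name, one map per signature flavour
    private final Map<String, String> exactSignatures = new LinkedHashMap<>();
    private final Map<String, String> structuralSignatures = new LinkedHashMap<>();

    /** Key-sorted serialisation; in structural mode the "description" keyword is skipped. */
    static String signature(Object node, boolean structural) {
        return canonical(node, structural, true);
    }

    private static String canonical(Object node, boolean structural, boolean keysAreKeywords) {
        if (node instanceof Map) {
            TreeMap<String, String> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(e.getKey());
                if (structural && keysAreKeywords && "description".equals(key)) {
                    continue; // ignore the description keyword at every schema level
                }
                // Children of "properties" are keyed by property name, not by keyword,
                // so a property that happens to be called "description" is kept.
                boolean childKeysAreKeywords = !(keysAreKeywords && "properties".equals(key));
                sorted.put(key, canonical(e.getValue(), structural, childKeysAreKeywords));
            }
            return sorted.toString();
        }
        return String.valueOf(node);
    }

    /** Registers an existing component so later inline schemas can reuse its name. */
    void seed(String name, Map<String, Object> schema) {
        exactSignatures.putIfAbsent(signature(schema, false), name);
        structuralSignatures.putIfAbsent(signature(schema, true), name);
    }

    /** Returns the name of a matching component, or null if the schema is new. */
    String match(Map<String, Object> schema) {
        String name = exactSignatures.get(signature(schema, false));
        if (name != null) {
            return name; // exact match wins, which avoids false-positive merges
        }
        return structuralSignatures.get(signature(schema, true));
    }
}
```

In this model, seeding `Uuid` with `{type: string, format: uuid, description: "A UUID"}` and then calling `match` on the same schema with a different description returns `Uuid` through the structural map, while a schema with a different `type` returns `null` and would be registered as a new model.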

Tests

Four regression tests added to InlineModelResolverTest, plus a set of
multi-file YAML fixtures under src/test/resources/3_0/inline-model-resolver-dedup/
that model the structure of the spec that triggered the bugs.


Summary by cubic

Prevents InlineModelResolver from emitting numbered duplicate models when flattening multi-file OpenAPI 3.1 specs, even when the parser mutates shared schemas. Keeps component names canonical and rewrites $refs to the canonical schema.

  • Bug Fixes
    • Seed exact and structural dedup maps with titled components/schemas before flattening; avoids hijacking untitled inline schemas.
    • Structural signature now ignores description, type, and example, and treats default: null as absent; used as a fallback after the exact hash.
    • Add a post-flatten pass to remove numbered duplicates of titled schemas and rewrite all $refs across paths, webhooks, and components.
    • Add regression tests with multi-file YAML fixtures; regenerate samples to reflect merged models/parameters (no functional change).

Written for commit 1214fd1. Summary will update on new commits.

cubic-dev-ai (Bot, Contributor) left a comment

No issues found across 10 files

wing328 (Member) commented May 23, 2026

thanks for the PR

can you please review the change in samples (https://github.com/OpenAPITools/openapi-generator/actions/runs/26329184806/job/77535132168?pr=23856) to see if these are expected?

…i-file OAS 3.1 specs

When a multi-file OpenAPI 3.1 spec uses external $ref schemas (e.g. JSON
Schema files), the swagger-parser resolves them into shared, mutable Java
Schema objects.  Between processing passes the parser mutates these shared
objects in several ways that cause matchGenerated() to produce false misses,
which in turn cause numbered duplicate component schemas to be emitted:

  * Strips 'type' annotations (e.g. type:"string") from property sub-schemas
  * Strips 'description' text from shared property sub-schemas
  * Leaves 'example' set on one $ref usage but absent on another (OAS 3.1
    allows 'example' as a $ref sibling)
  * Stores an explicit 'default: null' in YAML as a Jackson NullNode (a
    non-null Java object serialised as '"default":null'), while a property
    with no 'default' keyword at all produces a Java null that the NON_NULL
    mapper omits — two representations of "no default" that compared unequal

These mutations caused matchGenerated() to fail its structural comparison,
creating e.g. Flow_Segment_1 alongside Flow_Segment for the TAMS-BBC spec.
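
The 'default: null' case is the least obvious of the four, so here is a minimal sketch of the normalisation idea. It is an assumption-laden model rather than the PR's code: the schema is a plain nested map in which an explicit null default is a key mapped to null, and NullDefaultNormalizer / stripNullDefaults are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NullDefaultNormalizer {

    /** Returns a key-sorted copy of the tree with every "default": null entry dropped. */
    static Object stripNullDefaults(Object node) {
        if (node instanceof Map) {
            Map<String, Object> out = new TreeMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(e.getKey());
                if ("default".equals(key) && e.getValue() == null) {
                    continue; // explicit null default: treat it as absent
                }
                out.put(key, stripNullDefaults(e.getValue()));
            }
            return out;
        }
        if (node instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) node) {
                out.add(stripNullDefaults(item));
            }
            return out;
        }
        return node; // scalars pass through, including real defaults such as "available"
    }
}
```

After this pass a property declared with 'default: null' and one with no 'default' keyword produce equal trees, while a property with default "available" still differs from both.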

Changes:

1. IgnoreVolatileFieldsMixIn — extends @JsonIgnoreProperties to strip
   'description', 'type', and 'example' at every level of the schema graph
   when computing the structural signature.

2. computeStructuralSignature() — replaces direct structuralMapper calls.
   After mixin-based serialisation it parses the JSON tree and removes any
   "default":null (NullNode) entries before returning the key, making an
   explicit null default compare equal to an absent default while preserving
   real non-null defaults (e.g. default:"available").

3. Pre-populate generatedSignature / generatedStructuralSignature with
   titled schemas already in components/schemas before flattening begins, so
   that subsequent inline schemas matching existing ones reuse the existing
   name rather than receiving a numbered suffix.  Only titled schemas are
   pre-populated: a schema identified only by its YAML key in
   components/schemas has no inherent identity — two anonymous schemas that
   share the same properties may be intentionally distinct.

4. deduplicateComponents() — post-flatten safety-net pass that scans
   components/schemas for titled schemas whose structural signatures match,
   removes the numbered duplicate (e.g. Flow_Segment_1), and rewrites every
   $ref throughout paths, webhooks, and component schemas to point to the
   canonical name.

5. Tests — two new regression tests in InlineModelResolverTest plus a set
   of multi-file YAML fixtures in
   src/test/resources/3_0/inline-model-resolver-dedup/ that reproduce the
   parser-mutation scenario:
   * resolveInlineModelDeduplicatesWhenParserMutatesPropertyTypes — verifies
     the structural-hash fallback fires when the parser strips 'type' from
     string properties of a shared Schema object
   * deduplicateComponentsRemovesNumberedDuplicateOfTitledSchemaAndRewritesRefs
     — verifies the post-flatten dedup removes the numbered copy and
     rewrites all $refs in path responses
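
As a rough illustration of change 4, the post-flatten pass can be modelled as two steps: find numbered copies of titled schemas, then rewrite every $ref that points at a removed name. The sketch below is not the PR's implementation. It works on plain nested maps, compares schemas with Map.equals instead of the structural signature, assumes the "_<number>" suffix seen in Flow_Segment_1, and uses hypothetical names (ComponentDedupSketch, deduplicate, rewriteRefs).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ComponentDedupSketch {
    static final String PREFIX = "#/components/schemas/";
    static final Pattern NUMBERED = Pattern.compile("(.+)_(\\d+)");

    /** Removes numbered copies of titled schemas; returns duplicate name -> canonical name. */
    static Map<String, String> deduplicate(Map<String, Object> schemas) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (String name : new ArrayList<>(schemas.keySet())) {
            Matcher m = NUMBERED.matcher(name);
            if (!m.matches()) {
                continue; // not a numbered name such as Flow_Segment_1
            }
            Object canonical = schemas.get(m.group(1));
            // Only titled schemas are merged; equality stands in for the structural signature.
            if (canonical instanceof Map
                    && ((Map<?, ?>) canonical).get("title") != null
                    && canonical.equals(schemas.get(name))) {
                schemas.remove(name);
                renames.put(name, m.group(1));
            }
        }
        return renames;
    }

    /** Walks any part of the document and repoints $refs at the canonical names. */
    @SuppressWarnings("unchecked")
    static void rewriteRefs(Object node, Map<String, String> renames) {
        if (node instanceof Map) {
            for (Map.Entry<String, Object> e : ((Map<String, Object>) node).entrySet()) {
                Object value = e.getValue();
                if ("$ref".equals(e.getKey()) && value instanceof String
                        && ((String) value).startsWith(PREFIX)) {
                    String target = renames.get(((String) value).substring(PREFIX.length()));
                    if (target != null) {
                        e.setValue(PREFIX + target);
                    }
                } else {
                    rewriteRefs(value, renames);
                }
            }
        } else if (node instanceof List) {
            for (Object item : (List<Object>) node) {
                rewriteRefs(item, renames);
            }
        }
    }
}
```

In this model rewriteRefs would be called once for each of paths, webhooks, and the remaining component schemas, using the rename map returned by deduplicate; the maps passed in must be mutable.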
… deduplication fixes

Regenerated all samples via bin/generate-samples.sh (maven:3.9-eclipse-temurin-11,
matching CI JDK version) to reflect the corrected InlineModelResolver behaviour
introduced in the two preceding commits.

Affected generators and the root cause for each change:

csharp-generichost (FormModels, net4.7/4.8/8/9/10)
  TestEnumParametersEnumHeaderStringParameter removed; enum_header_string and
  enum_query_string parameters now typed as TestEnumParametersRequestEnumFormString.
  Reason: the header-string and query-string inline enums are structurally identical
  to the form-string enum; the structural deduplication map (ignoring description
  fields) now correctly merges them instead of generating a numbered duplicate.

python / python-aiohttp / python-httpx / python-lazyImports /
python-pydantic-v1 / python-pydantic-v1-aiohttp
  UploadFileWithAdditionalPropertiesRequestObject removed; the
  upload_file_with_additional_properties endpoint's object parameter is now
  typed as TestObjectForMultipartRequestsRequestMarker.
  Reason: same structural-deduplication fix — the inline request schema is
  structurally identical to the pre-existing TestObjectForMultipartRequestsRequestMarker
  component and is now correctly reused rather than generating a new model class.

php-laravel
  FakeApiInterface / FakeController updated to reflect the merged type names
  for both enum and upload-object parameters.

rust-server / rust-server-deprecated (petstore-with-fake-endpoints-models-for-testing)
  TestEnumParametersEnumHeaderStringParameter model and all usages replaced with
  TestEnumParametersRequestEnumFormString across models.rs, cli.rs, client/mod.rs,
  server/mod.rs, and example files. openapi.yaml inlined schemas updated accordingly.

No functional change — the generated clients/servers remain equivalent; only the
model class names for previously-duplicate anonymous schemas have been normalised.

Shaun-3adesign force-pushed the fix/inline-model-resolver-deduplicate-existing-components branch from ff3cfc3 to 1214fd1 on May 23, 2026 19:32

Shaun-3adesign (Author)

> thanks for the PR
>
> can you please review the change in samples (https://github.com/OpenAPITools/openapi-generator/actions/runs/26329184806/job/77535132168?pr=23856) to see if these are expected?

Should be good now

Development

Successfully merging this pull request may close these issues.

[BUG][rust-axum][python-fastapi] Duplicate model generated when same external schema is referenced via allOf chains across multiple files
