Open-API: TableRequirements should use union of subclasses#10434
Conversation
|
@flyrain could you please take a look? |
|
Thanks for raising this @anuragmantri Looking at the changes in the code, it will remove the inheritance: diff --git a/open-api/rest-catalog-open-api.py b/open-api/rest-catalog-open-api.py
index e13aeee..46c7eca 100644
--- a/open-api/rest-catalog-open-api.py
+++ b/open-api/rest-catalog-open-api.py
@@ -361,10 +361,10 @@ class RemovePartitionStatisticsUpdate(BaseUpdate):
class TableRequirement(BaseModel):
- type: str
+ pass
-class AssertCreate(TableRequirement):
+class AssertCreate(BaseModel):
"""
The table must not already exist; used for create transactions
"""
@@ -372,7 +372,7 @@ class AssertCreate(TableRequirement):
type: Literal['assert-create']
-class AssertTableUUID(TableRequirement):
+class AssertTableUUID(BaseModel):
"""
The table UUID must match the requirement's `uuid`
"""
@@ -381,7 +381,7 @@ class AssertTableUUID(TableRequirement):
uuid: str
-class AssertRefSnapshotId(TableRequirement):
+class AssertRefSnapshotId(BaseModel):
"""
The table branch or tag identified by the requirement's `ref` must reference the requirement's `snapshot-id`; if `snapshot-id` is `null` or missing, the ref must not already exist
"""
@@ -391,7 +391,7 @@ class AssertRefSnapshotId(TableRequirement):
snapshot_id: int = Field(..., alias='snapshot-id')
-class AssertLastAssignedFieldId(TableRequirement):
+class AssertLastAssignedFieldId(BaseModel):
"""
The table's last assigned column id must match the requirement's `last-assigned-field-id`
"""
@@ -400,7 +400,7 @@ class AssertLastAssignedFieldId(TableRequirement):
last_assigned_field_id: int = Field(..., alias='last-assigned-field-id')
-class AssertCurrentSchemaId(TableRequirement):
+class AssertCurrentSchemaId(BaseModel):
"""
The table's current schema id must match the requirement's `current-schema-id`
"""
@@ -409,7 +409,7 @@ class AssertCurrentSchemaId(TableRequirement):
current_schema_id: int = Field(..., alias='current-schema-id')
-class AssertLastAssignedPartitionId(TableRequirement):
+class AssertLastAssignedPartitionId(BaseModel):
"""
The table's last assigned partition id must match the requirement's `last-assigned-partition-id`
"""
@@ -4[18](https://github.com/apache/iceberg/actions/runs/9356980269/job/25755651734?pr=10434#step:7:19),7 +418,7 @@ class AssertLastAssignedPartitionId(TableRequirement):
last_assigned_partition_id: int = Field(..., alias='last-assigned-partition-id')
-class AssertDefaultSpecId(TableRequirement):
+class AssertDefaultSpecId(BaseModel):
"""
The table's default spec id must match the requirement's `default-spec-id`
"""
@@ -4[27](https://github.com/apache/iceberg/actions/runs/9356980269/job/25755651734?pr=10434#step:7:28),7 +427,7 @@ class AssertDefaultSpecId(TableRequirement):
default_spec_id: int = Field(..., alias='default-spec-id')
-class AssertDefaultSortOrderId(TableRequirement):
+class AssertDefaultSortOrderId(BaseModel):
"""
The table's default sort order id must match the requirement's `default-sort-order-id`
""" |
|
We don't have to use inheritance here, but I'm OK with it. |
|
Thanks @Fokko and @flyrain - I think we should keep the inheritance and remove the duplicates in sub types. I updated this PR to do that. @flyrain this pattern of discriminator field present also as child field is used in 3 other places
Should we remove child fields and move the descriptor fields to parent all these places to be consistent? |
flyrain
left a comment
There was a problem hiding this comment.
+1 Thanks @anuragmantri for working on it. We can remove the duplication for the other schema in different PRs.
3cf6254 to
dc09bf7
Compare
| The table must not already exist; used for create transactions | ||
| """ | ||
|
|
||
| type: Literal['assert-create'] |
There was a problem hiding this comment.
I'm sad to see this go, but it looks like it is a limitation of the open-api generator, since it could also get this information from the mapping field. I would love to get to know what others think
There was a problem hiding this comment.
It looks like we're using it in an odd way here. The example here also does not use inheritance but returns an union:
class Cat(BaseModel):
pet_type: Literal['cat']
meows: int
class Dog(BaseModel):
pet_type: Literal['dog']
barks: float
class Lizard(BaseModel):
pet_type: Literal['reptile', 'lizard']
scales: bool
class Model(BaseModel):
pet: Union[Cat, Dog, Lizard] = Field(..., discriminator='pet_type')
n: intIt is just that we try to add this field in there. I think the following is more correct:
diff --git a/open-api/rest-catalog-open-api.yaml b/open-api/rest-catalog-open-api.yaml
index f8f68d06df..b85207e57b 100644
--- a/open-api/rest-catalog-open-api.yaml
+++ b/open-api/rest-catalog-open-api.yaml
@@ -2589,18 +2589,6 @@ components:
type: object
required:
- type
- properties:
- type:
- type: "string"
- enum:
- - "assert-create"
- - "assert-table-uuid"
- - "assert-ref-snapshot-id"
- - "assert-last-assigned-field-id"
- - "assert-current-schema-id"
- - "assert-last-assigned-partition-id"
- - "assert-default-spec-id"
- - "assert-default-sort-order-id"
discriminator:
propertyName: type
mapping:
@@ -2612,6 +2600,15 @@ components:
assert-last-assigned-partition-id: '#/components/schemas/AssertLastAssignedPartitionId'
assert-default-spec-id: '#/components/schemas/AssertDefaultSpecId'
assert-default-sort-order-id: '#/components/schemas/AssertDefaultSortOrderId'
+ oneOf:
+ - $ref: '#/components/schemas/AssertCreate'
+ - $ref: '#/components/schemas/AssertTableUUID'
+ - $ref: '#/components/schemas/AssertRefSnapshotId'
+ - $ref: '#/components/schemas/AssertLastAssignedFieldId'
+ - $ref: '#/components/schemas/AssertCurrentSchemaId'
+ - $ref: '#/components/schemas/AssertLastAssignedPartitionId'
+ - $ref: '#/components/schemas/AssertDefaultSpecId'
+ - $ref: '#/components/schemas/AssertDefaultSortOrderId'
AssertCreate:
allOf:
@@ -2636,6 +2633,7 @@ components:
The table branch or tag identified by the requirement's `ref` must reference the requirement's `snapshot-id`;
if `snapshot-id` is `null` or missing, the ref must not already exist
required:
+ - type
- ref
- snapshot-id
properties:
@@ -2651,8 +2649,12 @@ components:
description:
The table's last assigned column id must match the requirement's `last-assigned-field-id`
required:
+ - type
- last-assigned-field-id
properties:
+ type:
+ type: string
+ enum: [ "assert-last-assigned-field-id" ]
last-assigned-field-id:
type: integer
@@ -2662,8 +2664,12 @@ components:
description:
The table's current schema id must match the requirement's `current-schema-id`
required:
+ - type
- current-schema-id
properties:
+ type:
+ type: string
+ enum: [ "assert-current-schema-id" ]
current-schema-id:
type: integer
@@ -2673,8 +2679,12 @@ components:
description:
The table's last assigned partition id must match the requirement's `last-assigned-partition-id`
required:
+ - type
- last-assigned-partition-id
properties:
+ type:
+ type: string
+ enum: [ "assert-last-assigned-partition-id" ]
last-assigned-partition-id:
type: integer
@@ -2684,8 +2694,12 @@ components:
description:
The table's default spec id must match the requirement's `default-spec-id`
required:
+ - type
- default-spec-id
properties:
+ type:
+ type: string
+ enum: [ "assert-default-spec-id" ]
default-spec-id:
type: integer
@@ -2695,8 +2709,12 @@ components:
description:
The table's default sort order id must match the requirement's `default-sort-order-id`
required:
+ - type
- default-sort-order-id
properties:
+ type:
+ type: string
+ enum: [ "assert-default-sort-order-id" ]
default-sort-order-id:
type: integerThis yields (just one example for readability):
+class TableRequirement(BaseModel):
+ __root__: Union[
+ AssertCreate,
+ AssertTableUUID,
+ AssertRefSnapshotId,
+ AssertLastAssignedFieldId,
+ AssertCurrentSchemaId,
+ AssertLastAssignedPartitionId,
+ AssertDefaultSpecId,
+ AssertDefaultSortOrderId,
+ ] = Field(..., discriminator='type')
+
+
+class AssertCreate(BaseModel):
+ """
+ The table must not already exist; used for create transactions
+ """
+
+ type: Literal['assert-create']The __root__ means that the class itself is one of these listed in the Union. This would be more in line with the official open-API documentation (example at the bottom of the page): https://swagger.io/docs/specification/data-models/inheritance-and-polymorphism/
There was a problem hiding this comment.
As I said, we don't have to use inheritance here. I'm OK with union.
There was a problem hiding this comment.
BTW, why do we need a field type in python classes to indicate its type(class name)?
There was a problem hiding this comment.
It makes sense, for example:
class AssertTableUUID(TableRequirement):
"""
The table UUID must match the requirement's `uuid`
"""
type: Literal['assert-table-uuid']
uuid: strWhen this class is converted to JSON using Pydantic, it will result in:
{
"type": "assert-create",
"uuid": "550e8400-e29b-41d4-a716-446655440000"
}I think I prefer the union as well.
There was a problem hiding this comment.
Thanks for your suggestions, I made the changes. The code now uses a union of subclasses as described in the above example.
dc09bf7 to
5777b72
Compare
| - $ref: '#/components/schemas/AssertDefaultSortOrderId' | ||
|
|
||
| AssertCreate: | ||
| allOf: |
There was a problem hiding this comment.
Do we still need allOf in case of union? I think allOf is only needed if we are using inheritance.
There was a problem hiding this comment.
You are right, with union, allOf is not required. I removed it.
There was a problem hiding this comment.
LGTM. I will merge it soon. Hi @Fokko, do you want to take another look?
There was a problem hiding this comment.
Sorry for the late reply, I was traveling. Thanks @anuragmantri and @flyrain for fixing this 👍
|
Thanks @anuragmantri for working on this. Thanks @Fokko for the reivew. |
Since we are using the discriminator field, the type property is inherited from TableRequirements. This PR removed redundant
typeproperties in child classes.