Skip to content

AVRO-2825: [csharp] Resolve: C# Logical Types throw exception on unkn…#2751

Open
TomBruns wants to merge 8 commits intoapache:mainfrom
TomBruns:main
Open

AVRO-2825: [csharp] Resolve: C# Logical Types throw exception on unkn…#2751
TomBruns wants to merge 8 commits intoapache:mainfrom
TomBruns:main

Conversation

@TomBruns
Copy link
Copy Markdown

AVRO-2825

What is the purpose of the change

  • This pull request resolves an outstanding issue with the csharp implementation behavior that is not consistent with the AVRO spec and the java behavior.
  • Per the AVRO Spec:
    • Language implementations must ignore unknown logical types when reading, and should use the underlying Avro type
  • The current csharp implementation throws an exception for unrecognized Logical Types.

NOTE: This PR WILL change the behavior of the current nuget package. It corrects it to align with the AVRO spec.

This is a new version of PR #2741 incorporating all feedback from KalleOlaviNiemitalo

Avoid swallowing any exceptions thrown for other reasons.
Allow applications to parse a schema and read the name of the logical type from LogicalSchema.LogicalTypeName (or even LogicalType.Name) regardless of whether the library recognizes it.
Allow applications to parse a schema and serialize it back to JSON without losing the unrecognized logical types here:
LogicalSchema.LogicalType could then be null, or perhaps an instance of a new NotSupportedLogicalType class that would pass everything through without conversion. (I created a new type: UnknownLogicalType)

Verifying this change

This change is already covered by existing tests:

  • test > AvroGen > AvroGenSchemaTests.cs > NotSupportedSchema

    • corrected expected result of Test
  • test > Schema > SchemaTests.cs > TestUnknownLogical

    • corrected expected result of Test

This change added tests and can be verified as follows:

  • test > Util > LogicalTypeTests.cs > TestUnknownLogicalType
    • added more complex test to confirm underlying AVRO base type is used.

Documentation

  • Does this pull request introduce a new feature? (no)

@github-actions github-actions bot added the C# label Feb 19, 2024
Comment thread lang/csharp/src/apache/codegen/Properties/launchSettings.json Outdated
Comment thread lang/csharp/src/apache/main/Properties/launchSettings.json Outdated
Comment thread lang/csharp/src/apache/test/Schema/SchemaTests.cs Fixed
Comment thread lang/csharp/src/apache/test/Schema/SchemaTests.cs
Comment thread lang/csharp/src/apache/test/Schema/SchemaTests.cs Fixed
@TomBruns
Copy link
Copy Markdown
Author

resolved all identified issues in new PR

@TomBruns TomBruns requested a review from martin-g February 21, 2024 14:10
Comment thread lang/csharp/src/apache/test/AvroGen/AvroGenSchemaTests.cs Outdated
Comment thread lang/csharp/src/apache/test/Schema/SchemaTests.cs Outdated
@TomBruns TomBruns requested a review from martin-g February 22, 2024 13:24
@martin-g
Copy link
Copy Markdown
Member

We need a C# developer/user to review the changes.

@TomBruns
Copy link
Copy Markdown
Author

Is there something I need to do to allow this PR to move forward?

/// </summary>
/// <param name="nullible">A flag indicating whether it should be nullible.</param>
/// <returns>Type.</returns>
public override Type GetCSharpType(bool nullible)
Copy link
Copy Markdown

@a-kalashnikov a-kalashnikov Apr 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps you meant nullable instead of nullible?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your comment but that name pre-existed in many files in this codebase so I choose to be consistent.

@TomBruns
Copy link
Copy Markdown
Author

TomBruns commented Apr 8, 2024

I am not sure why this still shows are changes requested, they all have been made to my knowledge.

@TomBruns
Copy link
Copy Markdown
Author

TomBruns commented Apr 8, 2024

all changes made

@TomBruns TomBruns closed this Apr 8, 2024
@TomBruns TomBruns reopened this Apr 8, 2024
@martin-g martin-g dismissed their stale review April 9, 2024 12:35

A C# dev should take a look

@martin-g martin-g requested a review from zcsizmadia April 9, 2024 12:44
@zcsizmadia
Copy link
Copy Markdown
Contributor

LGTM. Thanks @TomBruns!

{ TimestampMillisecond.LogicalTypeName, new TimestampMillisecond() },
{ TimestampMicrosecond.LogicalTypeName, new TimestampMicrosecond() },
{ Uuid.LogicalTypeName, new Uuid() }
{ Uuid.LogicalTypeName, new Uuid() },
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comma plz

@kaylenmistry
Copy link
Copy Markdown

+1 to this change

@zcsizmadia is there anything blocking this from being accepted?

case @"double":
return nullible ? typeof(System.Double?) : typeof(System.Double);
case @"bytes":
return nullible ? typeof(System.Byte?[]) : typeof(System.Byte[]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System.Byte?[] is an array type in which each element is a nullable byte. It is not the the correct type for Avro {"type": [ "null", "bytes" ]}, which allows either a null value or an array of non-null bytes. So this should return typeof(System.Byte[]) regardless of nullible, similarly to the case "string" above.

Comment on lines +52 to +73
public override object ConvertToBaseValue(object logicalValue, LogicalSchema schema)
{
switch (schema.Name)
{
case @"string":
return (System.String)logicalValue;
case @"boolean":
return (System.Boolean)logicalValue;
case @"int":
return (System.Int32)logicalValue;
case @"long":
return (System.Int64)logicalValue;
case @"float":
return (System.Single)logicalValue;
case @"double":
return (System.Double)logicalValue;
case @"bytes":
return (System.Byte[])logicalValue;
default:
return logicalValue;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These casts are almost no-ops, because the result of each cast is immediately converted back to object. I think this method could just return logicalValue; always regardless of schema.Name.

The casts have two effects, but I don't think those effects matter here, because the logical value should be the same as the base value, which should be an instance of one of the Avro standard types:

  • If the logicalValue is not compatible with the expected type, then it throws InvalidCastException.
  • In some cases, the .NET runtime allows unboxing from a type that does not match exactly. For example, casting from Object to Int32 works even if the Object is actually a boxed UInt32 or a boxed enum whose underlying type is Int32 or UInt32.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants