Fix EnumSerializer.write to accept numpy scalars from NDArray of records #285

Open
SAY-5 wants to merge 1 commit into microsoft:main from SAY-5:say5-fix-enum-numpy-ndarray

SAY-5 commented May 11, 2026

Closes #284.

EnumSerializer.write assumed its value argument was always a Python Enum, but when called from RecordSerializer._write on the numpy write path (where a record is iterated as value['fieldname'] from a np.void scalar), it receives a bare np.integer and crashes with AttributeError: 'numpy.uint64' object has no attribute 'value'. This affected any T[] whose element type was a record containing one or more enum fields.

The fix unwraps .value only when value is an Enum; numpy scalars (which _integer_serializer.write already accepts) are passed through unchanged. It also adds python/tests/test_enum_in_ndarray.py, which covers the original NDArray-of-record-with-enums reproducer, the bare numpy-scalar path directly through EnumSerializer.write, and the existing Python-Enum path (to guard against regression).
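
The failure mode and the guard described above can be illustrated with a minimal sketch. It does not use the real serializer classes: the `Fruit` enum, the `record_dtype` layout, and the two helper functions are hypothetical stand-ins, assuming only that enum fields are stored as unsigned integers in the record dtype.

```python
from enum import Enum

import numpy as np


class Fruit(Enum):  # hypothetical enum, not taken from the test model
    APPLE = 1
    BANANA = 2


# Hypothetical record dtype with one enum-backed field stored as uint64.
record_dtype = np.dtype([("fruit", np.uint64), ("weight", np.float64)])
records = np.array([(Fruit.BANANA.value, 0.25)], dtype=record_dtype)

# records[0] is an np.void scalar; indexing it by field name yields np.uint64.
field = records[0]["fruit"]


def to_int_before_fix(value):
    # Assumes value is always a Python Enum, as the old write path did.
    return value.value


def to_int_after_fix(value):
    # Unwrap Enum members; pass numpy scalars (and plain ints) through.
    return value.value if isinstance(value, Enum) else value


try:
    to_int_before_fix(field)
    old_path_failed = False
except AttributeError:
    # numpy scalars have no .value attribute
    old_path_failed = True
```

With the guard in place, the same helper handles both `Fruit.APPLE` and the numpy scalar read from the record, which is the property the PR relies on.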

SAY-5 force-pushed the say5-fix-enum-numpy-ndarray branch from 2a9f25d to 6f48815 on May 11, 2026 08:41
Comment thread python/tests/test_enum_in_ndarray.py Outdated
Contributor

This is not the right way to test this.

The test model should be updated such that it reproduces the issue. Then regression tests can be added to test_generated_types.py and test_protocol_roundtrip.py. Then, an equivalent test should be added for C++ and MATLAB to ensure no similar bugs exist.

Author

SAY-5 commented May 11, 2026

Understood, that's a much bigger refactor than I have bandwidth for; happy to close in favour of a more thorough fix.

Author

SAY-5 commented May 11, 2026

Understood, the proper test should reproduce the bug through the YAML test model and have regression coverage in the generated-types and protocol-roundtrip suites for Python, C++, and MATLAB. That's a substantially different scope from this PR, so I'll close this and open a follow-up once the test model and codegen changes are ready.

Author

SAY-5 commented May 11, 2026

Closing per maintainer feedback, will resubmit with the proper test model approach.

SAY-5 closed this May 11, 2026
SAY-5 reopened this May 12, 2026
Author

SAY-5 commented May 12, 2026

Reopened. Reworking per your feedback: I'll update the test model to reproduce the issue, add regression tests in test_generated_types.py and test_protocol_roundtrip.py, and add equivalent coverage for C++ and MATLAB to confirm no similar bugs exist there. Will push and re-ping.

SAY-5 force-pushed the say5-fix-enum-numpy-ndarray branch from 6f48815 to 58785a3 on May 19, 2026 06:50
Author

SAY-5 commented May 19, 2026

Reworked as discussed. Added recArray: RecordWithEnums[] to the Enums protocol in the test model and regenerated all backends, then added regression coverage to the Python, C++, and MATLAB roundtrip suites plus a dtype assertion in test_generated_types.py. The roundtrip exposed a matching read-path bug, so RecordSerializer.read_numpy now reads enum and nested-record fields via read_numpy so they are numpy-assignable. Removed the standalone test file. Local Python (100 passed), C++ (155 passed) and Go tooling suites are green; CI is waiting on workflow approval.

naegelejd requested a review from Copilot May 19, 2026 14:02

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes a crash in EnumSerializer.write when called from the numpy-record write path where enum fields arrive as bare numpy scalars (rather than Python Enum instances). Also adjusts RecordSerializer.read_numpy so enum and nested-record fields are read via read_numpy, producing numpy-assignable values. Adds a new recArray step to the Enums test protocol and corresponding generated code across C++, Python, and MATLAB targets, plus roundtrip tests.

Changes:

  • Unwrap .value in EnumSerializer.write only when input is an Enum; pass numpy scalars through.
  • Make RecordSerializer.read_numpy dispatch to read_numpy on Enum/Record field serializers to keep numpy-compatible values.
  • Add recArray: RecordWithEnums[] step to the Enums test protocol and regenerate all targets/tests.
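
The read-path change in the second bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in _binary.py: `ToyIntSerializer`, `ToyEnumSerializer`, `read_fields`, and the `Color` enum are hypothetical, and the "stream" is simply an iterator of already-decoded integers.

```python
from enum import Enum

import numpy as np


class Color(Enum):  # hypothetical enum for illustration
    RED = 0
    GREEN = 1


class ToyIntSerializer:
    def read(self, stream):
        # The "stream" here is just an iterator of decoded integers.
        return next(stream)


class ToyEnumSerializer:
    def __init__(self, enum_type):
        self._enum_type = enum_type

    def read(self, stream):
        # Object path: returns a Python Enum member.
        return self._enum_type(next(stream))

    def read_numpy(self, stream):
        # Numpy path: returns the raw integer so it fits the field dtype.
        return next(stream)


def read_fields(serializers, stream):
    # Dispatch to read_numpy for enum serializers, read() for the rest.
    return tuple(
        s.read_numpy(stream) if isinstance(s, ToyEnumSerializer) else s.read(stream)
        for s in serializers
    )


dtype = np.dtype([("color", np.uint8), ("count", np.int32)])
serializers = [ToyEnumSerializer(Color), ToyIntSerializer()]
row = read_fields(serializers, iter([1, 7]))
arr = np.array([row], dtype=dtype)
```

The tuple produced by `read_fields` contains only plain integers, so it can be placed directly into a structured array; the object path (`read`) would instead have produced a `Color` member for the first field.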

Reviewed changes

Copilot reviewed 9 out of 25 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tooling/internal/python/static_files/_binary.py | Core fix: tolerate numpy scalars in EnumSerializer.write; route Enum/Record subfields via read_numpy in RecordSerializer.read_numpy. |
| python/tests/test_protocol_roundtrip.py | Adds NDArray-of-records-with-enums roundtrip exercise. |
| python/tests/test_generated_types.py | Asserts dtype layout for RecordWithEnums. |
| python/test_model/protocols.py | Regenerated: adds write_rec_array/read_rec_array and updated state machine. |
| python/test_model/binary.py | Regenerated binary impls for new recArray step. |
| python/test_model/ndjson.py | Regenerated NDJSON impls for new recArray step. |
| models/test/unittests.yml | Adds recArray field to the Enums protocol. |
| matlab/test/RoundTripTest.m | Adds matching MATLAB roundtrip data for recArray. |
| matlab/generated/+test_model/EnumsWriterBase.m, EnumsReaderBase.m | Regenerated MATLAB base classes with new state. |
| matlab/generated/+test_model/+binary/EnumsWriter.m, EnumsReader.m | Regenerated MATLAB binary impls. |
| matlab/generated/+test_model/+testing/*.m | Regenerated MATLAB testing mocks. |
| cpp/test/roundtrip_test.cc | Adds matching C++ roundtrip data for recArray. |
| cpp/test/generated/protocols.{h,cc} | Regenerated base protocol with new step and state. |
| cpp/test/generated/{binary,ndjson,hdf5}/protocols.{h,cc} | Regenerated backend impls. |
| cpp/test/generated/mocks.cc | Regenerated mock with new expectation method. |
| cpp/test/generated/model.json | Regenerated schema. |
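
As a rough idea of what a dtype-layout assertion like the one listed for test_generated_types.py might look like, here is a minimal sketch. The field names, integer widths, and the `layout_matches` helper are assumptions for illustration only; the actual fields of RecordWithEnums are not shown in this PR page.

```python
import numpy as np

# Hypothetical layout: each enum field is stored as its underlying integer type.
record_with_enums_dtype = np.dtype([("kind", np.int32), ("flags", np.uint64)])


def layout_matches(dtype, expected):
    """Return True when dtype has exactly the expected (name, type) fields, in order."""
    actual = [(name, dtype.fields[name][0]) for name in dtype.names]
    return actual == [(name, np.dtype(t)) for name, t in expected]


ok = layout_matches(
    record_with_enums_dtype, [("kind", np.int32), ("flags", np.uint64)]
)
wrong_width = layout_matches(
    record_with_enums_dtype, [("kind", np.int64), ("flags", np.uint64)]
)
```

Checking field names and element types together catches the case where an enum field is accidentally given an object dtype or the wrong integer width.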

Comment on lines +1329 to +1336

    # Enum and nested-record fields must be read via read_numpy so they
    # yield numpy-assignable values rather than Enum or record objects.
    return cast(
        np.void,
        tuple(
            serializer.read_numpy(stream)
            if isinstance(serializer, (EnumSerializer, RecordSerializer))
            else serializer.read(stream)

Comment on lines +844 to +845

    int_value = value.value if isinstance(value, Enum) else value
    self._integer_serializer.write(stream, int_value)
Development

Successfully merging this pull request may close these issues.

Python EnumSerializer raises AttributeError on NDArray of Record types containing enum fields

3 participants