
feat(c++/python): support stream deserialization for c++ and python #3307

Merged
chaokunyang merged 21 commits into apache:main from chaokunyang:cpp_stream_deserialization
Feb 26, 2026

@chaokunyang (Collaborator) commented Feb 6, 2026

Why?

C++ and Python deserialization currently assumes data is already materialized in memory-backed buffers. This PR adds stream-backed deserialization support so payloads can be read incrementally from input streams while preserving existing serialization behavior and error handling.

What does this PR do?

  • Adds stream infrastructure in C++ (ForyInputStreamBuf/ForyInputStream) and integrates it with Buffer so reads can request more bytes on demand.
  • Adds Python-to-C++ stream bridge (Fory_PyCreateBufferFromStream) so pyfory.buffer.Buffer can be constructed from Python objects that implement read(size).
  • Updates deserialization paths to be stream-safe by using ensure_size checks for header/string/fixed-field reads and falling back from batched varint reads on stream-backed buffers.
  • Extends build rules to compile/link new stream components and adds stream-focused C++ tests (stream_test.cc, buffer stream tests).
  • Adds Python stream tests (python/pyfory/tests/test_stream.py) and buffer stream coverage in python/pyfory/tests/test_buffer.py.
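The on-demand read pattern described in the list above can be sketched in plain Python. This is a minimal illustration, not the PR's code: `StreamBackedBuffer`, `TrickleStream`, and the method names other than `ensure_size` are hypothetical, and the varint layout is assumed to be LEB128-style (7 payload bits per byte, high bit as continuation flag). It shows a buffer that pulls bytes from any object with `read(size)` only when a read needs them, and that decodes varints one byte at a time instead of peeking at a batch of bytes that may not have arrived yet.

```python
import io
import struct


class TrickleStream:
    """Hypothetical test stream that hands out at most one byte per read()."""

    def __init__(self, payload):
        self._inner = io.BytesIO(payload)

    def read(self, size):
        return self._inner.read(min(size, 1))


class StreamBackedBuffer:
    """Hypothetical stand-in for a stream-backed buffer (not pyfory's Buffer)."""

    def __init__(self, stream):
        self._stream = stream     # any object exposing read(size) -> bytes
        self._data = bytearray()  # bytes pulled from the stream so far
        self._pos = 0             # reader index into self._data

    def ensure_size(self, n):
        """Pull from the stream until n unread bytes are buffered."""
        while len(self._data) - self._pos < n:
            missing = n - (len(self._data) - self._pos)
            chunk = self._stream.read(missing)
            if not chunk:
                raise EOFError(f"stream ended with {missing} byte(s) still needed")
            self._data += chunk

    def read_int32(self):
        # Fixed-width field: check availability before touching the bytes.
        self.ensure_size(4)
        (value,) = struct.unpack_from("<i", self._data, self._pos)
        self._pos += 4
        return value

    def read_varuint32(self):
        # Byte-at-a-time fallback: never assumes 5 bytes are already buffered.
        result = 0
        shift = 0
        while True:
            self.ensure_size(1)
            byte = self._data[self._pos]
            self._pos += 1
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return result
            shift += 7
            if shift >= 35:
                raise ValueError("varuint32 is longer than 5 bytes")

    def read_string(self):
        # Assumed layout for this sketch: varuint length prefix, then UTF-8 bytes.
        length = self.read_varuint32()
        self.ensure_size(length)
        raw = bytes(self._data[self._pos:self._pos + length])
        self._pos += length
        return raw.decode("utf-8")
```

Wrapping a payload in `TrickleStream` forces every read to go through `ensure_size` more than once, which is a cheap way to exercise the same short-read conditions a socket or pipe would produce.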

Related issues

N/A

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

N/A

@chaokunyang marked this pull request as draft February 6, 2026 15:43
Zakir032002 added a commit to Zakir032002/fory that referenced this pull request Feb 15, 2026
Implements apache#3300 aligned with C++ PR apache#3307 stream model.

- Add ForyStreamBuf: growable buffer wrapping dyn Read, no compaction
- Make Reader stream-aware: ensure_readable before reads, sync_stream_pos after
- Add byte-at-a-time varint fallbacks for stream-backed readers
- Fix deserialize_from to transfer stream state via take/restore pattern
- Preserve zero-overhead in-memory fast path (branch-light)
- Add 12 comprehensive stream tests (primitives, structs, strings,
  sequential decode, truncated stream errors, Vec, regression)

Closes apache#3300
Zakir032002 added a commit to Zakir032002/fory that referenced this pull request Feb 16, 2026
- Add standalone ForyStreamBuf with growable buffer
- Implements fill_buffer for on-demand reading from std::io::Read
- Buffer grows monotonically without compaction
- No integration with Reader yet (zero impact on existing code)
- Includes 4 unit tests for basic functionality

Design follows C++ PR apache#3307 and addresses apache#3300.
Part 1 of 3-phase implementation.
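
The "grows monotonically without compaction" property in the commit message above can be illustrated with a short sketch. It is written in Python for brevity rather than Rust, and `GrowableStreamBuf`, its `chunk_size` parameter, and the exact `fill_buffer` semantics are assumptions made for illustration, not the referenced commit's API. The point is that bytes are only ever appended, so any absolute offset a reader recorded earlier stays valid after later fills.

```python
import io


class GrowableStreamBuf:
    """Hypothetical append-only buffer over an object with read(size)."""

    def __init__(self, source, chunk_size=4096):
        self._source = source
        self._chunk_size = chunk_size
        self.data = bytearray()  # grows only; consumed bytes are never dropped

    def fill_buffer(self, upto):
        """Append from the source until at least `upto` bytes are buffered."""
        while len(self.data) < upto:
            needed = upto - len(self.data)
            # Ask for a full chunk when possible to reduce the number of reads.
            chunk = self._source.read(max(self._chunk_size, needed))
            if not chunk:
                raise EOFError(f"source ended {needed} byte(s) short of offset {upto}")
            self.data += chunk


source = io.BytesIO(b"abcdefgh")
buf = GrowableStreamBuf(source, chunk_size=3)
buf.fill_buffer(2)             # one read of 3 bytes covers the request
first_two = bytes(buf.data[:2])
buf.fill_buffer(5)             # appends more; earlier offsets are unchanged
still_first_two = bytes(buf.data[:2])
```

The trade-off of skipping compaction is memory: the buffer holds everything read so far, which keeps offset bookkeeping trivial but may matter for very long streams.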
@chaokunyang marked this pull request as ready for review February 26, 2026 14:37
@chaokunyang requested a review from urlyy February 26, 2026 14:38
@chaokunyang merged commit 812ade8 into apache:main Feb 26, 2026
64 checks passed
chaokunyang added a commit that referenced this pull request Mar 4, 2026
…3453)

## Related issues

#3449 
#3307 

## Does this PR introduce any user-facing change?



- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?

## Benchmark