Add support for an alternative ArrowBuffer allocation mechanism. #304

Merged: CurtHagenlocher merged 3 commits into apache:main from CurtHagenlocher:NativeBuffer on Apr 1, 2026

@CurtHagenlocher
Contributor

What's Changed

Adds support for an alternative ArrowBuffer allocation mechanism to optimize array creation.

Copilot AI left a comment

Pull request overview

Adds a new native, aligned allocation pathway for building ArrowBuffer instances, aiming to optimize array/buffer creation while allowing different GC interaction strategies.

Changes:

  • Introduces NativeBuffer<TItem, TTracker> for aligned native allocation, growth (realloc), and ownership transfer into ArrowBuffer.
  • Adds INativeAllocationTracker plus concrete trackers (MemoryPressureAllocationTracker, NoOpAllocationTracker) to control GC memory-pressure behavior.
  • Implements downlevel aligned allocation via AlignedNative and adds coverage via new NativeBufferTests.
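
The tracker idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the members of `INativeAllocationTracker` are not shown in this thread, so `ISketchTracker`, `OnSizeChanged`, `SketchBuffer`, and the two tracker structs are hypothetical names, and the buffer uses plain `Marshal.AllocHGlobal` with no alignment guarantees. Only `GC.AddMemoryPressure` and `GC.RemoveMemoryPressure` are real runtime APIs here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical interface; the real INativeAllocationTracker may look different.
public interface ISketchTracker
{
    void OnSizeChanged(long oldBytes, long newBytes);
}

// Forwards size deltas to the GC so it knows about unmanaged memory.
public struct PressureSketchTracker : ISketchTracker
{
    public void OnSizeChanged(long oldBytes, long newBytes)
    {
        long delta = newBytes - oldBytes;
        // Both GC methods reject non-positive values, so skip a zero delta.
        if (delta > 0) GC.AddMemoryPressure(delta);
        else if (delta < 0) GC.RemoveMemoryPressure(-delta);
    }
}

// Deliberately does nothing: the GC is never told about the allocation.
public struct NoOpSketchTracker : ISketchTracker
{
    public void OnSizeChanged(long oldBytes, long newBytes) { }
}

// Hypothetical buffer that reports every size change to its tracker.
public sealed class SketchBuffer<TTracker> : IDisposable
    where TTracker : struct, ISketchTracker
{
    private TTracker _tracker;
    private IntPtr _ptr;
    private long _bytes;

    public SketchBuffer(long bytes)
    {
        _tracker = default;
        _ptr = Marshal.AllocHGlobal((IntPtr)bytes);
        _bytes = bytes;
        _tracker.OnSizeChanged(0, bytes);
    }

    public long ByteLength => _bytes;

    // Resizes the block; existing contents are preserved by the realloc.
    public void Grow(long newBytes)
    {
        _ptr = Marshal.ReAllocHGlobal(_ptr, (IntPtr)newBytes);
        _tracker.OnSizeChanged(_bytes, newBytes);
        _bytes = newBytes;
    }

    public void Dispose()
    {
        if (_ptr == IntPtr.Zero) return;
        Marshal.FreeHGlobal(_ptr);
        _ptr = IntPtr.Zero;
        _tracker.OnSizeChanged(_bytes, 0);
        _bytes = 0;
    }
}
```

Making the tracker a struct type parameter means the choice is fixed at compile time, so a no-op tracker can cost nothing at run time; whether the PR relies on this is an assumption.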

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| test/Apache.Arrow.Tests/NativeBufferTests.cs | Adds unit tests for allocation, growth, ownership transfer, and downlevel fallback behavior. |
| src/Apache.Arrow/Memory/NoOpAllocationTracker.cs | Adds a tracker that intentionally does not interact with GC memory pressure. |
| src/Apache.Arrow/Memory/NativeBuffer.cs | Implements the new native buffer abstraction and its internal unmanaged MemoryManager<byte>. |
| src/Apache.Arrow/Memory/MemoryPressureAllocationTracker.cs | Adds a tracker that forwards allocation deltas to GC.Add/RemoveMemoryPressure. |
| src/Apache.Arrow/Memory/INativeAllocationTracker.cs | Introduces the tracker interface used by NativeBuffer. |
| src/Apache.Arrow/Memory/AlignedNative.cs | Adds downlevel aligned alloc/free/realloc with CRT probing and a manual-alignment fallback. |
| src/Apache.Arrow/Apache.Arrow.csproj | Excludes AlignedNative.cs from .NETCoreApp TFMs (where NativeMemory.* is used). |
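
For readers unfamiliar with a "manual-alignment fallback", one common way to do it is to over-allocate, round the address up, and stash the original pointer just before the aligned block. The sketch below assumes that approach; it is not taken from `AlignedNative.cs`, the `ManualAlignSketch` class and its method names are hypothetical, and it omits the CRT probing and realloc paths the table mentions.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper illustrating over-allocate-and-round-up alignment.
public static class ManualAlignSketch
{
    public static IntPtr Alloc(int byteCount, int alignment)
    {
        if (byteCount < 0)
            throw new ArgumentOutOfRangeException(nameof(byteCount));
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0)
            throw new ArgumentException("alignment must be a power of two", nameof(alignment));

        // Payload + worst-case padding + one pointer-sized header slot.
        long total = (long)byteCount + alignment - 1 + IntPtr.Size;
        IntPtr raw = Marshal.AllocHGlobal((IntPtr)total);

        // Leave room for the header, then round up to the next aligned address.
        long start = raw.ToInt64() + IntPtr.Size;
        long aligned = (start + alignment - 1) & ~((long)alignment - 1);

        // Remember the pointer the allocator gave us so Free can release it.
        Marshal.WriteIntPtr((IntPtr)(aligned - IntPtr.Size), raw);
        return (IntPtr)aligned;
    }

    public static void Free(IntPtr aligned)
    {
        if (aligned == IntPtr.Zero) return;
        IntPtr raw = Marshal.ReadIntPtr((IntPtr)(aligned.ToInt64() - IntPtr.Size));
        Marshal.FreeHGlobal(raw);
    }
}
```

A caller would pair every `Alloc` with exactly one `Free` on the returned pointer, for example `var p = ManualAlignSketch.Alloc(100, 64);` followed later by `ManualAlignSketch.Free(p);` (the 64-byte alignment is only an example value). On .NET 6 and later, `NativeMemory.AlignedAlloc` and `NativeMemory.AlignedFree` make this bookkeeping unnecessary, which matches the csproj note above about excluding the file from .NETCoreApp targets.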

Comment thread src/Apache.Arrow/Memory/NativeBuffer.cs Outdated
Comment thread src/Apache.Arrow/Memory/NativeBuffer.cs
Comment thread src/Apache.Arrow/Memory/NativeBuffer.cs
Comment thread src/Apache.Arrow/Memory/NativeBuffer.cs
Comment thread src/Apache.Arrow/Memory/NativeBuffer.cs
Comment thread src/Apache.Arrow/Memory/NativeBuffer.cs Outdated
Comment thread src/Apache.Arrow/Memory/AlignedNative.cs
Comment thread src/Apache.Arrow/Memory/AlignedNative.cs
Comment thread test/Apache.Arrow.Tests/NativeBufferTests.cs
Comment thread test/Apache.Arrow.Tests/NativeBufferTests.cs
Contributor

@adamreeve adamreeve left a comment

This all looks good to me. I'm curious about how it's planned to be used though. Do you intend to switch the array builders to use NativeBuffer eventually somehow, or is this intended to be for users who are creating arrays without using the builders?

IMO it would be nice to update the builders to use the new NativeBuffer type, but I can't see how to do that in a backwards compatible way as they expose ArrowBuffer.Builder instances as protected members. And IArrowArrayBuilder has a Build method that accepts a MemoryAllocator, which wouldn't make sense if we build using a NativeBuffer from the start.

Are we going to end up with two different builders for each array type and a new builder interface?

Comment thread src/Apache.Arrow/Memory/INativeAllocationTracker.cs Outdated
@CurtHagenlocher
Contributor Author

> IMO it would be nice to update the builders to use the new NativeBuffer type, but I can't see how to do that in a backwards compatible way as they expose ArrowBuffer.Builder instances as protected members.

Yeah, me either. But I think this is a good primitive; I've used it in two different places now but (as you saw) I had to use Reflection to create the ArrowBuffer. I was thinking this was also the right primitive to use for some of the cast/convert/operations work that @mobiusklein is doing.

> Are we going to end up with two different builders for each array type and a new builder interface?

Maybe :(.

@adamreeve
Contributor

Yes, I agree that this is worth having as-is; I would have also used it for ParquetSharp.Dataset. We don't need to decide what to do with the builders now, but it would be interesting to see what the performance benefit could be at some point.

@CurtHagenlocher CurtHagenlocher merged commit aaa07fe into apache:main Apr 1, 2026
14 checks passed
@CurtHagenlocher CurtHagenlocher deleted the NativeBuffer branch April 1, 2026 23:34
3 participants