Validate uncompressed size up front in ZipArchiveEntry update mode #128319

Draft
Copilot wants to merge 2 commits into main from copilot/analyze-and-fix-issue-127834

Contributor

Copilot AI commented May 18, 2026

Fixes #127834
ZipArchiveEntry.Open throws ArgumentOutOfRangeException from the MemoryStream constructor whenever an entry's declared _uncompressedSize exceeds Array.MaxLength: the unchecked (int) cast in GetUncompressedData wraps negative, and MemoryStream(int capacity) validates capacity is in [0, Array.MaxLength]. The accompanying comment claimed this was safe ("MemoryStream will just grow") — true on .NET Framework, no longer true on .NET Core.
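The wrap-around described above can be reproduced in isolation. The following is a minimal sketch, not the runtime's code: `declaredSize` is a hypothetical stand-in for `_uncompressedSize`, chosen as `int.MaxValue + 100` because that value yields the capacity quoted in this PR.

```csharp
using System;
using System.IO;

// Hypothetical declared size: 100 bytes past int.MaxValue, held in a long.
long declaredSize = (long)int.MaxValue + 100;

// An unchecked narrowing cast keeps only the low 32 bits, so the value wraps negative.
int capacity = unchecked((int)declaredSize);
Console.WriteLine(capacity); // -2147483549

string outcome;
try
{
    using var stream = new MemoryStream(capacity);
    outcome = "constructed";
}
catch (ArgumentOutOfRangeException)
{
    // MemoryStream(int capacity) rejects a negative capacity before any data is read.
    outcome = "ArgumentOutOfRangeException";
}
Console.WriteLine(outcome);
```

The exception here comes from the capacity argument alone, which is why the original failure said nothing about the entry being too large.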

Description

  • ZipArchiveEntry.IsOpenableFinalVerifications: validate _uncompressedSize up front when opening an entry in update mode. If _uncompressedSize falls outside [0, Array.MaxLength] — the only range that can be loaded into a MemoryStream (which is backed by a single byte[]) — fail immediately with InvalidDataException(SR.EntryTooLarge). This is the shared verification helper called from both ThrowIfNotOpenable (sync) and ThrowIfNotOpenableAsync (async), so it covers OpenInUpdateMode and OpenInUpdateModeAsync with a single check, rather than relying on MemoryStream to fail later from the constructor or from EnsureCapacity during CopyTo.
  • ZipArchiveEntry.GetUncompressedData (sync + async): simplified the capacity hint back to new MemoryStream((int)_uncompressedSize). The (int) cast is now safe because _uncompressedSize is validated upstream. The stale "MemoryStream will just grow" comment is replaced with one that points at the upstream validation.
  • Regression test in zip_InvalidParametersAndStrangeFiles.cs: reflectively forces _uncompressedSize > Array.MaxLength and asserts that both Open and OpenAsync throw InvalidDataException immediately in update mode (rather than the previous capacity '-2147483549' ArgumentOutOfRangeException from MemoryStream).
// Before — wraps negative when _uncompressedSize > Array.MaxLength,
// MemoryStream ctor throws AORE before any read happens
_storedUncompressedData = new MemoryStream((int)_uncompressedSize);

// After: GetUncompressedData keeps the same cast, which is now safe because
// IsOpenableFinalVerifications rejects values outside [0, Array.MaxLength]
// up front in update mode:
if ((ulong)_uncompressedSize > (ulong)Array.MaxLength)
{
    message = SR.EntryTooLarge;
    return false;
}
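
The single unsigned comparison in the check above covers both ends of the range. A minimal sketch of that idea follows; `IsLoadableInUpdateMode` is a hypothetical helper written for illustration and is not part of `ZipArchiveEntry`.

```csharp
using System;

// Hypothetical stand-in for the range check; not the runtime's actual helper.
static bool IsLoadableInUpdateMode(long uncompressedSize)
{
    // Reinterpreting the value as ulong maps every negative number to a very
    // large one, so one comparison rejects negatives and oversized values alike.
    return unchecked((ulong)uncompressedSize) <= (ulong)Array.MaxLength;
}

Console.WriteLine(IsLoadableInUpdateMode(0));                          // True
Console.WriteLine(IsLoadableInUpdateMode(Array.MaxLength));            // True
Console.WriteLine(IsLoadableInUpdateMode((long)Array.MaxLength + 1));  // False
Console.WriteLine(IsLoadableInUpdateMode(-1));                         // False
```

Once a value passes this check, narrowing it to `int` cannot wrap, which is what makes the plain `(int)` cast in `GetUncompressedData` acceptable.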

Scope / non-goals

Entries whose declared uncompressed size exceeds Array.MaxLength cannot be loaded in update mode — MemoryStream is backed by a single byte[] and cannot grow beyond Array.MaxLength. Such entries are now rejected up front with a descriptive InvalidDataException("Entries larger than 4GB are not supported in Update mode.") instead of surfacing as a misleading ArgumentOutOfRangeException about a negative capacity argument from the MemoryStream constructor before any I/O happens. Lifting the underlying ceiling requires replacing the in-memory update buffer with a segmented or file-backed stream and is out of scope.

Note

This pull request was authored by GitHub Copilot.

Copilot AI self-assigned this May 18, 2026
Copilot AI review requested due to automatic review settings May 18, 2026 11:18
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @karelz, @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

@alinpahontu2912
Member

Hey @copilot, I think the fix should be different. We shouldn’t rely on the MemoryStream hitting its max capacity to fail. Instead, we should validate the uncompressedSize up front in ZipArchiveEntry and ensure it falls within the valid range (0 to Array.MaxLength). If it doesn’t, we should throw immediately when opening the entry in update mode.

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0fbb8331-f2e4-4bd0-acb1-a58139224285

Co-authored-by: alinpahontu2912 <56953855+alinpahontu2912@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 18, 2026 13:06
Contributor Author

Copilot AI commented May 18, 2026

Hey @copilot, I think the fix should be different. We shouldn’t rely on the MemoryStream hitting its max capacity to fail. Instead, we should validate the uncompressedSize up front in ZipArchiveEntr...

Switched to up-front validation in ff62bbb. IsOpenableFinalVerifications now rejects _uncompressedSize > Array.MaxLength with InvalidDataException (using SR.EntryTooLarge) when opening an entry in update mode — covering both sync (Open) and async (OpenAsync) paths via the shared verification helper. The Math.Min clamp in GetUncompressedData[Async] is gone; the (int) cast is now safe because the value is validated upstream. Updated the regression test to assert the throwing behavior on both paths; full System.IO.Compression.Tests suite still passes (1970/1970).

Copilot AI changed the title from "Clamp MemoryStream capacity hint in ZipArchiveEntry.GetUncompressedData" to "Validate uncompressed size up front in ZipArchiveEntry update mode" May 18, 2026
Copilot AI requested a review from alinpahontu2912 May 18, 2026 13:07
Comment on lines 990 to 1003
// This limitation originally existed because a) it is unreasonable to load > 4GB into memory
// but also because the stream reading functions make it hard. This has been updated to handle
// this scenario in a 64-bit process using multiple buffers, delivered first as an OOB for
// compatibility.
if (needToLoadIntoMemory)
{
    if (_compressedSize > int.MaxValue)
    {
        if (!s_allowLargeZipArchiveEntriesInUpdateMode)
        {
            message = SR.EntryTooLarge;
            return false;
        }
    }
Member

This is interesting... I wonder whether this has ever worked, since as far as I can see, _storedUncompressedData has always used MemoryStream, which cannot hold more than int.MaxValue bytes.

@ericstj do you, by chance, remember?

Member

Here's the commit that added this: 641f049.

It looks like it was storing the data in a jagged 2-d byte array to get past the single array limit. I'm not sure how that got exposed in a stream. I do see it using MemoryStream which internally only has a single array - maybe at the time they were experimenting with a different MemoryStream with a different internal structure?
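
For readers unfamiliar with the jagged-array approach mentioned above, here is a rough sketch of the general idea. It is an assumption-laden illustration, not the code from that commit: the chunk size is deliberately tiny so the example runs cheaply, and `WriteByte` and `ReadByte` are hypothetical helpers.

```csharp
using System;

// Hypothetical illustration: a logical buffer split across fixed-size chunks
// and addressed by a 64-bit offset, so no single array has to hold everything.
const int ChunkSize = 4;   // tiny on purpose; a real buffer would use far larger chunks
long totalLength = 10;

int chunkCount = (int)((totalLength + ChunkSize - 1) / ChunkSize);
byte[][] chunks = new byte[chunkCount][];
for (int i = 0; i < chunkCount; i++)
{
    // The final chunk only needs to hold whatever is left over.
    long remaining = totalLength - (long)i * ChunkSize;
    chunks[i] = new byte[(int)Math.Min(ChunkSize, remaining)];
}

// Map a 64-bit offset to (chunk index, index within chunk).
void WriteByte(long offset, byte value) =>
    chunks[(int)(offset / ChunkSize)][(int)(offset % ChunkSize)] = value;

byte ReadByte(long offset) =>
    chunks[(int)(offset / ChunkSize)][(int)(offset % ChunkSize)];

WriteByte(9, 42);
Console.WriteLine(ReadByte(9)); // 42
```

Exposing such a structure as a `Stream` would need a custom stream type, since `MemoryStream` wraps a single `byte[]`.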

Development

Successfully merging this pull request may close these issues.

ZipArchiveEntry.Open expectedly throws an ArgumentOutOfRangeException
