numcodecs currently treats a return value of 0 from ZSTD_getDecompressedSize as an input error. A value of zero could mean one of the following.
- empty
- unknown
- error
From numcodecs/numcodecs/zstd.pyx, lines 151 to 153 in 366318f:

```python
dest_size = ZSTD_getDecompressedSize(source_ptr, source_size)
if dest_size == 0:
    raise RuntimeError('Zstd decompression error: invalid input data')
```
Instead, numcodecs should use ZSTD_getFrameContentSize, whose return value distinguishes these cases:
- 0 means empty
- 0xffffffffffffffff, ZSTD_CONTENTSIZE_UNKNOWN, means unknown
- 0xfffffffffffffffe, ZSTD_CONTENTSIZE_ERROR, means error
See zstd.h or the manual for a reference.
https://github.com/facebook/zstd/blob/7cf62bc274105f5332bf2d28c57cb6e5669da4d8/lib/zstd.h#L195-L203
https://facebook.github.io/zstd/zstd_manual.html
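As a rough illustration of how the three cases could be told apart, here is a minimal pure-Python sketch. The constant values are the ones listed above; `classify_content_size` is a hypothetical helper name for illustration only, not something that exists in numcodecs or zstd.

```python
# Sentinel values of ZSTD_getFrameContentSize, as listed above (from zstd.h).
ZSTD_CONTENTSIZE_UNKNOWN = 0xFFFFFFFFFFFFFFFF
ZSTD_CONTENTSIZE_ERROR = 0xFFFFFFFFFFFFFFFE


def classify_content_size(content_size):
    """Hypothetical helper: name the case a ZSTD_getFrameContentSize result falls into."""
    if content_size == ZSTD_CONTENTSIZE_ERROR:
        return "error"    # invalid input data
    if content_size == ZSTD_CONTENTSIZE_UNKNOWN:
        return "unknown"  # size is not reported; not an error
    if content_size == 0:
        return "empty"    # valid input that decompresses to zero bytes
    return "known"        # content_size is the decompressed size in bytes
```

Only the "error" case would need to raise; "empty" and "unknown" are both valid inputs.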
This error arose during the implementation of Zstandard in n5-zarr:
saalfeldlab/n5-zarr#35
There, the compressor was producing blocks whose content size is reported as ZSTD_CONTENTSIZE_UNKNOWN. For such input, ZSTD_getDecompressedSize returns 0, and numcodecs incorrectly interprets this as an error.
Handling ZSTD_CONTENTSIZE_UNKNOWN may be difficult.
- If a dest buffer is provided, then perhaps its size should be used as the expected decompressed size, and an error should be raised if the actual decompressed size differs.
- If a dest buffer is not provided, we may need to either use a default size or use the streaming API to build a growing buffer until all the data is decompressed.
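The two options above could be sketched as a small decision function. This is only an outline of the control flow under stated assumptions: `choose_strategy` and the strategy labels `"one_shot"` and `"streaming"` are hypothetical names, and no actual zstd call is made here.

```python
ZSTD_CONTENTSIZE_UNKNOWN = 0xFFFFFFFFFFFFFFFF
ZSTD_CONTENTSIZE_ERROR = 0xFFFFFFFFFFFFFFFE


def choose_strategy(content_size, dest_nbytes=None):
    """Hypothetical helper: pick a decompression strategy and output size.

    content_size: value returned by ZSTD_getFrameContentSize.
    dest_nbytes: size of the caller-provided dest buffer, or None if absent.
    Returns (strategy, expected_size); expected_size is None when unknown.
    """
    if content_size == ZSTD_CONTENTSIZE_ERROR:
        raise RuntimeError('Zstd decompression error: invalid input data')
    if content_size == ZSTD_CONTENTSIZE_UNKNOWN:
        if dest_nbytes is not None:
            # Treat the dest size as the expected size; the caller would
            # check afterwards that exactly this many bytes were produced.
            return ("one_shot", dest_nbytes)
        # No size hint at all: grow an output buffer via the streaming API.
        return ("streaming", None)
    # Known size, including 0 for empty input.
    return ("one_shot", content_size)
```

For example, `choose_strategy(ZSTD_CONTENTSIZE_UNKNOWN)` selects the streaming path, while `choose_strategy(0)` is a valid one-shot decompression of empty data rather than an error.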