improve allocations in map serialization #105
// ReadFullStringIntoBuf will read a string off the given stream, consuming the
// entire cbor item. If the string on the stream is longer than the buffer,
// the string is discarded and 'false' is returned.
func ReadFullStringIntoBuf(cr *CborReader, buf []byte, maxLength uint64) (int, bool, error) {
nit: generally I prefer the Hash version of this:
- Take a zero-length buffer with some capacity.
- Return a buffer containing the output, appending to the input buffer if possible.
That way it's possible to just pass nil and let the function do the allocation for you. You also won't need to handle slicing, etc.
So the main reason I added this was to avoid any allocations. I can make it append if necessary, but it feels more prone to being misused in that case.
Ah, I see. You want to be able to skip fields that are too long entirely. I guess that makes sense, it just seems like a bit of a sharp edge.
Yeah... I would make it private, but then generated files can't call it.
I'll add an extra comment.
peeker.go
Outdated
	return b, nil
}

func (p *peeker) ReadByteBuf(buf []byte) (byte, error) {
If peeker.ReadByte is allocating buf, maybe just replace peeker.lastByte with a [1]byte and use that directly in io.ReadFull?
Ah, that's a good idea.
gen.go
Outdated
if !ok {
	// Field doesn't exist on this type, so ignore it
	cbg.ScanForLinks(cr, func(cid.Cid){})
Do we not have a better way to ignore fields? Honestly, I'd special-case Deferred into an Ignored type that just drops everything.
Also, we need to handle errors.
Yeah, that's the way we've been doing it so far. It probably makes sense to have a more dedicated thing for it.
Eh, this is probably fine (assuming you check for errors). It ensures that any CIDs contained in such fields are valid.
io.go
Outdated
if len(scratchBuf) < int(extra) {
	return cid.Undef, fmt.Errorf("scratchBuf not large enough for cid")
Don't we usually allocate header-sized scratch bufs?
Yeah, this is actually a different path entirely (not used in the codebase yet). I wanted this for doing some manual/optimized reading of certain objects.
io.go
Outdated
	cr.r = GetPeeker(r)
}

func (cr *CborReader) ReadCid(scratchBuf []byte) (cid.Cid, error) {
Were you planning on using this somewhere?
Outside of the package, yeah. It's really a separate thing from the rest of this PR; happy to remove it for now and bring it back later.
It's just kind of awkward given that:
- Everything else uses the internal scratch buffer.
- It will fail if the scratch buffer isn't large enough.
If we need a CID scratch buffer, I'd consider just increasing the size of the internal scratch buffer to fit CIDs.
Alright, I'll just remove this for now. Can bring it back later.
reuse buffers more and use optimized read paths when available.
Before:
After: