Optimize vault from load test#22595
Conversation
… into optimize_vault_responses
… into optimize_vault_responses
… into optimize_vault_responses
… into optimize_vault_responses
… into optimize_vault_responses
… into optimize_vault_responses
… into optimize_vault_responses
CORA - Pending Reviewers
Legend: ✅ Approved | ❌ Changes Requested | 💬 Commented | 🚫 Dismissed | ⏳ Pending | ❓ Unknown For more details, see the full review summary. |
|
✅ No conflicts with other open PRs targeting |
|
I see you updated files related to
|
… into optimize_vault_responses
… into optimize_vault_responses
| if share.EncryptionKey == workflowNodeEncryptionPublicKeyStr { | ||
| localNodeShares = share.Shares | ||
| if len(share.BinaryShares) > 0 { | ||
| localNodeBinaryShares = share.BinaryShares |
There was a problem hiding this comment.
@russell-stern Is each request guaranteed to just have 100% binary shares or 100% string shares? When you were working on the base64 implementation I think it was possible for the formats to be mixed; that would suggest you actually want to combine regular + binary shares
There was a problem hiding this comment.
We moved the binary shares encoding to be behind a feature flag so they'll be guaranteed to have one or the other
|
|
||
| var currentBatch []*vaultcommon.StoredPendingQueueItem | ||
|
|
||
| flushBatch := func() error { |
There was a problem hiding this comment.
This is too long to inline IMO, just move it to a top-level private function called flushBatch; this also has the benefit of being clear about what the dependencies are rather than relying on closure variables which could get accidentally shadowed etc
| return nil | ||
| } | ||
|
|
||
| queueLoop: |
There was a problem hiding this comment.
😬 If you're using gotos and loop statements in Golang that should be a signal to refactor the code so you can avoid it.
There was a problem hiding this comment.
The high-level logic should just be to:
- loop over localQueueItems
- maintain a variable of the current batch
- clone this variable, add the new item to it
- proto.Size it to see if we've exceeded the batch
- check blob count
- if we have -> return the current batch
- otherwise assign the new cloned batch (with add'l item to the current batch)
There was a problem hiding this comment.
+1, haven't seen gotos in golang before.
So a good time to refactor.
I would also suggest, maybe in a later PR if not now, to split into multiple files based on components. We have too many things going on in this 1 file.
There was a problem hiding this comment.
Agreed, I want to restructure the entire plugin file after we get these changes in
| return nil, fmt.Errorf("could not marshal pending queue item: %w", ierr2) | ||
| var pack pendingQueueBlobPack | ||
| if optimizations { | ||
| pack, err = r.prepareObservationPendingQueueBlobs(ctx, seqNr, localQueueItems, pendingQueueHasID, maxBlobBytes, maxBlobHandleCount) |
There was a problem hiding this comment.
Does this imply that a blob could be one of two message types? How will you distinguish between them when unmarshalling?
| return errors.New("failed to unmarshal observations: " + err.Error()) | ||
| } | ||
|
|
||
| if len(ao.Observation) > r.maxObservationBytes { |
There was a problem hiding this comment.
I'm pretty sure this is unnecessary; I'd expect OCR to reject oversized observations automatically
| } | ||
|
|
||
| // unmarshalPendingQueueBlob decodes a BroadcastBlob payload (legacy single item or StoredPendingQueueBatch). | ||
| func unmarshalPendingQueueBlob(blob []byte) ([]*vaultcommon.StoredPendingQueueItem, error) { |
There was a problem hiding this comment.
This worries me a little bit; the unmarshaller should know exactly what format the message is in
Proto messages are untyped so we can't guarantee that there won't be cases where we fall into the wrong type
There was a problem hiding this comment.
I'd recommend adding an unambiguous prefix to the message or adding a field to the current wrapper type
| keptItems = keptItems[:errBoundLimited.Limit] | ||
| } | ||
| r.metrics.trackPendingQueueWrittenSize(ctx, len(keptItems)) | ||
| r.lggr.Infow("VAULT_OCR_STATE_TRANSITION_PENDING_WRITE", "seqNr", seqNr, "writtenCount", len(keptItems)) |
There was a problem hiding this comment.
Could we follow the existing style for log messages -- this favours human readable messages for the logs?
I notice you have this throughout btw; please adjust the others too
| vaultRequestID := vaultRequestIDFromMetadata(metadata) | ||
| lggr := logger.With(s.lggr, "requestedKeys", logKeys, "metadata", metadata, "vaultRequestID", vaultRequestID) | ||
| lggr.Infow("dispatching vault GetSecrets request") | ||
| lggr.Debug("fetching secrets...") |
There was a problem hiding this comment.
nit: One of these 2 logs can be deleted.
| } | ||
|
|
||
| // vaultRequestIDFromMetadata mirrors the request ID formula in vault capability Execute. | ||
| func vaultRequestIDFromMetadata(md capabilities.RequestMetadata) string { |
There was a problem hiding this comment.
could you please move this function to a library, and let both secrets.go and capability.go share it?
This way we won't accidentally deviate.
There was a problem hiding this comment.
I removed this altogether. We added metadata to the logger which has this id in it. There's a pull request to update our skill to be able to use the new metadata fields: https://github.com/smartcontractkit/cre-docs/pull/42
| } | ||
|
|
||
| func (s *secretsFetcher) decryptSecret(lggr logger.Logger, encryptedSecretBytes []byte, encryptedDecryptionShares []string, cfg *vaultConfig) (string, error) { | ||
| func encryptedDecryptionShareBytes(binaryShares [][]byte, hexShares []string) ([][]byte, error) { |
There was a problem hiding this comment.
You'd need these changes in confidential-compute repo too:
https://github.com/smartcontractkit/confidential-compute/blob/1ded9b280ac4c356869ba3085383c0dc4f4134c2/capabilities/framework/executor.go#L1020
| } | ||
|
|
||
| func (s *secretsFetcher) decryptSecret(lggr logger.Logger, encryptedSecretBytes []byte, encryptedDecryptionShares []string, cfg *vaultConfig) (string, error) { | ||
| func encryptedDecryptionShareBytes(binaryShares [][]byte, hexShares []string) ([][]byte, error) { |
There was a problem hiding this comment.
Is it better to write this function as a library here and let it be shared used across repos:
https://github.com/smartcontractkit/chainlink-common/blob/f9b356d61ca95d5a13d0105018432e884759778c/pkg/capabilities/actions/vault/vault.go
?
There was a problem hiding this comment.
Since this is just for the transition while switching to the new optimizations I'd rather keep it out of common. That way when we can just delete this code once the optimizations are enabled and won't need to make more common changes or leave extra code in there we can't delete
| shareBytes, err := hex.DecodeString(share) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to decode share: %w", err) | ||
| if len(es.BinaryShares) > 0 { |
There was a problem hiding this comment.
so we have total 3 places where we handle binary or hex shares. better to consolidate into 1 library
There was a problem hiding this comment.
Same as above, once we make the switch to the optimizations I want to delete all this code
| return nil | ||
| } | ||
|
|
||
| queueLoop: |
There was a problem hiding this comment.
+1, haven't seen gotos in golang before.
So a good time to refactor.
I would also suggest, maybe in a later PR if not now, to split into multiple files based on components. We have too many things going on in this 1 file.
40638ec to
307dd2d
Compare
|




No description provided.