Skip to content

chore: bump retries for request tx by hash#13675

Merged
PhilWindle merged 3 commits into
masterfrom
md/bump-retries
Apr 30, 2025
Merged

chore: bump retries for request tx by hash#13675
PhilWindle merged 3 commits into
masterfrom
md/bump-retries

Conversation

@Maddiaa0

Copy link
Copy Markdown
Member

Overview

With better sampling logic, we can more safetly bump the retries for the tx requests

Maddiaa0 commented Apr 18, 2025

Copy link
Copy Markdown
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Base automatically changed from md/better-peer-sampling to master April 22, 2025 18:49
@Maddiaa0 Maddiaa0 force-pushed the md/bump-retries branch 2 times, most recently from bfb1903 to 20f4a99 Compare April 29, 2025 20:02
// Set collective timeout based on the number of txs we are requesting
const timeoutMs = Math.max(5000, txHashes.length * 100);
const maxPeers = Math.ceil(txHashes.length / 3);
const maxRetryAttempts = Math.min(txHashes.length / maxPeers, 5);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing something, but isn't this always equal to 3, since maxPeers is already defined based on txHashes.length?

const txs = await this.p2pService.sendBatchRequest(ReqRespSubProtocol.TX, txHashes);
// Set collective timeout based on the number of txs we are requesting
const timeoutMs = Math.max(5000, txHashes.length * 100);
const maxPeers = Math.ceil(txHashes.length / 3);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we set a minimum of let's say 5, in case we're requesting very few txs?

public async requestTxsByHash(txHashes: TxHash[]): Promise<(Tx | undefined)[]> {
const txs = await this.p2pService.sendBatchRequest(ReqRespSubProtocol.TX, txHashes);
// Set collective timeout based on the number of txs we are requesting
const timeoutMs = Math.max(5000, txHashes.length * 100);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put a hard cap on this one.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to ensure this is capped by the validation reexecution deadline, but that's another story.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for prover node's proving deadline

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For re-ex deadline, if it times out as it is unable to get all of the txs, then it will not be able to attest anyway no?

@Maddiaa0 Maddiaa0 added this pull request to the merge queue Apr 30, 2025

@PhilWindle PhilWindle left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight concern over request timeouts.

public async requestTxsByHash(txHashes: TxHash[]): Promise<(Tx | undefined)[]> {
const txs = await this.p2pService.sendBatchRequest(ReqRespSubProtocol.TX, txHashes);
// Set collective timeout based on the number of txs we are requesting
const timeoutMs = Math.min(8000, txHashes.length * 100);

@PhilWindle PhilWindle Apr 30, 2025

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight concern over this. From the looks of how this all works, if I need say 2 txs, we would only allow 200ms for the full round trip. Is that correct?

I think that is too low. I think there is a fixed time delay to any request, regardless of how large, for the request to go across the world, the recipient to open their db and the response to come all the way back again. Given that these requests should in general be for small numbers of transactions I'm worried this change could make things worse.

I think at the very least it should be something like

const timeoutMs = Math.min(8000, Math.max(txHashes.length * 100, 1000));

@Maddiaa0 Maddiaa0 removed this pull request from the merge queue due to a manual request Apr 30, 2025
@PhilWindle PhilWindle added this pull request to the merge queue Apr 30, 2025
Merged via the queue into master with commit e379810 Apr 30, 2025
8 checks passed
@PhilWindle PhilWindle deleted the md/bump-retries branch April 30, 2025 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants