Skip to content

Chores: add coding guidelines to the AGENTS.md of the prover. #2532

Open
AlexandreBelling wants to merge 6 commits intomainfrom
chores/padd-prover-claudemd
Open

Chores: add coding guidelines to the AGENTS.md of the prover. #2532
AlexandreBelling wants to merge 6 commits intomainfrom
chores/padd-prover-claudemd

Conversation

@AlexandreBelling
Copy link
Contributor

@AlexandreBelling AlexandreBelling commented Mar 3, 2026

This PR implements issue(s) #

Checklist

  • I wrote new tests for my new core changes.
  • I have successfully ran tests, style checker and build against my new changes locally.
  • If this change is deployed to any environment (including Devnet), E2E test coverage exists or is included in this
    PR.
  • I have informed the team of any breaking changes if there are any.

Note

Low Risk
Documentation-only updates that add conventions and guardrails; no code or behavior changes, so risk is low aside from potential process friction if guidelines are misunderstood.

Overview
Updates prover/AGENTS.md to replace the brief package blurb with a detailed description of the prover pipeline and to clarify the prover/ directory structure.

Adds extensive contributor guidelines: prover-specific safety rules, testing conventions (including mandatory circuit soundness/negative tests and fuzzing guidance), code style/dependency policies, documentation standards, explicit circuit-change prohibitions without user directive, and performance/benchmarking guidance.

Written by Cursor Bugbot for commit 57d2ec4. This will update automatically on new commits. Configure here.

prover/AGENTS.md Outdated
* Document each functions, method and structure. State the behavior of the function and its invariants. Document the parameters and the returns arguments.
* Only create new structures they are worth the added maintainance cost or if they decrease the maintainance cost.
* Prefer small functions. Less than 10 lines of code when possible.
* Small, non-technical utility functions go in `./utils`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me personally utils/ package is where functions go to die. Better not to implement it there when it is trivial (a la ToInt[T int | *big.Int](v T) int) and the same pattern appears several times over different packages. Otherwise if something is package-local then better just to implement it in the same package as a helper.

Imo always better to use Go std library functions if there are such (i.e. slices, maps etc.) and otherwise implement inline. Only for something very particular (a la DivCeil, DivFloor, NextPower2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but with all the many packages that we have, I found that we ended up having way too many occurences of the same function everywhere and some of the implementation are not even correct. For me, having stuffs in "utils" is like having a way to advertise the existence of the function because everybody imports utils all the time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, utils have a place. But imo there are also occurrences where people generify too soon something into "utils" package and later it is only used once. Now, there are over-generalized util functions and if there are too many "utils", then in the end it becomes too crowded that I never go to find anything from there (unless its already something I know). A la https://github.com/Consensys/linea-monorepo/blob/prover/dev-small-fields/prover/utils/utils.go#L224-L248 imo are pointless, it is something which cannot be incorrectly implemented (the linter will catch if I would do it incorrectly). It is one quick example but there are imo more.

My rule of thumb would be:

  • if the pattern is easy to implement incorrectly
  • if the pattern is used 3+ times in code
  • there doesn't exist existing std-lib functionality (slices, maps, errors)
  • it allows implementing no-overhead (i.e. no additional memory allocs compared if we would implement it inline)
  • do not assume utils are used outside of the project

Additionally, we should occasionally go over the utils packages, the implementations change and a lot of utils are not used anymore, so we can remove stuff from there aggressively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, and the example you give is a good one. But it also break the rule, that we should not reinvent what is already available in std. Ok, we to be more precise in the guideline.

prover/AGENTS.md Outdated

## Code style

* Less package is better to avoid dependency cycles.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should instruct the agents to ask us about the package structure. Right now it is too generic and I'm afraid agents wouldn't take it too seriously (not creating any new packages at all).

* Less packages is better. When implementing new functionality then propose to the user package structure. Do not define new packages without user consent.

prover/AGENTS.md Outdated
## Code style

* Less package is better to avoid dependency cycles.
* Use external library with parcimony
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to be explicit:

* never add new external dependencies to go.mod without user consent. Explain why new external dependency is better than implementing ourself. External dependencies are not prohibited, only discouraged, so propose if it makes sense
* Go standard library is always allowed
* Go experimental packages are allowed (`golang.org/x`), but notify user.
* Packages already added to `go.mod` are allowed, but notify user.

prover/AGENTS.md Outdated
* Use external library with parcimony
* Document each functions, method and structure. State the behavior of the function and its invariants. Document the parameters and the returns arguments.
* Only create new structures they are worth the added maintainance cost or if they decrease the maintainance cost.
* Prefer small functions. Less than 10 lines of code when possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo doesn't make sense, the 10-line limit is arbitrary and too small for any practical purpose. I personally prefer having larger functions if it makes sense. Rather say:

* if several functions overlap in functionality, then refactor the overlapping part into a separate function. If you add new function which overlaps with existing functionality then consider refactoring. Do not refactor existing functions without explicit user conset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, right the 10 line limit is off. Maybe, we could say.

Prefer small functions. If you can factor out a common functionality from several overlapping functions, do it but the behavior of the function should be easy to explain.

prover/AGENTS.md Outdated
* Only create new structures they are worth the added maintainance cost or if they decrease the maintainance cost.
* Prefer small functions. Less than 10 lines of code when possible.
* Small, non-technical utility functions go in `./utils`
* Use hard assertions a lot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add concrete examples:

  • when we assume inputs are of same length, assert
  • when we assume something is in range, assert
  • when we access slices, check the slices has capacity
  • etc
  • for assertions use Go expressions `if len(A) != len(B) { utils.Panicf("len(A) != len(B)") }.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to keep it concise because this file also eats the LLM context space. Perhaps, we could still make it more precise by stating that: "whenever an implicit invariant, pre-condition, post-condition is testable then it should be asserted (non-nilness, slice size)."

prover/AGENTS.md Outdated
* Error out only if failures are acceptable or expectable. If not, panic.
* Comment explains implementations key considerations. Use with parcimony. Don't paraphrase the code.
* Use docs.go for general package-level user documentations. Keep it short. Less than 1000 characters.
* Use Readme.<name>.md files for package-level maintainer documentation. Keep it less than 5000 character per file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 5000 character per files restriction is too strict. Better to be explicit. When I'm asking for some method description (last I asked about degree reduction), then it goes easily over 5000. Rather - let the user review the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we should also say that we should keep the documentation up to date:

  • check if the current package/file/function is described in any documentation (README..., docs.go). If so, then ensure you update it according with the method implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Readme. I think it is better to keep it short because the agent will read it whenever it goes across the package and it will end up eating their context space super fast. If we force him to work with small pieces of context then it helps it a lot. The coding file can be big, but the overall doc for the degree reduction does not need to be big. Just a good summary explaining what it does and the algo involved to get the LLM started on the package/file.

But maybe we could maintain "for-human" documentation separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh - I don't think agents need READMEs. In my experience, they work a lot better looking at the code and then figuring out themselves how the things work. Unless it is something very obscure, but imo for 90% cases it is sufficient they look at the code. Otherwise if we have accompanying READMEs then they just get stale at some point and the agents have to do 2x the work (first read README, then look at the code, understand there is a mismatch and then just look at the implementation directly).

So, for me, docs.go and README.<>.md are for humans. Anything useful for agents can be in the code itself (comments etc.), imo it would reduce the context they need as it is usually more dense than README.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having done a few experiments. I found that giving them high-level context before letting them dive in the code resulted in them getting faster through the code. When using them for debugging, I found it made a very stark difference. But I guess we can give up the "Readme" is for agent and find another solution for agents.

Maybe a .gitignore-d file per package where they compact their reading of a package and we make them use it as a cache for their context.

prover/AGENTS.md Outdated
* Use docs.go for general package-level user documentations. Keep it short. Less than 1000 characters.
* Use Readme.<name>.md files for package-level maintainer documentation. Keep it less than 5000 character per file.
* If you want to document more stuffs, factor out the documentation in more Readme files but keep them shorts.
* Use new structures and interfaces with parcimony. Extend existing one when possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo, for performance:

  • if allocating much from heap (1MB+) for returning, then allow providing the destination as a parameter. Only allocate on heap when the result doesn't fit or dst is nil.

Bad:

func double(x []int) []int {
    res := make([]int, len(x))
    for i := range x {
       res[i] = 2*x[i]
    }
    return res
}

Good:

func double(dst, x []int) []int {
    if len(dst) < len(x) {
       dst = append(dst, len(x)-len(dst))
    }
    for i := range x {
        res[i] = 2*x[i]
    }
    return dst
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree with that pattern on hot path where performance is a concern but I am a little bit afraid of what would come out of this advice is this get applied systematically. Perhaps, we should craft an optimization playbook separately where we consign our general experience at optimizing go code and let the agent crawl the code, profile and optimize whatever it can optimize. That would be a fun experiment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I think an optimization playbook would be good. Perhaps in a different section with instruction:

* these are common optimization patterns. Do not optimize early, if you see an obvious performance regression then notify user to apply any of the techniques.
* otherwise, keep a note `// XXX(agents): ` for a future optimization round

prover/AGENTS.md Outdated
* Use docs.go for general package-level user documentations. Keep it short. Less than 1000 characters.
* Use Readme.<name>.md files for package-level maintainer documentation. Keep it less than 5000 character per file.
* If you want to document more stuffs, factor out the documentation in more Readme files but keep them shorts.
* Use new structures and interfaces with parcimony. Extend existing one when possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another one for performance:

  • for parallel computation use parallel.Execute. Assume 100+ CPUs.
  • in compute-heavy parts avoid using channels and locks for single work scheduling due to overhead. Try to prepare uniform-shaped work chunks to run using parallel.Execute. For scheduling/work shaping you can use runtime.GOMAXPROCS(0)

- CI workflow: `.github/workflows/prover-testing.yml`
- Static checks run first (gofmt, golangci-lint)
- Compressor tests run separately from main test suite
- 30-minute timeout for test suite
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add:

  • ensure new functions are unit tested. If new function is part of larger computation pipeline, then ensure we have integration unit-tests.
  • unit tests should be useful - if it takes more than 30 seconds to run on user device, then make it faster (using mocking, smaller instances etc.).
  • in case tests need test data which cannot be synthezised then ask user. Otherwise, synchesize test data. If test-data is more than several kilobytes, then store it in testdata/ folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is section for testing (and also security) in AGEND.md. We could add these guidelines here. I would also add coverage metrics. Though, I would let it stick to unit-testing functions functions that are testable. For larger testdata, we should instruct it to implement a testdata generator.

@ivokub
Copy link
Contributor

ivokub commented Mar 3, 2026

And imo - after our rounds of inputs, it is better to ask any agent to optimize AGENTS.md for itself -- i.e. add a TOC, structure it better etc. so that it would be most useful for them. We can also ask to split into separate files i.e. if it allows to reduce the context space.

@AlexandreBelling
Copy link
Contributor Author

I have done a pass on all these aspect:

  • Stuffed up the performance section
  • Clarified the documentation rules
  • Clearer coding style guidelines
  • More LLM rules about not touching the circuits without explicit instruction to do so. But eagerly raising concerns.
  • Clarified the role of utils

@ivokub
Copy link
Contributor

ivokub commented Mar 3, 2026

I have done a pass on all these aspect:

  • Stuffed up the performance section
  • Clarified the documentation rules
  • Clearer coding style guidelines
  • More LLM rules about not touching the circuits without explicit instruction to do so. But eagerly raising concerns.
  • Clarified the role of utils

I really like the instructions! Good on my side!

ivokub
ivokub previously approved these changes Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants