Add an LLM policy for `rust-lang/rust` #1040
r? @jieyouxu

rustbot has assigned @jieyouxu.

@rustbot label T-libs T-compiler T-rustdoc T-bootstrap
## Summary
[summary]: #summary

This document establishes a policy for how LLMs can be used when contributing to `rust-lang/rust`. Subtrees, submodules, and dependencies from crates.io are not in scope. Other repositories in the `rust-lang` organization are not in scope.

This policy is intended to live in [Forge](https://forge.rust-lang.org/) as a living document, not as a dead RFC. It will be linked from `CONTRIBUTING.md` in rust-lang/rust as well as from the rustc- and std-dev-guides.

## Moderation guidelines

This PR is preceded by [an enormous amount of discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/588130-project-llm-policy). Almost every conceivable angle has been discussed to death; there have been upwards of 3000 messages, not even counting discussion on GitHub. We initially doubted whether we could reach consensus at all.

Therefore, we ask to bound the scope of this PR specifically to the policy itself. In particular, we mark several topics as out of scope below. We still consider these topics to be important; we simply do not believe this is the right place to discuss them.

No comment on this PR may mention the following topics:

- Long-term social or economic impact of LLMs
- The environmental impact of LLMs
- Anything to do with the copyright status of LLM output
- Moral judgements about people who use LLMs

We have asked the moderation team to help us enforce these rules.

## Feedback guidelines

We are aware that parts of this policy will make some people very unhappy. As you are reading, we ask you to consider the following.

- Can you think of a *concrete* improvement to the policy that addresses your concern? Consider:
  - Whether your change will make the policy harder to moderate
  - Whether your change will make it harder to come to a consensus
- Does your concern need to be addressed before merging or can it be addressed in a follow-up?
- Keep in mind the cost of *not* creating a policy.

### If your concern is for yourself or for your team

- What are the *specific* parts of your workflow that will be disrupted?
  - In particular we are *only* interested in workflows involving `rust-lang/rust`. Other repositories are not affected by this policy and are therefore not in scope.
- Can you live with the disruption? Is it worth blocking the policy over?

---

Previous versions of this document were discussed on Zulip, and we have made edits in response to suggestions there.

## Motivation
[motivation]: #motivation

- Many people find LLM-generated code and writing deeply unpleasant to read or review.
- Many people find LLMs to be a significant aid to learning and discovery.
- `rust-lang/rust` is currently dealing with a deluge of low-effort "slop" PRs primarily authored by LLMs.
- Having *a* policy makes these easier to moderate, without having to take every single instance on a case-by-case basis.

This policy is *not* intended as a debate over whether LLMs are a good or bad idea, nor over the long-term impact of LLMs. It is only intended to set out the future policy of `rust-lang/rust` itself.

## Drawbacks
[drawbacks]: #drawbacks

- This bans some valid usages of LLMs. We intentionally err on the side of banning too much rather than too little in order to make the policy easy to understand and moderate.
- This intentionally does not address the moral, social, and environmental impacts of LLMs. These topics have been extensively discussed on Zulip without reaching consensus, but this policy is relevant regardless of the outcome of these discussions.
- This intentionally does not attempt to set a project-wide policy. We have attempted to come to a consensus for upwards of a month without significant progress. We are cutting our losses so we can have *something* rather than ad hoc moderation decisions.
- This intentionally does not apply to subtrees of rust-lang/rust. We don't have the same moderation issues there, so we don't have time pressure to set a policy in the same way.

## Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

- We could create a project-wide policy, rather than scoping it to `rust-lang/rust`. This has the advantage that everyone knows what the policy is everywhere, and that it's easy to make things part of the mono-repo at a later date. It has the disadvantage that we think it is nigh-impossible to get everyone to agree. There are also reasons for teams to have different policies; for example, the standard for correctness is much higher within the compiler than within Clippy.
- We could have a more strict policy that removes the [threshold of originality](https://fsfe.org/news/2025/news-20250515-01.en.html) condition. This has the advantage that our policy becomes easier to moderate and understand. It has the disadvantage that it becomes easy for people to intend to follow the policy, but be put in a position where their only choices are to either discard the PR altogether, rewrite it from scratch, or tell "white lies" about whether an LLM was involved.
- We could have a more strict policy that bans LLMs altogether. It seems unlikely we will be able to agree on this, and we believe attempting it will cause many people to leave the project.

## Prior art
[prior-art]: #prior-art

This prior art section is taken almost entirely from [Jane Lusby's summary of her research](rust-lang/leadership-council#273 (comment)), although we have taken the liberty of moving the Rust project's prior art to the top. We thank her for her help.

### Rust

- [Moderation team's spam policy](https://github.com/rust-lang/moderation-team/blob/main/policies/spam.md/#fully-or-partially-automated-contribs)
- [Compiler team's "burdensome PRs" policy](rust-lang/compiler-team#893)

### Other organizations

These are organized along a spectrum of AI friendliness, where top is least friendly, and bottom is most friendly.

- full ban
  - [postmarketOS](https://docs.postmarketos.org/policies-and-processes/development/ai-policy.html)
    - also explicitly bans encouraging others to use AI for solving problems related to postmarketOS
    - multi-point, ethics-based rationale with citations included
  - [zig](https://ziglang.org/code-of-conduct/)
    - philosophical, cites [Profession (novella)](https://en.wikipedia.org/wiki/Profession_(novella))
    - rooted in concerns around the construction and origins of original thought
  - [servo](https://book.servo.org/contributing/getting-started.html#ai-contributions)
    - more pragmatic, directly lists concerns around AI, fairly concise
  - [qemu](https://www.qemu.org/docs/master/devel/code-provenance.html#use-of-ai-content-generators)
    - pragmatic, focuses on copyright and licensing concerns
    - explicitly allows AI for exploring APIs, debugging, and other non-generative assistance; other policies do not explicitly ban this or mention it in any way
- allowed with supervision, human is ultimately responsible
  - [scipy](https://github.com/scipy/scipy/pull/24583/changes)
    - strict attribution policy including name of model
  - [llvm](https://llvm.org/docs/AIToolPolicy.html)
  - [blender](https://devtalk.blender.org/t/ai-contributions-policy/44202)
  - [linux kernel](https://kernel.org/doc/html/next/process/coding-assistants.html)
    - quite concise but otherwise seems the same as many in this category
  - [mesa](https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/docs/submittingpatches.rst)
    - framed as a contribution policy, not an AI policy; AI is listed as a tool that can be used, but the policy emphasizes the same requirement that authors must understand the code they contribute, and seems to leave room for partial understanding from new contributors.
      > Understand the code you write at least well enough to be able to explain why your changes are beneficial to the project.
  - [forgejo](https://codeberg.org/forgejo/governance/src/branch/main/AIAgreement.md)
    - bans AI for review, does not explicitly require contributors to understand code generated by AI. One could interpret the "accountability for contribution lies with contributor even if AI is used" line as implying this requirement, though their version seems poorly worded imo.
  - [firefox](https://firefox-source-docs.mozilla.org/contributing/ai-coding.html)
  - [ghostty](https://github.com/ghostty-org/ghostty/blob/main/AI_POLICY.md)
    - pro-AI but views "bad users" as the source of issues with it and the only reason for what ghostty considers a "strict AI policy"
  - [fedora](https://communityblog.fedoraproject.org/council-policy-proposal-policy-on-ai-assisted-contributions/)
    - clearly inspired and is cited by many of the above, but is definitely framed more pro-AI than the derived policies tend to be
  - [curl](https://curl.se/dev/contribute.html#on-ai-use-in-curl)
    - does not explicitly require that humans understand contributions; otherwise the policy is similar to the above policies
  - [linux foundation](https://www.linuxfoundation.org/legal/generative-ai)
    - encourages usage, focuses on legal liability, mentions that tooling exists to help automate managing legal liability, does not mention specific tools
- In progress
  - NixOS
    - NixOS/nixpkgs#410741

## Unresolved questions
[unresolved-questions]: #unresolved-questions

See the "Moderation guidelines" and "Drawbacks" sections for a list of topics that are out of scope.
I really like this version, and thanks a ton for working on it. Specifically:
- It doesn't try to dump entire walls of text, which is unfortunately a good way to be sure nobody reads it. Instead, it gives you concrete examples, and a guiding rule-of-thumb for uncovered scenarios, and acknowledges upfront that it surely cannot be exhaustive.
- I also like where it points out the nuance and recognizes the uncertainties.
- I like that it covers both "producers" and "consumers" (with nuance that reviewers can also technically use LLMs in ways that are frustrating to the PR authors!)
I left a few suggestions / nits, but even without them this is still a very good start IMO.
(Will not leave an explicit approval until we establish wider consensus, which likely will take the form of 4-team joint FCP.)
The links to Zulip are project-private, FWIW.

I'm aware. This PR is targeted towards Rust project members more so than the broad community.

I'm going to unlock this now that RustWeek is over and we're not all busy.

Just attempting to make sense of the multi-team FCP. Not mentioning folks directly to avoid noise. Using the N-2-per-team recommendations that folks have mentioned, here are the teams that have passed these requirements compared to those who have not:

- Types (requirements met; N-2)
- Rustdoc (requirements met; unanimous)
- Compiler (requirements met; N-2)
- Libs (requirements not met; N-3)

Note that regardless, there is still one outstanding concern before FCP can proceed. This is mostly just trying to grapple with progress since a lot of people expressed that the 4-team FCP is confusing, and, yeah, it is.

Also, ironically, despite the fact that many members have expressed how it would be nice to have boxes to represent the various "hats" they're sharing, the only people who cross the FCP pools here appear to be Oli/oli-obk (who started the FCP) and Santiago/spastorino, who has not yet checked any boxes. So, it seems unlikely that this issue will be a problem in this case, since even though the Types/Compiler teams do overlap a lot, the FCP pools only overlap for these two people.
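The N-2 tally above can be expressed as a small check. This is a minimal sketch, assuming "N-2" means that at most two members of a team's FCP pool have left their box unchecked. The function names are hypothetical, and the pool sizes in the usage note are made up rather than taken from the real team rosters.

```python
def meets_threshold(pool_size: int, checked: int, allowed_missing: int = 2) -> bool:
    """Return True if at most `allowed_missing` pool members have not checked their box.

    Assumption: "N-2" means N pool members, with up to 2 unchecked boxes tolerated.
    """
    if not 0 <= checked <= pool_size:
        raise ValueError("checked must be between 0 and pool_size")
    return pool_size - checked <= allowed_missing


def all_teams_ready(teams: dict[str, tuple[int, int]], allowed_missing: int = 2) -> bool:
    """`teams` maps a team name to (pool_size, checked); every team must pass."""
    return all(
        meets_threshold(size, checked, allowed_missing)
        for size, checked in teams.values()
    )
```

For a hypothetical pool of 10, `meets_threshold(10, 8)` passes (N-2) while `meets_threshold(10, 7)` does not (N-3). This only models the checkbox count; as noted above, an outstanding concern still blocks the FCP regardless of the tally.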
By the same rationale as [this Zulip thread](https://rust-lang.zulipchat.com/#narrow/channel/392734-council/topic/relax.20requirements.20for.20a.20cross-team.20FCP.20to.20be.20accepted/near/594902619):

> i don't want a situation where 100% of t-compiler votes and no one on t-clippy gets a chance to look at it because their team is small.

Co-authored-by: Ada Alakbarova <58857108+ada4a@users.noreply.github.com>
> Authors are expected to review their own code before posting and after each change.
> - 💡 See the [dev-guide][llm-guidance] for additional suggestions.
>
> LLM-created PRs must be tagged with a new `ai-assisted` label.
I wanted to raise a point specifically around using a PR label instead of something like a Co-Authored By: or other commit message.
With a PR label, the provenance of LLM contributions is going to get lost outside GitHub and will not exist in the git history. For automation or searching of LLM attribution, we will be reliant on the GitHub API to filter PRs that were labeled and then cross-reference them with git commits. Other projects have specifically gone with the git commit addition because the LLM provenance of commits then exists within the history. It remains clear what code was submitted by an LLM. The label approach will not make this clear outside the GitHub UI.
Additionally, the Co-Authored-By or similar commit additions also generally include the model name, for further attribution. The label excludes that.
one possibility here is to have a bot that automatically edits the label into the PR description. then bors will automatically pick it up in the merge commit, and we can filter it through the git history that way.
my only concern here is around brigading, i worry that people outside the project will — in a year, two, three years from now — go through the history to harass people. at least on PRs we're able to lock the thread, i don't think github has any kind of moderation controls on commits themselves, even though it lets people comment.
putting this only on the bors merge commit makes me feel a little better about things, since it naturally directs people to the pr but ... idk. it kinda worries me that putting this in the commit is so permanent. we can't remove these without rewriting history for the whole main branch.
let me ask: what's your goal with establishing provenance like this?
There are a few scenarios where I'd find it useful: one is getting immediate information and answering questions, and another is more threat-model based, covering future possibilities where I'd feel uncomfortable painting ourselves into a corner.
I think for answering basic questions like:
- "what is the entirety of code changes conducted by LLM in this project?"
- "What models are being used in our project?"
- "What code was committed by a specific model found to be problematic?"
Given the experimental allowances here, we are making it prohibitively difficult to answer any meaningful questions about the experiment with objective data. I assume during the term of the experiment, we want to be able to answer questions like what models were used, how each model performed, what areas they seemed to excel or struggle in, etc. We can still distill this from the PRs alone; however, that information will be non-uniform and ad-hoc, making it an entirely manual exercise in gathering data. I worry we are limiting ourselves to subjective results and/or excessive work for answering anything like that as the experiment progresses.
Looking into the future, we can threat model around potential circumstances where we need this information. I can iterate here if desired, but the main burning concern I have for an actionable, and likely, future need is: What if a model is deemed problematic, and we need to remove all contributions by it? I could see a government, or the project, making this kind of call in the near future (ex: the US government bans a model, or the project wants to ban a racist model, or a model is found to be maliciously manipulated). I think we need a built-in mechanism to respond to that.
As written, this document strictly lets us identify what PRs had an LLM involved - we can't tangibly answer any of the above without entirely manual and social mechanisms (filter `ai-assisted`, contact the author, ask what models they used, ask what code was written by the LLM, etc.).
> What if a model is deemed problematic, and we need to remove all contributions by it?

I simply do not think this is realistic. This is on-par with "What if we want to switch the license and everyone but a single contributor agrees, how can we rip out their code?" Code doesn't work like that. Changes are built on top of each other; rewriting the change "as-if" a previous change had never existed is usually tantamount to a full rewrite.
I would like to emphasise that, in many countries, LLMs are not "authors" from a legal point of view. Per European law, it's pretty clear that author rights apply to original human creation (the degree of acceptable assistance is debatable, but in any case, the AI by itself isn't an author). Therefore, it would be misleading to use Co-authored-by for AI-generated code. At the very least, it should be named something else… But in that case, I also feel like it's kind of an ad.
I don't think we need to tag each commit. I do think we should arrange something in the bors merge message, to make data gathering easier as part of the LLM experiment.
IANAL but I don't think the legality of authorship is implied by bots (LLM or not) using the Co-Authored-By:/Author: field, just like no one questioned the commits "authored" by bors or dependabot.
That said, I think Co-Authored-By: should not be enforced. Besides being ad-like, these LLM Co-Authored-By: trailers in the wild mostly come from Claude Code and Copilot only. Other tools like Codex or Antigravity don't add a commit attribution trailer (unless given custom instructions), so for auditing use you'll by default miss half of the picture. Plus, tools like Copilot and Cursor and OpenCode attribute Co-Authored-By: to the tools themselves despite actually calling someone else's model, making it even less useful to answer "What models are being used in our project?".
> I don't think we need to tag each commit. I do think we should arrange something in the bors merge message, to make data gathering easier as part of the LLM experiment.

If the disclosure is written in the PR description, it will be automatically included in the merge commit. In that case, `git log --grep` or another text analysis tool should be enough for auditing.
But this means this guideline needs to specify that the disclosure must be placed within the PR description rather than a separate comment. 🤔
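The auditing idea above can be sketched as a text filter over merge commit messages. This is a minimal sketch under loud assumptions: the `LLM-disclosure:` line is a hypothetical marker (the policy text quoted in this thread defines no disclosure format), and `find_disclosures` is a made-up helper. It relies only on the observation above that the PR description ends up in the merge commit message.

```python
import re

# Hypothetical marker line; the policy does not define a disclosure format.
DISCLOSURE = re.compile(
    r"^LLM-disclosure:[ \t]*(?P<detail>.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def find_disclosures(commits: dict[str, str]) -> dict[str, str]:
    """Map commit hash -> disclosure detail for messages that carry the marker.

    `commits` maps a commit hash to its full message, for example as
    collected from the output of `git log --merges`.
    """
    found = {}
    for sha, message in commits.items():
        match = DISCLOSURE.search(message)
        if match:
            found[sha] = match.group("detail").strip()
    return found
```

Run over the messages of merge commits, this would answer "which merges disclosed LLM use?" without the GitHub API. A disclosure posted as a separate PR comment would never reach the commit message, which is why the placement of the disclosure matters.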
> But this means this guideline needs to specify that the disclosure must be placed within the PR description rather than a separate comment. 🤔

I agree. The core of this is that we are framing this as an experiment to see how the policy and LLMs perform for code contributions, with no way to attribute what was used during the experiment.
> I don't think we need to tag each commit. I do think we should arrange something in the bors merge message, to make data gathering easier as part of the LLM experiment.

This compromise would at least meet our needs for data gathering. At a much less extreme level, I really just want to be able to see what code was written by what model during this experiment. Anecdotal feedback from the users is also useful, but I think the future opportunity to see how specific models perform objectively is going to be a data point we want out of this experiment. Having this somewhere machine-readable (either the GitHub API or the git log) is an acceptable compromise.
The model is useful, but personally the thing I see as most critical is recording whether a merge was of LLM-assisted code at all.
at @joshtriplett's behest, I'd like to say: we think we have a path forward here, but I'm too busy this week to write it up. I'll try to write something down by next Friday.

> Minor changes, such as typo fixes, only require a normal PR approval.
> Major changes, such as adding a new rule or canceling an existing rule, require a successful MCP (2 approvals and no concerns) from each team that ratified the policy.

Speaking as a compiler team member, I would be very uncomfortable binding the team with a policy that cannot be changed/removed except with a lack of concerns from any teams that ratified the policy.
Even a lack of concerns within the compiler team is difficult to overcome for a decision with any kind of controversy, and this topic is guaranteed to have that. The compiler team is large, and there was some recent discussion about whether MCPs bias us for inaction. For purely technical matters I think they are okay, but for policy matters like this they are not a good fit.
If the text bakes in "lack of concerns", it effectively cements a bias for inaction, which I think would be a mistake. It would be better to defer to the teams' lightweight policy making processes to the extent they have them, building in a fallback option for when they do not.
Finally, the part that the text leaves unclear is whether a team is allowed to withdraw without the agreement of the other teams. I think the answer is certainly yes, and the text should be explicit about it.
I don't think that a lack of coordination between the involved teams is really possible with a policy that binds so tightly to a shared repo. Just considering the standard library and the compiler, for example, I don't think it's really possible to have standards for review and merging that drift too far apart, considering the fact that changes to one have massive effects on the other, even excluding changes that touch both.
Perhaps rustdoc could potentially be the one team that successfully removes themselves from the repo and starts being managed as a subtree, but in terms of compiler and libs, I don't really see a lack of coordination being possible.
Also re: "lack of concerns": while it is not public yet, libs has plans to make concerns on ACPs explicitly overridable by an FCP, so that the process degrades into an FCP in case of controversy but remains relatively frictionless when there's no controversy. I think this is a fair way to handle this.
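The amendment rule under discussion ("2 approvals and no concerns" from each team that ratified the policy) can be sketched as follows. This is an illustrative model only: the `TeamVote` type and both function names are hypothetical, and the team names in the usage note are placeholders. It makes the objection above concrete, since a single concern in any one team blocks a major change.

```python
from dataclasses import dataclass


@dataclass
class TeamVote:
    approvals: int
    concerns: int


def mcp_passes(vote: TeamVote, required_approvals: int = 2) -> bool:
    """One team's MCP succeeds with enough approvals and zero concerns."""
    return vote.approvals >= required_approvals and vote.concerns == 0


def major_change_accepted(votes: dict[str, TeamVote]) -> bool:
    """A major change needs a successful MCP from every ratifying team."""
    return bool(votes) and all(mcp_passes(vote) for vote in votes.values())
```

With `{"compiler": TeamVote(5, 1), "libs": TeamVote(3, 0)}` the change is rejected despite eight approvals, because one concern is enough to fail the compiler team's MCP. An FCP-style override of concerns, as described for libs above, would replace the `concerns == 0` condition with a different rule.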
Rendered
FCP link
## Summary

This document establishes a policy for how LLMs can be used when contributing to `rust-lang/rust`. Subtrees, submodules, and dependencies from crates.io are not in scope. Other repositories in the `rust-lang` organization are not in scope.

This policy is intended to live in Forge as a living document, not as a dead RFC. It will be linked from `CONTRIBUTING.md` in rust-lang/rust as well as from the rustc- and std-dev-guides.

## Ethical issues

See this thread.

## Moderation guidelines

This PR is preceded by an enormous amount of discussion on Zulip. Almost every conceivable angle has been discussed to death; there have been upwards of 3000 messages, not even counting discussion on GitHub. We initially doubted whether we could reach consensus at all.

Therefore, we ask to bound the scope of this PR specifically to the policy itself. In particular, we mark several topics as out of scope below. We still consider these topics to be important; we simply do not believe this is the right place to discuss them.

So, the following are considered off topic for this PR specifically:

We have asked the moderation team to help us enforce these rules. For an extended rationale, please see this comment.

## Feedback guidelines

We are aware that parts of this policy will make some people very unhappy. As you are reading, we ask you to consider the following.

### If your concern is for yourself or for your team

- In particular we are *only* interested in workflows involving `rust-lang/rust`. Other repositories are not affected by this policy and are therefore not in scope.

Previous versions of this document were discussed on Zulip, and we have made edits in response to suggestions there.

## Motivation

- `rust-lang/rust` is currently dealing with a deluge of low-effort "slop" PRs primarily authored by LLMs.

This policy is not intended as a debate over whether LLMs are a good or bad idea, nor over the long-term impact of LLMs. It is only intended to set out the future policy of `rust-lang/rust` itself.

## Drawbacks

## Rationale and alternatives

- We could create a project-wide policy, rather than scoping it to `rust-lang/rust`. This has the advantage that everyone knows what the policy is everywhere, and that it's easy to make things part of the mono-repo at a later date. It has the disadvantage that we think it is nigh-impossible to get everyone to agree. There are also reasons for teams to have different policies; for example, the standard for correctness is much higher within the compiler than within Clippy.

## Prior art

This prior art section is taken almost entirely from Jane Lusby's summary of her research, although we have taken the liberty of moving the Rust project's prior art to the top. We thank her for her help.

### Rust

### Other organizations

These are organized along a spectrum of AI friendliness, where top is least friendly, and bottom is most friendly.

## Unresolved questions

See the "Moderation guidelines" and "Drawbacks" sections for a list of topics that are out of scope.