AI Policy Thoughts

This post doesn’t propose a Beman AI policy directly; instead, it attempts to summarize previous discussions about AI contributions, survey the current state of open-source AI contribution policies, enumerate the relevant considerations, and provide suggestions for each of them.

Research

Statistics cited in the following sections were obtained using the list of policies maintained at melissawm/open-source-ai-contribution-policies on GitHub (a list of policies by different open source projects about how to engage with AI-generated contributions), and in some cases by having Claude Code analyze the linked policies.

Allowing AI Contributions

Although one Beman library author stated that he personally was leaning toward not using AI for his own library, we have not yet seen anyone argue for an outright Beman Project-wide ban on AI contributions. 49 out of 73 (67%) of the projects tracked by melissawm do not ban or severely restrict AI contributions.

I recommend that the Beman Project’s AI policy allow AI contributions.

Ownership

There seemed to be strong consensus that, although AI contributions should be allowed, they do not let contributors shirk responsibility for their commits; just because Claude Code generated a commit does not mean that Claude Code is responsible for maintaining it.

I recommend that the Beman Project’s AI policy include language similar to this language from LLVM’s AI policy:

“[…] there must be a human in the loop. Contributors must read and review all LLM-generated code or text before they ask other project members to review it. The contributor is always the author and is fully accountable for their contributions. Contributors should be sufficiently confident that the contribution is high enough quality that asking for a review is a good use of scarce maintainer time, and they should be able to answer questions about their work during review.”

However, I would also like this policy to be relaxed for Beman-related dev tools. For example, my bemanproject/beman-local-ci tool on GitHub (a tool for running the Beman CI matrix locally via Docker) is vibe-coded; I haven’t reviewed the code independently yet. While I think that approach would be problematic for C++ libraries intended for standardization, it may be acceptable for tools that enhance the CI ecosystem around those libraries.

Attribution

This was the most controversial topic at the April 13 sync meeting. I brought up that the Linux Kernel AI Coding Assistants policy guide has an Attribution section that states:

When AI tools contribute to kernel development, proper attribution helps track the evolving role of AI in the development process. Contributions should include an Assisted-by tag […]

I thought that “track[ing] the evolving role of AI in the development process” was a sufficient justification for including a similar requirement in the Beman Project’s AI policy. David Sankel disagreed, saying that the attribution functions as unpaid advertising for the AI tools, that the “Co-authored-by” tag waters down the human contributor’s ownership of the commit, and that the human initiation of the work matters more than the particular tools used.
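For concreteness, a commit message using both of the trailers just mentioned might look like the following. The library, change, and tool identity are purely illustrative, not a proposed Beman convention:

    fix(beman.example): handle self-assignment in operator=

    The previous implementation destroyed the contained value before
    checking for self-assignment.

    Assisted-by: Claude Code
    Co-authored-by: Claude <noreply@anthropic.com>
    Signed-off-by: Jane Developer <jane@example.com>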

Of the tracked open-source contribution policies that allow AI contributions without imposing severe restrictions, 32 out of 51 (63%) either require or encourage attribution of AI coding tools in code submissions. The justifications varied, but most cited a desire to reduce the reviewer burden from low-quality “AI slop” contributions. Some quotes from policies that ask for attribution include:

LLVM: “Our policy on labelling is intended to facilitate reviews, and not to track which parts of LLVM are generated[…] This transparency helps the community develop best practices and understand the role of these new tools[…] AI tools must not be used to fix GitHub issues labelled good first issue. These issues are generally not urgent, and are intended to be learning opportunities for new contributors to get familiar with the codebase.”

Arrow: “PRs that appear to be fully generated by AI with little to no engagement from the author may be closed without further review. […] Be upfront about AI usage and summarise what was AI-generated”

NumPy: “Do not waste developers’ time by submitting code that is fully or mostly generated by AI, and doesn’t meet our standards.”

On the other hand, the projects that didn’t require attribution mostly didn’t provide justification for not having it. Some quotes from the ones that did:

Flutter: “Why does it matter how the code was created; shouldn’t the code speak for itself?”

Oxide: “Volunteering that an LLM has been used to generate work product may implicitly distance oneself from the responsibility for the content”

One argument I haven’t raised yet is that, although I’ve heard quibbles about the importance/viability of this goal, I want standard library maintainers to be able to incorporate code directly from Beman implementations into standard library implementations instead of rewriting everything from scratch.

Of the three major standard library implementations, libstdc++ does not have a policy yet, but appears to be leaning toward adopting the binutils policy, which bans AI contributions; libc++ requires attribution; and MSVC’s STL appears to have no policy thus far.

As a result, Beman libraries omitting AI attribution may conflict with standard library implementations’ policies.

My recommendation is for a Beman Project AI policy to include language that strongly encourages contributors to add AI tool attribution for substantially AI-generated commits, but does not require a specific mechanism or tag for doing so.

Licensing and Copyright Considerations

Projects that forbid AI contributions typically cite the uncertain legal landscape around the copyrightability of AI-generated code. My perspective is that, despite this uncertainty, the landscape of AI policy documents shows substantial precedent for allowing these contributions in open-source projects, including one of the most relevant projects for our use case, LLVM. The AI policies of these projects typically include language to the effect that, independently of whether their contribution is AI-generated, contributors are responsible for ensuring that their contributions do not include any copyrighted material. Beman’s AI policy should include similar language.

I think AI use should be permitted for Beman libraries and should not require any attribution. Labeling which parts were generated by AI creates busy-work that does not seem helpful beyond satisfying academic curiosity. Consider the policies you’re quoting from:

Arrow: “PRs that appear to be fully generated by AI with little to no engagement from the author may be closed without further review.”

The problem isn’t that the PR is generated by AI, but that it’s a low-effort contribution that wastes reviewer time in general.

NumPy: “Do not waste developers’ time by submitting code that is fully or mostly generated by AI, and doesn’t meet our standards.”

The problem is that the submitted code doesn’t meet the standard of the project, not that it’s AI-generated.

In general, I feel like there’s an underlying stigma against the use of AI because some people use it irresponsibly, and the attempted fix of AI attribution or bans does nothing to fix the human part of the problem, which is the actual cause. This is like mandating that you’re not allowed to use a specific text editor, or must disclose its use, because you’ve experienced overwhelmingly negative contributions from, say, VSCode users.

AI attribution is impractical

From a practical standpoint, I regularly use AI to generate test cases for human-written code, as well as to proofread code for obvious bugs and typos, and sometimes for not-so-obvious bugs. It’s quite amazing that you can tell an agent “this function doesn’t work, fix it” and it just goes and figures it out. For these use cases, how do I even begin to do attribution? Do I add Claude as a co-author because it found a typo in my code? Do I attribute my PR to Claude because I told it to generate a bunch of EXPECT_EQ calls for a math function from its documentation?
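To make the test-generation case concrete, this is roughly the kind of output I mean. It’s a sketch: the header, function, and values are hypothetical, using GoogleTest:

    #include <gtest/gtest.h>

    #include "beman/example/math.hpp"  // hypothetical header under test

    // Generated by an agent from the documented contract of clamp_angle():
    // "normalizes an angle in degrees to the half-open range [0, 360)".
    TEST(ClampAngle, NormalizesOutOfRangeInputs) {
      EXPECT_EQ(beman::example::clamp_angle(0), 0);
      EXPECT_EQ(beman::example::clamp_angle(360), 0);
      EXPECT_EQ(beman::example::clamp_angle(725), 5);
      EXPECT_EQ(beman::example::clamp_angle(-90), 270);
    }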

Attribution is easy when the entire PR is a big pile of slop, but it’s not so easy when specific parts are AI-generated.

  • Does Copilot’s inline autocompletion count?
  • Do agentic bug fixes (that only update, but don’t add new code) count?
  • Do agentic refactorings count, like using an agent to resolve linter diagnostics that don’t have an auto-fix already?
  • What fraction of code specifically needs to be AI-generated for attribution to be necessary?

My concern is also attribution maintenance. If I generate a bunch of EXPECT_EQ calls for testing initially, does the attribution ever disappear? There is a “ship of Theseus” effect with a lot of code: at some point, there may be no AI-generated line left, even though the test case was 100% slop initially. Does that part of the code ever stop being attributed to AI? This is especially relevant if your policy is to label AI-generated code with comments, as in the sketch below.
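A hypothetical comment-based label, for instance, can easily outlive every line it was written for:

    // AI-generated (Claude): initial test coverage.
    // Hypothetical label: every assertion below has since been rewritten
    // by hand after an API change, but nothing prompts its removal.
    TEST(ClampAngle, HandlesNegativeMultiples) {
      EXPECT_EQ(beman::example::clamp_angle(-450), 270);
    }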

AI attribution is futile

  • If even one use of Copilot autocompletion or even one buggy line fixed by an AI agent labels the entire PR as “partially AI-generated”, that’s an easy rule to follow, but the information collected is almost entirely useless because it lacks granularity. You could easily end up with 50% of commits in your code base being “partially AI-generated” and you would have no idea how much code is actually affected.
  • If you make the threshold based on personal judgment, the data gathered is subjective and largely useless.
  • If you expect high-quality, granular AI attributions, either it becomes substantial, time-consuming busy-work that adds no direct value to the project, or people won’t meet a bar that has been set too high. Remember that even seemingly basic expectations like high-quality commit messages turn into chores, and granular, objective AI attribution is an even harder ask.

All of this also assumes that people are acting in good faith and won’t just omit attributions out of laziness or shame, and this seems in contradiction with the “AI slop” and “lazy developers” stigma surrounding AI contributions.

Diluting accountability

I also 100% agree with this:

Oxide: “Volunteering that an LLM has been used to generate work product may implicitly distance oneself from the responsibility for the content”

I could distance myself from those generated test cases by saying Claude wrote those, but I’d rather not have that option. AI attribution dilutes accountability and responsibility. AI is just a tool to write code, just like a text editor.

I can equally make a big mess in my code by doing a RegEx-based search-and-replace throughout a codebase without carefully reviewing all the places affected. However, it’s perfectly clear that I am responsible for this action and that I don’t get to say:

Oh, the RegExp-replace did that. That’s not my work.
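For example, a blunt project-wide rename like this hypothetical one can silently clobber unrelated identifiers, and the responsibility for the fallout is still unambiguously mine:

    # Rename a member, but also mangle every other "size" in the tree:
    # comments, local variables, and std::size() calls included.
    sed -i 's/size/count/g' $(git ls-files '*.hpp' '*.cpp')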

Conclusion

I’m not convinced anyone has found a good policy on AI bans and attributions, not even close. At best, you can pick some low-hanging fruit, like:

Don’t submit a 1000-line diff that you’ve AI-generated and never read, or at least label it as slop.

But this is already obvious to most people without any kind of policy. For all the hard and nuanced questions, you have to rely on personal judgment, which you may as well make your policy.


Pretty much agree with the entire track here.

I basically said something similar about attribution: at some level, it’s impossible to tell the difference. I don’t know that we want to go to a never-attribute stance; I think it’s fine if we allow attribution, but it’s going to get out of hand quickly. As an example, we discussed AI reviews in the sync meeting. I’ve been trying this out with various models and I can absolutely see the value there. It spotted a few things I overlooked (I looked first, before being biased by the AI). If I open an issue for the maintainer, it’s unlikely I’m going to credit the AI, because I’ll tune the output anyway. This is a wholly responsible use of the tools, of course.

I think after last week Eddie was going to try to craft a PR on the policy.