- SF Bay Area
Highlights
- Pro
Popular repositories
- sandbagging_auditing_games (Public, forked from AI-Safety-Institute/sandbagging_auditing_games)
This repository accompanies the research paper "Sandbagging Auditing Games" on detecting sandbagging in frontier AI systems. We provide access to the model organisms used in the paper and tools for…
Python
- inspect_petri (Public, forked from meridianlabs-ai/inspect_petri)
An alignment auditing agent capable of quickly exploring alignment hypotheses
Python



