Our work - Research

Unlocking the transformative potential of AI in an enduring and equitable way requires pioneering work on AI infrastructure. Our research is focused on addressing key tensions involved in the process of responsibly rolling out increasingly capable agents across society.

Interaction Paradigms for Model Evaluation

Author: Paul Bricman, CEO

The evaluation of AI models is not merely a technical exercise; it is a balancing act among security, intellectual property, efficiency, and transparency. Different evaluation arrangements trade these factors off in different ways, and understanding those trade-offs is crucial for effective AI governance.

Privacy-Preserving Benchmarks for High-Stakes AI Evaluation

Author: Paul Bricman, CEO

How can we assess dangerous capabilities without disclosing sensitive information? Traditional benchmarks are like exams for AI, complete with reference solutions. However, a benchmark on bioterrorism would amount to a public FAQ on a sensitive topic.

Become a Challenger.

Challengers are individuals who can push frontier models to their absolute limits. They're passionate about the integrity of digital, biological, and social systems, and are stress-testing our simulators across cybersecurity, biosecurity, and beyond — for fun and profit.