When Copilots Take the Wheel

by Paul Bricman, CEO

  • Agents
  • Identity
  • Security

At the time of writing, the "assistant" paradigm still prevails across the AI landscape. This paradigm is characterized by a user interacting with an AI system in a conversational setting, with the AI system providing brief responses to the user's messages. Instances of this paradigm rely heavily on repurposing the public's prior familiarity with the mechanics of human-to-human instant messaging, now put in service of accessing virtual assistants. The turn-taking interaction pattern is well-suited for a range of short-horizon contexts, from self-guided learning to coding assistance, and from research ideation to customer service.

However, a new paradigm is emerging. Increasingly, frontier labs are gearing up to roll out AI systems that are more autonomous, more proactive, and more capable of carrying out complex tasks over extended periods of time by natively hooking into computer peripherals.1 Such "agents" would eventually rely more on our mental model of a colleague working remotely, using their own machine equipped with a full-fledged toolkit of existing apps. Agents are expected to be more capable of carrying out tasks that require a longer time horizon, such as end-to-end project management, software development, or scientific research.

The Challenge Endures

Previously, we've discussed the pressing need for fine-grained access control around AI capabilities. When considering how the issue can be addressed under the two paradigms, the answers do differ, but only superficially. When it comes to assistants, users would interact with the AI system through a chat interface. At some point, they might organically run into a protected capability, and so potentially trigger an authorization journey to gain access to it.

Key idea

Back in 2022, OpenAI researchers described a model which "uses the native human interface of keypresses and mouse movements, making it quite general, and represents a step towards general computer-using agents."2

In the case of agents, however, the interaction would unfold somewhat differently. When a user delegates a task to an agent, the agent would go on to carry it out autonomously, potentially running into a protected capability along the way. For instance, a user might delegate the task of implementing a new product feature to an agent. In the process, the agent might attempt to screen a code repository for security vulnerabilities. At that point, the user who initially delegated the task might be prompted to authorize the agent to access the protected capability on their behalf, if they haven't done so already.
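To make the delegated flow concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the names `GrantStore`, `use_capability`, `AuthorizationRequired`, and the `ask_user` callback are all hypothetical, and the in-memory set stands in for whatever identity provider would hold grants in practice.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


class AuthorizationRequired(Exception):
    """Raised when the delegating user declines to authorize a capability."""

    def __init__(self, agent_id: str, capability: str):
        super().__init__(f"{agent_id} is not authorized for {capability!r}")
        self.agent_id = agent_id
        self.capability = capability


@dataclass
class GrantStore:
    """Hypothetical in-memory record of (user, agent, capability) grants."""

    grants: set = field(default_factory=set)

    def is_granted(self, user_id: str, agent_id: str, capability: str) -> bool:
        return (user_id, agent_id, capability) in self.grants

    def grant(self, user_id: str, agent_id: str, capability: str) -> None:
        self.grants.add((user_id, agent_id, capability))


def use_capability(
    store: GrantStore,
    user_id: str,
    agent_id: str,
    capability: str,
    ask_user: Callable[[str, str], bool],
) -> str:
    """Use a protected capability on the user's behalf.

    The delegating user is prompted only if no grant exists yet
    ("if they haven't done so already").
    """
    if not store.is_granted(user_id, agent_id, capability):
        if not ask_user(agent_id, capability):
            raise AuthorizationRequired(agent_id, capability)
        store.grant(user_id, agent_id, capability)
    # Placeholder for the actual protected operation.
    return f"{agent_id} used {capability} on behalf of {user_id}"
```

In this sketch, the first time the agent reaches the vulnerability-screening step the user is prompted once; later uses of the same capability proceed without interruption, and a refusal stops the agent at that step rather than at the start of the task.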

Towards Healthy Boundaries

There's another way of framing the path to confidently deploying agents across delicate environments such as enterprise settings, public sector, and critical infrastructure. On one hand, unlocking the full potential of agents requires that we provide them with the necessary space, autonomy, and flexibility to tackle complex tasks in creative ways — perhaps even in superhuman ways. On the other hand, we need to also be intentional in setting up healthy boundaries when working with agents, ensuring that interfering with mission-critical systems is robustly off-limits.

If we blindly follow the path of least resistance into the agent paradigm, however, we run the risk of setting up unhealthy boundaries from the get-go. It would be easiest for a user to simply hand over the steering wheel to an agent, granting it full access to their workspace through computer peripherals, and letting it do the job while they go on to focus on other tasks. Similarly, it would be easiest for a user to indiscriminately grant the agent access to all their documents, accounts, and affordances, in the hope that the agent would be able to jump right into the task at hand.

However, the path of least resistance might not be sustainable in the long run. Without healthy boundaries based on non-human identity management, the lack of observability, guarantees, and control around agents might quickly add up to deployments that are opaque, unpredictable, and generally untrustworthy. There is a need for tools that would allow users to precisely carve out a space for agents to operate in, while also providing them with the necessary flexibility to carry out sophisticated tasks. This will require a robust identity and access management solution that is tailored to the unique demands of AI systems, and that is able to provide users with the necessary oversight and control over the agents' domains of operation.
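As a rough illustration of what carving out such a space could look like, the sketch below gives an agent its own identity with explicit allow and deny patterns, denies everything else by default, and records every decision for later review. The names `AgentScope` and `ScopedGate` and the resource paths are hypothetical, chosen only for this example. Pattern matching uses `fnmatchcase` from Python's standard library, where `*` also matches path separators.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class AgentScope:
    """Hypothetical scope attached to a non-human identity."""

    agent_id: str
    allow: tuple  # glob patterns the agent may touch
    deny: tuple = ()  # glob patterns that stay off-limits, even if allowed


@dataclass
class ScopedGate:
    """Checks each access against the scope and keeps an audit trail."""

    scope: AgentScope
    audit_log: list = field(default_factory=list)

    def check(self, resource: str) -> bool:
        denied = any(fnmatchcase(resource, p) for p in self.scope.deny)
        allowed = any(fnmatchcase(resource, p) for p in self.scope.allow)
        # Default deny: a resource must match an allow pattern and no deny pattern.
        decision = allowed and not denied
        self.audit_log.append((self.scope.agent_id, resource, decision))
        return decision


# Example: the agent may work anywhere in one feature branch except its secrets.
gate = ScopedGate(
    AgentScope(
        agent_id="agent-1",
        allow=("repo/feature-x/*",),
        deny=("repo/feature-x/secrets/*",),
    )
)
```

The design choice worth noting is that flexibility and boundaries live in the same object: a broad allow pattern leaves the agent room to work, while deny patterns and the default-deny rule keep mission-critical resources out of reach, and the audit log supplies the observability the paragraph above calls for.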

Footnotes

  1. On using games as a testing ground, DeepMind researchers argue that "learning to play even one video game is a technical feat for an AI system, but learning to follow instructions in a variety of game settings could unlock more helpful AI agents for any environment. Our research shows how we can translate the capabilities of advanced AI models into useful, real-world actions through a language interface."

  2. Learning to play Minecraft with Video PreTraining
