Computer use (beta)
The upgraded Claude 3.5 Sonnet model is capable of interacting with tools that can manipulate a computer desktop environment.
Computer use is a beta feature. Please be aware that computer use poses unique risks that are distinct from standard API features or chat interfaces. These risks are heightened when using computer use to interact with the internet. To minimize risks, consider taking precautions such as:
- Use a dedicated virtual machine or container with minimal privileges to prevent direct system attacks or accidents.
- Avoid giving the model access to sensitive data, such as account login information, to prevent information theft.
- Limit internet access to an allowlist of domains to reduce exposure to malicious content.
- Ask a human to confirm decisions that may result in meaningful real-world consequences as well as any tasks requiring affirmative consent, such as accepting cookies, executing financial transactions, or agreeing to terms of service.
In some circumstances, Claude will follow commands found in content even if it conflicts with the user’s instructions. For example, Claude instructions on webpages or contained in images may override instructions or cause Claude to make mistakes. We suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection.
Finally, please inform end users of relevant risks and obtain their consent prior to enabling computer use in your own products.
Computer use reference implementation
Get started quickly with our computer use reference implementation that includes a web interface, Docker container, example tool implementations, and an agent loop.
Please use this form to provide feedback on the quality of the model responses, the API itself, or the quality of the documentation - we cannot wait to hear from you!
Here’s an example of how to provide computer use tools to Claude using the Messages API:
How computer use works
1. Provide Claude with computer use tools and a user prompt
- Add Anthropic-defined computer use tools to your API request.
- Include a user prompt that might require these tools, e.g., “Save a picture of a cat to my desktop.”
2. Claude decides to use a tool
- Claude loads the stored computer use tool definitions and assesses if any tools can help with the user’s query.
- If yes, Claude constructs a properly formatted tool use request.
- The API response has a
stop_reason
oftool_use
, signaling Claude’s intent.
3. Extract tool input, evaluate the tool on a computer, and return results
- On your end, extract the tool name and input from Claude’s request.
- Use the tool on a container or Virtual Machine.
- Continue the conversation with a new
user
message containing atool_result
content block.
4. Claude continues calling computer use tools until it's completed the task
- Claude analyzes the tool results to determine if more tool use is needed or the task has been completed.
- If Claude decides it needs another tool, it responds with another
tool_use
stop_reason
and you should return to step 3. - Otherwise, it crafts a text response to the user.
We refer to the repetition of steps 3 and 4 without user input as the “agent loop” - i.e., Claude responding with a tool use request and your application responding to Claude with the results of evaluating that request.
How to implement computer use
Start with our reference implementation
We have built a reference implementation that includes everything you need to get started quickly with computer use:
- A containerized environment suitable for computer use with Claude
- Implementations of the computer use tools
- An agent loop that interacts with the Anthropic API and executes the computer use tools
- A web interface to interact with the container, agent loop, and tools.
We recommend trying the reference implementation out before reading the rest of this documentation.
Optimize model performance with prompting
Here are some tips on how to get the best quality outputs:
- Specify simple, well-defined tasks and provide explicit instructions for each step.
- Claude sometimes assumes outcomes of its actions without explicitly checking their results. To prevent this you can prompt Claude with
After each step, take a screenshot and carefully evaluate if you have achieved the right outcome. Explicitly show your thinking: "I have evaluated step X..." If not correct, try again. Only when you confirm a step was executed correctly should you move on to the next one.
- Some UI elements (like dropdowns and scrollbars) might be tricky for Claude to manipulate using mouse movements. If you experience this, try prompting the model to use keyboard shortcuts.
- For repeatable tasks or UI interactions, include example screenshots and tool calls of successful outcomes in your prompt.
- If you need the model to log in, provide it with the username and password in your prompt inside xml tags like
<robot_credentials>
. Using computer use within applications that require login increases the risk of bad outcomes as a result of prompt injection. Please review our guide on mitigating prompt injections before providing the model with login credentials.
If you repeatedly encounter a clear set of issues or know in advance the tasks Claude will need to complete, use the system prompt to provide Claude with explicit tips or instructions on how to do the tasks successfully.
System prompts
When one of the Anthropic-defined tools is requested via the Anthropic API, a computer use-specific system prompt is generated. It’s similar to the tool use system prompt but starts with:
You have access to a set of functions you can use to answer the user’s question. This includes access to a sandboxed computing environment. You do NOT currently have the ability to inspect files or interact with external resources, except by invoking the below functions.
As with regular tool use, the user-provided system_prompt
field is still respected and used in the construction of the combined system prompt.
Understand Anthropic-defined tools
As a beta, these tool definitions are subject to change.
We have provided a set of tools that enable Claude to effectively use computers. When specifying an Anthropic-defined tool, description
and tool_schema
fields are not necessary or allowed.
Anthropic-defined tools are user executed
Anthropic-defined tools are defined by Anthropic but you must explicitly evaluate the results of the tool and return the tool_results
to Claude. As with any tool, the model does not automatically execute the tool.
We currently provide 3 Anthropic-defined tools:
{ "type": "computer_20241022", "name": "computer" }
{ "type": "text_editor_20241022", "name": "str_replace_editor" }
{ "type": "bash_20241022", "name": "bash" }
The type
field identifies the tool and its parameters for validation purposes, the name
field is the tool name exposed to the model.
If you want to prompt the model to use one of these tools, you can explicitly refer the tool by the name
field. The name
field must be unique within the tool list; you cannot define a tool with the same name as an Anthropic-defined tool in the same API call.
We do not recommend defining tools with the names of Anthropic-defined tools. While you can still redefine tools with these names (as long as the tool name is unique in your tools
block), doing so may result in degraded model performance.
Combine computer use with other tools
You can combine regular tool use with the Anthropic-defined tools for computer use.
Build a custom computer use environment
The reference implementation is meant to help you get started with computer use. It includes all of the components needed have Claude use a computer. However, you can build your own environment for computer use to suit your needs. You’ll need:
- A virtualized or containerized environment suitable for computer use with Claude
- An implementation of at least one of the Anthropic-defined computer use tools
- An agent loop that interacts with the Anthropic API and executes the
tool_use
results using your tool implementations - An API or UI that allows user input to start the agent loop
Understand computer use limitations
The computer use functionality is in beta. While Claude’s capabilities are cutting edge, developers should be aware of its limitations:
- Latency: the current computer use latency for human-AI interactions may be too slow compared to regular human-directed computer actions. We recommend focusing on use cases where speed isn’t critical (e.g., background information gathering, automated software testing) in trusted environments.
- Computer vision accuracy and reliability: Claude may make mistakes or hallucinate when outputting specific coordinates while generating actions.
- Tool selection accuracy and reliability: Claude may make mistakes or hallucinate when selecting tools while generating actions or take unexpected actions to solve problems. Additionally, reliability may be lower when interacting with niche applications or multiple applications at once. We recommend that users prompt the model carefully when requesting complex tasks.
- Scrolling reliability: Scrolling may be unreliable in the current experience, and the model may not reliably scroll to the bottom of a page. Scrolling-like behavior can be improved via keystrokes (PgUp/PgDown).
- Spreadsheet interaction: Mouse clicks for spreadsheet interaction are unreliable. Cell selection may not always work as expected. This can be mitigated by prompting the model to use arrow keys.
- Account creation and content generation on social and communications platforms: While Claude will visit websites, we are limiting its ability to create accounts or generate and share content or otherwise engage in human impersonation across social media websites and platforms. We may update this capability in the future.
- Vulnerabilities: Vulnerabilities like jailbreaking or prompt injection may persist across frontier AI systems, including the beta computer use API. In some circumstances, Claude will follow commands found in content, sometimes even in conflict with the user’s instructions. For example, Claude instructions on webpages or contained in images may override instructions or cause Claude to make mistakes. We recommend: a. Limiting computer use to trusted environments such as virtual machines or containers with minimal privileges b. Avoiding giving computer use access to sensitive accounts or data without strict oversight c. Informing end users of relevant risks and obtaining their consent before enabling or requesting permissions necessary for computer use features in your applications
- Inappropriate or illegal actions: Per Anthropic’s terms of service, you must not employ computer use to violate any laws or our Acceptable Use Policy.
Always carefully review and verify Claude’s computer use actions and logs. Do not use Claude for tasks requiring perfect precision or sensitive user information without human oversight.
Pricing
See the tool use pricing documentation for a detailed explanation of how Claude Tool Use API requests are priced.
As a subset of tool use requests, computer use requests are priced the same as any other Claude API request.
We also automatically include a special system prompt for the model, which enables computer use.
Model | Tool choice | System prompt token count |
---|---|---|
Claude 3.5 Sonnet (new) | auto any , tool | 466 tokens 499 tokens |
In addition to the base tokens, the following additional input tokens are needed for the Anthropic-defined tools:
Tool | Additional input tokens |
---|---|
computer_20241022 | 683 tokens |
text_editor_20241022 | 700 tokens |
bash_20241022 | 245 tokens |