
Claude Code Computer Use: Let Claude Drive Your Desktop

Computer use lets Claude open apps, click, and fill forms on your Mac. Plan and OS requirements, how to enable it, realistic use cases, and the preview caveats.

jordan · June 2, 2026 · 7 min read

Claude Code can now take over your mouse and keyboard. On March 23, 2026, Anthropic shipped a research preview that lets Claude open apps, click buttons, type into fields, and read your screen through screenshots, all from inside a session you started in the terminal. Claude Code computer use is macOS-only in the CLI, gated to Pro and Max plans, and off by default. (Anthropic's announcement; Claude Code docs.)

That is the short version. The longer version matters more. Computer use is the kind of feature that ends up being either exactly what you needed or a security incident waiting to happen, and which one you get depends entirely on how you wire it up.

What computer use actually does

Until now, Claude Code worked through the parts of your machine that have a clean interface: the shell, the filesystem, MCP servers, the browser if you had Claude in Chrome connected. Computer use covers the rest. The GUI-only stuff. The native Swift app you just compiled, the iOS Simulator, a hardware control panel, some proprietary tool that ships no API and never will.

The flow is screenshot-driven. Claude takes a picture of your screen, decides where to click or what to type, sends a synthesized mouse or keyboard event, then takes another screenshot to check what happened. The docs give a concrete example that captures the appeal: Claude can "compile a Swift app, launch it, click through every button, and screenshot the result, all in the same conversation where it wrote the code." (docs).
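To make the look, act, look-again cycle concrete, here is a toy model of it. This is only a sketch: FakeScreen, run_loop, and check_all_boxes are hypothetical names invented for illustration, and the real feature works on screenshots of your actual display, not on a dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeScreen:
    """Stand-in for a real display: a dict of widget name -> checked state."""
    widgets: dict = field(default_factory=dict)

    def screenshot(self) -> dict:
        # A real screenshot is an image; here it is a copy of the state.
        return dict(self.widgets)

    def click(self, name: str) -> None:
        # Simulate a synthesized mouse event that toggles a checkbox.
        self.widgets[name] = not self.widgets.get(name, False)


def run_loop(screen: FakeScreen,
             decide: Callable[[dict], Optional[str]],
             max_steps: int = 10) -> int:
    """Observe, act, and observe again until decide() returns None."""
    steps = 0
    while steps < max_steps:
        observed = screen.screenshot()  # 1. look at the screen
        target = decide(observed)       # 2. choose what to click
        if target is None:              # nothing left to do
            break
        screen.click(target)            # 3. act; the next pass re-checks
        steps += 1
    return steps


def check_all_boxes(observed: dict) -> Optional[str]:
    """Hypothetical policy: click the first unchecked box, if any."""
    for name, checked in observed.items():
        if not checked:
            return name
    return None
```

The max_steps cap is the part worth copying into any loop like this: a policy that keeps misreading the screen would otherwise click forever.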

This is a last resort by design, not the first thing Claude reaches for. Anthropic's own ordering goes like this: if there is an MCP server for the task, Claude uses that; if it is a shell command, Bash; if it is browser work and you have Claude in Chrome set up, that. Only when nothing else reaches the target does it fall back to driving the screen. Screen control is slow and a little error-prone, so it is reserved for the things nothing else can touch.
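That ordering is a plain priority list, and it can be sketched in a few lines. The tool labels and pick_tool below are hypothetical and only model the order described above; this is not how Claude Code selects tools internally.

```python
from typing import Optional

# Priority order as described in the text: most direct tool first,
# screen control last. The string labels are illustrative only.
TOOL_ORDER = ["mcp", "bash", "chrome", "computer-use"]


def pick_tool(available: set) -> Optional[str]:
    """Return the highest-priority tool that can reach the task."""
    for tool in TOOL_ORDER:
        if tool in available:
            return tool
    return None  # nothing can reach the target at all
```

Read this way, computer use only wins when the set of tools that can reach the task contains nothing else.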

Does Claude Code support computer use, and on what

Yes, with three hard limits worth getting straight before you try.

Plan. Pro or Max only. The docs are explicit that it is not available on Team or Enterprise plans during the preview, and not through third-party providers like Amazon Bedrock, Google Cloud Vertex AI, or Microsoft Foundry. If you reach Claude exclusively through one of those, you need a separate claude.ai account to turn this on.

Operating system. The CLI version runs on macOS only. The Desktop app supports both macOS and Windows, so if you are on Windows your path is the Desktop app, not the terminal. Neither surface supports Linux.

Version. You need Claude Code v2.1.85 or later and an interactive session; computer use does not work in non-interactive mode (the -p flag). Run claude --version to check your version, and /status to confirm your subscription.
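If you want to script the version check, a minimal sketch might look like the following. It assumes the output of claude --version contains a dotted X.Y.Z number somewhere in it; the exact output format is an assumption, and the function names are made up for this example.

```python
import re
import subprocess
from typing import Optional, Tuple

MIN_VERSION = (2, 1, 85)  # minimum version stated in the text


def parse_version(text: str) -> Optional[Tuple[int, int, int]]:
    """Extract the first X.Y.Z number from a string, or None if absent."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def meets_minimum(text: str, minimum=MIN_VERSION) -> bool:
    """Compare numerically, so 2.10.0 correctly counts as newer than 2.1.85."""
    version = parse_version(text)
    return version is not None and version >= minimum


def installed_version_ok() -> bool:
    """Run `claude --version`; return False if the CLI is not on PATH."""
    try:
        result = subprocess.run(["claude", "--version"],
                                capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return False
    return meets_minimum(result.stdout)
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would rank "2.10.0" below "2.1.85".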

How to enable Claude Code computer use

Here is the part people search for, and it is short. Computer use ships as a built-in MCP server named computer-use, disabled until you switch it on. So yes, to answer the common question directly: Claude Code computer use is an MCP server. It is just one that comes bundled rather than one you install.

In an interactive session, run:

/mcp

Find computer-use in the list. It will show as disabled. Select it and choose Enable. That setting sticks per project, so you do this once per repo you want it in.

The first time Claude reaches for your screen, macOS throws two permission prompts. Accessibility lets it click, type, and scroll. Screen Recording lets it see what is on screen. Grant both, then hit Try again in the terminal prompt. macOS sometimes makes you fully quit and relaunch Claude Code after you grant Screen Recording, so if the prompt loops, restart the app and confirm your terminal is listed under System Settings > Privacy & Security > Screen Recording.

On the Desktop app the toggle lives in Settings > General, under the Desktop app section, instead of the /mcp menu.

If computer-use never shows up in /mcp, you have hit one of the gates above: wrong OS, a version below 2.1.85, a plan that is not Pro or Max, a third-party provider login, or a non-interactive session.
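Those five gates make a handy checklist when you are debugging a missing entry. The sketch below encodes them for the CLI; Setup and failed_gates are hypothetical names, and you would fill in the fields by hand from claude --version and /status, since nothing here queries Claude Code itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Setup:
    """Hypothetical description of one CLI environment."""
    os_name: str       # e.g. "macos", "windows", "linux"
    version: tuple     # e.g. (2, 1, 85)
    plan: str          # e.g. "pro", "max", "team", "enterprise"
    third_party: bool  # signed in via Bedrock, Vertex AI, or Foundry
    interactive: bool  # False when started with -p


def failed_gates(setup: Setup) -> List[str]:
    """Return every gate from the text that this CLI setup fails."""
    problems = []
    if setup.os_name != "macos":
        problems.append("CLI computer use is macOS-only")
    if setup.version < (2, 1, 85):
        problems.append("Claude Code is older than v2.1.85")
    if setup.plan not in {"pro", "max"}:
        problems.append("plan is not Pro or Max")
    if setup.third_party:
        problems.append("signed in through a third-party provider")
    if not setup.interactive:
        problems.append("session is non-interactive")
    return problems
```

An empty list means none of the documented gates explains the problem, which is a useful thing to know before you file a bug.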

Per-app approval, because enabling the server is not the end

Switching on the server does not hand Claude your whole machine. The first time it wants a specific app in a session, a terminal prompt names which app it wants, any extra access it needs (clipboard, for example), and how many other apps will be hidden while it works. You pick Allow for this session or Deny, and approvals last only for that session.

Some apps carry an extra warning when you approve them, and these are the ones to read slowly. Terminals and IDEs (Terminal, iTerm, VS Code, Warp) are flagged as "equivalent to shell access." Finder is flagged as able to read or write any file. System Settings is flagged as able to change system settings. None of these are blocked. The warning exists so you decide whether a given task is worth that reach.
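One way to hold the approval model in your head is as a per-session set plus a lookup table of warnings. The sketch below is a mental model only: SessionApprovals and WARNINGS are hypothetical, and the warning strings paraphrase the categories named above, not the exact prompt text.

```python
from typing import Optional

# Warning per app, paraphrasing the categories named in the text.
WARNINGS = {
    "Terminal": "equivalent to shell access",
    "iTerm": "equivalent to shell access",
    "VS Code": "equivalent to shell access",
    "Warp": "equivalent to shell access",
    "Finder": "can read or write any file",
    "System Settings": "can change system settings",
}


class SessionApprovals:
    """Approvals that live only as long as this object (one session)."""

    def __init__(self) -> None:
        self._allowed = set()

    def warning_for(self, app: str) -> Optional[str]:
        # A warning informs the decision; it does not block the app.
        return WARNINGS.get(app)

    def allow(self, app: str) -> None:
        self._allowed.add(app)

    def is_allowed(self, app: str) -> bool:
        return app in self._allowed
```

Because the allowed set belongs to the session object, a new session starts empty, which mirrors approvals lasting only for the session in which you granted them.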

Claude's level of control also varies by app type. Browsers and trading platforms are view-only. Terminals and IDEs are click-only. Everything else gets full control.
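The three tiers amount to a lookup with a default. In this sketch the category labels ("browser", "ide", and so on) are hypothetical; only the tier names and which kinds of app fall into them come from the text.

```python
from enum import Enum


class Control(Enum):
    VIEW_ONLY = "view-only"
    CLICK_ONLY = "click-only"
    FULL = "full"


# Category labels are illustrative; the tiers come from the text.
TIER_BY_CATEGORY = {
    "browser": Control.VIEW_ONLY,
    "trading": Control.VIEW_ONLY,
    "terminal": Control.CLICK_ONLY,
    "ide": Control.CLICK_ONLY,
}


def control_for(category: str) -> Control:
    """Anything not listed falls through to full control."""
    return TIER_BY_CATEGORY.get(category, Control.FULL)
```

The default is the detail to notice: restricted tiers are the exception, so an app you have never thought about gets full control once you approve it.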

While Claude works, it hides your other apps so it only ever interacts with the ones you approved. Your terminal window stays visible and is excluded from screenshots, which means Claude never reads its own output back as if it were a user instruction. A macOS notification reads "Claude is using your computer · press Esc to stop," and pressing Esc anywhere aborts immediately. Only one Claude session can hold the machine at a time, and a lock file enforces it.
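For readers curious how a lock file can enforce a single holder, here is the general technique in miniature. This is a generic sketch, not Anthropic's implementation: the lock path, its contents, and both function names are assumptions made for illustration.

```python
import os
import tempfile


def try_acquire(lock_path: str) -> bool:
    """Create the lock file atomically; return False if it already exists."""
    try:
        # O_EXCL makes creation fail if another holder got there first.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the holder for debugging
    return True


def release(lock_path: str) -> None:
    """Remove the lock so another session can take over."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass  # already released; nothing to do


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "example.lock")  # hypothetical path
        print(try_acquire(path))  # first session gets the lock
        print(try_acquire(path))  # second session is refused
        release(path)
```

A sketch this small ignores stale locks left behind by a crashed process, which any production version would need to handle.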

What it is good for right now

The honest set of use cases is narrow and developer-shaped, which fits the audience this preview was built for.

Validating a native build is the headline one. You change a macOS or iOS app, then tell Claude to build the target, launch it, click through the controls, and screenshot any error states. No Playwright config, no test scaffolding. End-to-end UI testing on an Electron app works the same way: tell it to "test the onboarding flow," and it opens the app, clicks through signup, and screenshots each step.

Visual bug reproduction is the one I would reach for first. Say you describe a layout bug that only shows up at certain window sizes ("the settings modal clips its footer on narrow windows"). Claude resizes the window until it reproduces the clip, screenshots the broken state, then reads the relevant CSS. It sees what you see, which is the part that has been missing from terminal-only agents.

Then there is driving GUI-only tools: the iOS Simulator, design apps, hardware control panels, anything with no CLI or API. If your repeatable desktop workflows are getting complicated enough that you want to hand them to an agent, that is also the moment they are worth writing down as a portable Agent Skill so any tool can run them the same way twice.

The safety part, which is not optional reading

Computer use runs on your real desktop, not in the sandbox that isolates the Bash tool. That is the whole risk in one sentence. Claude scans each action for prompt injection coming from on-screen content, but Anthropic says plainly that attacks are evolving and the safeguards are not perfect. (safety guide).

Their own guidance is to keep computer use away from financial accounts and investments, legal documents and contracts, medical or health data, and apps holding other people's personal information. Close sensitive files before a session starts, because every screenshot captures whatever is on screen. Investment, trading, and cryptocurrency platforms are blocked by default, and you can extend that with your own blocklist on the Desktop app. The CLI does not yet expose a denied-apps list.

The built-in guardrails do real work. You get per-app approval, sentinel warnings on shell, filesystem, and settings access, the terminal excluded from screenshots, a global Esc that is consumed so injected content cannot use it to dismiss dialogs, and the single-session lock. Start with apps you trust, on a project where a misclick would not cost you anything.

What "research preview" buys you, and what it costs

It means the feature works, and also that it will sometimes embarrass itself. Anthropic's framing: complex tasks sometimes need a second try, and working through your screen is slower than a direct integration. Your computer has to stay awake with Claude open for any of it to run. The capability is early relative to the rest of Claude, and the preview's reach (Pro and Max, macOS in the CLI) will likely widen, though Anthropic has not committed to a date for Windows in the CLI or for Team and Enterprise plans.

So here is my read. If you write Swift or ship a Mac app, turn it on this week and point it at your build. If you mostly live in the terminal and your tools all have APIs, you can wait, since the MCP-first ordering means computer use will rarely fire for you anyway. Either way, read the per-app warnings before you click Allow. That prompt is the actual trust boundary, and it is the one thing the model cannot decide for you.