Sandboxing — Agent Mastered

You create a “read-only” agent with --allowedTools "Read,Grep,Glob". It writes a file anyway. The --allowedTools flag is a preference, not an enforcement boundary — Claude fell back to Bash, which you never explicitly blocked. The OS sandbox is the layer that actually stops things at the kernel level.

Claude Code runs inside an OS-level sandbox that restricts filesystem and network access at the kernel layer. Even if a prompt injection tells Claude to read /etc/shadow or exfiltrate data over HTTP, the operating system blocks the operation before it reaches the filesystem or the network stack. This is not application-level filtering that Claude can reason around — it is enforcement by the kernel itself.

For anything running on untrusted code, use macOS Seatbelt or Docker. The sandbox operates at the kernel level — below anything Claude can influence — which is why it’s the one security layer that can’t be talked around.

Platform Support

Sandbox Backends by Platform

Platform	Backend	Mechanism	Status
macOS	Seatbelt (`sandbox-exec`)	Kernel-level sandbox profiles restricting syscalls, filesystem paths, and network	Fully supported
Linux	bubblewrap (`bwrap`)	Mount namespaces for filesystem isolation, network namespaces, process isolation	Fully supported
WSL2	bubblewrap (`bwrap`)	Same as Linux — WSL2 runs a real Linux kernel with full namespace support	Fully supported
WSL1	None	WSL1 translates syscalls to Windows NT; no Linux namespace support	Not supported
Native Windows	None	No Seatbelt or bubblewrap equivalent available	Not supported

Which should I use? Both work. macOS Seatbelt has been around for 10+ years. Linux namespaces offer finer control but need newer kernels (4.8+).

Filesystem Isolation

By default, the sandbox allows read-write access to exactly three locations:

cwd — the working directory where you launched Claude
~/.claude/ — Claude’s configuration and session data
System tmpdir — temporary files needed during execution

Everything else on the filesystem is blocked. If Claude attempts to read a file outside these directories, the OS kernel denies the syscall and returns a permission error. This holds true even if Claude uses Bash to attempt raw file reads — the sandbox sits below the shell.

# This fails -- /var/log/app is outside the sandbox
$ claude -p "Read /var/log/app/error.log" --output-format json

The response will contain a denial because /var/log/app is not within the allowed directory set.
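The default boundary described above can be modeled in a few lines of Python. This is only a sketch for reasoning about which paths fall inside the boundary; the real decision is made by the kernel sandbox, not by code like this. The names `default_roots`, `is_inside_sandbox`, and `extra_dirs` are hypothetical and do not come from Claude Code.

```python
import tempfile
from pathlib import Path


def default_roots(cwd: Path) -> list[Path]:
    """Return the three default read-write roots listed in this section."""
    return [
        cwd.resolve(),
        (Path.home() / ".claude").resolve(),
        Path(tempfile.gettempdir()).resolve(),
    ]


def is_inside_sandbox(target: str, cwd: Path, extra_dirs: tuple[str, ...] = ()) -> bool:
    """Model of the boundary: True if target resolves to or under an allowed root."""
    resolved = Path(target).resolve()
    roots = default_roots(cwd) + [Path(d).resolve() for d in extra_dirs]
    return any(resolved == root or root in resolved.parents for root in roots)
```

Calling `is_inside_sandbox("/var/log/app/error.log", Path.cwd())` models the failing read above; passing `extra_dirs=("/var/log/app",)` models what an extra allowed directory would change.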

Extending Access with --add-dir

The --add-dir flag punches a hole in the sandbox for a specific directory. You can use it multiple times to grant access to several paths:

# Grant access to two additional directories
$ claude -p "Analyze logs and check config" \
    --add-dir /var/log/app \
    --add-dir /etc/app-config \
    --output-format json

Key behaviors of --add-dir:

Paths must be absolute — relative paths are rejected
The path must exist at invocation time
Added directories get read-write access, the same level as cwd
The sandbox boundary becomes: cwd + all --add-dir paths + ~/.claude/ + system tmpdir
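If you assemble the command from a script, the rules in this list can be checked before Claude is ever launched. The sketch below is one possible pre-flight check, not part of Claude Code; `build_add_dir_args` and `build_command` are hypothetical helper names, and the command is only built, never executed.

```python
import os


def build_add_dir_args(dirs: list[str]) -> list[str]:
    """Validate directories against the rules above and return CLI arguments."""
    args: list[str] = []
    for d in dirs:
        if not os.path.isabs(d):
            raise ValueError(f"--add-dir path must be absolute: {d!r}")
        if not os.path.isdir(d):
            raise ValueError(f"--add-dir path must exist: {d!r}")
        args += ["--add-dir", d]
    return args


def build_command(prompt: str, dirs: list[str]) -> list[str]:
    """Assemble an argv list suitable for subprocess.run (not executed here)."""
    return ["claude", "-p", prompt, *build_add_dir_args(dirs), "--output-format", "json"]
```

Failing early in your own script gives a clearer error than a rejected flag deep inside a pipeline log.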

Gotcha

--add-dir grants full read-write access. If you need read-only access to an external directory, combine it with --disallowedTools "Write,Edit,Bash" to prevent modifications while still allowing reads. Bash belongs on that list because it can write files too.

Network Isolation

Filesystem isolation alone is not enough. Without network restrictions, a compromised agent could read SSH keys from ~/.ssh/ (which may be inside the allowed sandbox) and exfiltrate them over HTTP.

The sandbox can block network access entirely, and you can further restrict it at the tool level:

# Air-gapped agent -- no web access, no Bash escape
$ claude -p "Analyze this code offline" \
    --disallowedTools "WebFetch,WebSearch,Bash" \
    --permission-mode bypassPermissions \
    --output-format json

When web tools are blocked, Claude falls back to its training knowledge. The response comes from what the model already knows, not from live web data. Verify freshness accordingly.

Used by: financial services running Claude on sensitive data, healthcare orgs with HIPAA requirements, defense contractors.

Defense Layers

Safe unattended operation requires all three layers working together. No single mechanism is sufficient on its own:

Defense-in-Depth Layers

Layer	Mechanism	What It Blocks	What It Misses
Sandbox	OS-level filesystem and network isolation	Access to paths outside allowed directories, unauthorized network calls	Anything within allowed directories is fair game
Tool restrictions	`--allowedTools` and `--disallowedTools`	Specific tool usage (Write, Edit, Bash, WebFetch)	MCP tools bypass built-in tool restrictions
Hooks	PreToolUse / PostToolUse event handlers	Custom rules — regex on file paths, command auditing, API call logging	Only as strong as the rules you write

Consider a prompt-injection attack chain: an injected instruction tells Claude to cat ~/.ssh/id_rsa. The filesystem sandbox blocks the path if ~/.ssh/ is outside allowed directories. If the read somehow succeeds, the network sandbox blocks exfiltration. And even if both layers fail, a PreToolUse hook can match the path pattern and reject the operation. Each layer catches what the previous one might miss.

The --dangerously-skip-permissions flag skips all permission prompts, but hooks still fire. This makes hooks the last line of defense in fully automated pipelines:

Permission check flow with --dangerously-skip-permissions:
  1. Tool requested  --> permission check --> SKIPPED
  2. PreToolUse hook --> STILL FIRES     --> can block execution
  3. Tool executes   --> sandbox          --> STILL ACTIVE
  4. PostToolUse hook --> STILL FIRES    --> can audit results
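A PreToolUse hook of the kind described here might look like the following sketch. It assumes the hook receives a JSON object on stdin with a `tool_input` field, where file tools carry `file_path` and Bash carries `command`; those field names are assumptions to verify against the hooks documentation. The deny patterns and the `decide` helper are hypothetical. Exit code 2 is the blocking signal this section refers to.

```python
import json
import re
import sys

# Hypothetical deny patterns; extend them to cover your own sensitive paths.
DENY_PATTERNS = [
    re.compile(r"(^|/)\.ssh(/|$)"),
    re.compile(r"^/etc/shadow$"),
]


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message). Exit code 2 asks Claude Code to block the call."""
    tool_input = event.get("tool_input", {})
    # Assumed field names: "file_path" for file tools, "command" for Bash.
    candidates = [str(tool_input.get("file_path", "")), str(tool_input.get("command", ""))]
    for text in candidates:
        for token in text.split():
            if any(p.search(token) for p in DENY_PATTERNS):
                return 2, f"Blocked by hook: {token} matches a deny pattern"
    return 0, ""


def main() -> None:
    """Entry point when installed as a hook script: read the event, exit with the verdict."""
    event = json.load(sys.stdin)
    code, message = decide(event)
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)
```

To use it as a hook script, call `main()` at the bottom of the file. Matching whitespace-separated tokens is deliberately simple and easy to evade with quoting or variables, which is why the table above says hooks are only as strong as the rules you write.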

▸ Try This

Test the sandbox yourself. Try to read a file outside your working directory:

claude -p "Read /etc/passwd" --output-format json | jq '.result'

The sandbox should block it. Now try claude -p "Read /etc/passwd" --add-dir /etc --output-format json | jq '.result'. What changed? This is the --add-dir flag punching a hole in the sandbox.

Proof: The allowedTools Bypass

The most dangerous misconception in Claude Code security is that --allowedTools alone creates a read-only agent. It does not. Here is the proof — a supposedly read-only agent successfully writing a file:


# WRONG: This is NOT read-only
$ claude -p "Write hello to /tmp/outside_test.txt" \
    --allowedTools "Read,Grep,Glob" \
    --permission-mode bypassPermissions \
    --output-format json

allowedTools Bypass -- Write Succeeded (artifacts/13/readonly_write_blocked.json)

{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 6800,
  "duration_api_ms": 6732,
  "num_turns": 2,
  "result": "Done. Wrote \"hello\" to `/tmp/outside_test.txt`.",
  "stop_reason": "end_turn",
  "session_id": "98023423-42d3-42ea-9ba4-14be27ac1400",
  "total_cost_usd": 0.027829,
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 1526,
    "cache_read_input_tokens": 29893,
    "output_tokens": 133,
    "server_tool_use": {
      "web_search_requests": 0,
      "web_fetch_requests": 0
    }
  },
  "permission_denials": []
}

A ("result"): The write succeeded -- a "read-only" agent wrote a file
B ("total_cost_usd"): Together with num_turns: 2, the cost is consistent with a tool call having run, not just a text response
C ("permission_denials"): Empty -- Bash was never explicitly blocked, so no denial was recorded

Claude fell back to Bash (which was not in --allowedTools but was not explicitly blocked either) and ran echo "hello" > /tmp/outside_test.txt. The --allowedTools flag is a preference, not an enforcement boundary. The --disallowedTools deny list is what actually blocks tools.

Security layers in the wrong command above: sandbox active, allowedTools active, disallowedTools NOT SET.

The correct pattern for a true read-only agent:

# RIGHT: Both allowedTools AND disallowedTools
$ claude -p "Analyze this codebase for security issues" \
    --allowedTools "Read,Grep,Glob" \
    --disallowedTools "Write,Edit,Bash,WebFetch,WebSearch" \
    --permission-mode bypassPermissions \
    --output-format json
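When these flags are generated by a script, a small helper can make it hard to forget the deny list. The sketch below is one way to do that under stated assumptions: `RISKY_TOOLS` simply mirrors the deny list in the command above, and `readonly_flags` is a hypothetical name. It does not cover MCP tools, which the defense-layers table notes are outside built-in tool restrictions.

```python
# Tools this section places on the deny list for a read-only agent.
RISKY_TOOLS = ["Write", "Edit", "Bash", "WebFetch", "WebSearch"]


def readonly_flags(allowed: list[str]) -> list[str]:
    """Build both flags together; refuse allow lists that contain a risky tool."""
    overlap = sorted(set(allowed) & set(RISKY_TOOLS))
    if overlap:
        raise ValueError(f"not read-only, allow list contains: {overlap}")
    return [
        "--allowedTools", ",".join(allowed),
        "--disallowedTools", ",".join(RISKY_TOOLS),
    ]
```

Always emitting the two flags as a pair means the allow list can never ship without its matching deny list.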

Air-Gapped Response

When web tools are blocked, Claude responds entirely from training knowledge. This payload shows what that looks like:

Air-Gapped Agent -- Training Knowledge Only (artifacts/13/airgapped_test.json)

{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 24081,
  "duration_api_ms": 24046,
  "num_turns": 2,
  "result": "## Claude Code\n\nClaude Code is Anthropic's command-line tool and programmable agent runtime...",
  "stop_reason": "end_turn",
  "session_id": "121a5238-d1a2-4f7d-b56a-4e9c8ec673f6",
  "total_cost_usd": 0.117647,
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 13460,
    "cache_read_input_tokens": 12453,
    "output_tokens": 564,
    "server_tool_use": {
      "web_search_requests": 0,
      "web_fetch_requests": 0
    }
  },
  "permission_denials": []
}

A ("duration_ms"): 24 seconds -- longer than usual because Claude generated a detailed response from memory
B ("result"): Response from training knowledge, not live web data
C ("total_cost_usd"): Higher cost reflects the longer generated output (564 output tokens)
D ("web_search_requests", "web_fetch_requests"): Zero web requests confirm the air-gap held

Notice web_search_requests: 0 and web_fetch_requests: 0. Claude did not attempt any web access — it recognized the tools were unavailable and produced the answer from what it already knew.
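In a pipeline, the same fields can be checked automatically instead of by eye. The sketch below parses a result payload shaped like the ones in this section; `audit_result` is a hypothetical helper, and it reads only keys that appear in those payloads.

```python
import json


def audit_result(payload: str) -> dict:
    """Summarize the fields of a JSON result that this section inspects."""
    data = json.loads(payload)
    web = data.get("usage", {}).get("server_tool_use", {})
    return {
        "ok": data.get("is_error") is False,
        "web_requests": web.get("web_search_requests", 0) + web.get("web_fetch_requests", 0),
        "denials": len(data.get("permission_denials", [])),
        "turns": data.get("num_turns"),
    }
```

Treat the summary as a signal, not proof: the bypass payload earlier in this section also shows zero web requests and no denials, yet a file was written.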

Gotcha

The sandbox is enforced at the OS kernel level. Prompt injection cannot escape it. Even if an attacker crafts a prompt that instructs Claude to cat /etc/shadow or curl https://evil.com, the kernel blocks the syscall before it executes. No amount of clever prompting gets past a denied syscall.

Gotcha

WSL1 is not supported. WSL1 translates Linux syscalls to Windows NT kernel calls and does not provide the Linux namespace isolation that bubblewrap requires. If you are on Windows, use WSL2 (which runs a real Linux kernel) or a Docker container with a Linux image.

Note

Hook exit 2 survives --dangerously-skip-permissions. A PreToolUse hook that exits with code 2 blocks tool execution even when --dangerously-skip-permissions is active. Hooks are the only enforcement layer that cannot be bypassed by any flag. The security stack is: hooks (unbypassable) > sandbox (kernel-level) > permissions (flag-bypassable).

`--allow-dangerously-skip-permissions` vs `--dangerously-skip-permissions`

These flags look similar but serve fundamentally different purposes. One is a single-key bypass; the other is a two-key safety mechanism.

Permission Bypass Comparison

Flag	Alone	With `--permission-mode bypassPermissions`	Use Case
`--dangerously-skip-permissions`	Bypass active immediately	N/A (already active)	Quick scripts, trusted environments
`--allow-dangerously-skip-permissions`	No bypass (capability enabled only)	Bypass active (both keys required)	CI/CD with explicit activation

The two-key pattern makes --allow-dangerously-skip-permissions safer for CI/CD pipelines:

# Single-key bypass (one flag does it all)
claude -p "Deploy" --dangerously-skip-permissions

# Two-key bypass (both flags required)
claude -p "Deploy" \
  --allow-dangerously-skip-permissions \
  --permission-mode bypassPermissions
# Removing EITHER flag blocks the bypass

Without --permission-mode bypassPermissions, the --allow-dangerously-skip-permissions flag does nothing — the default permission mode applies normally. This is intentional: it lets pipeline templates include the allow flag while requiring explicit activation in the specific job step.

Gotcha

--allow-dangerously-skip-permissions alone does NOT bypass permissions. Read that again. It enables the capability but does not activate it. You must pair it with --permission-mode bypassPermissions, a two-key launch sequence. This is intentional: it prevents accidental bypasses in scripts that set the allow flag broadly. If you’re confused by the naming, you’re not alone, but the two-key pattern is what makes it safe for CI/CD templates.

Tip

Settings.json deny rules survive even with bypass active. A "deny": ["Write"] rule removes the Write tool from Claude’s toolset entirely: Claude never sees it, regardless of permission mode. Deny rules are enforced at the tool-loading stage, before permission checks occur.

Known Security Vulnerabilities

Published security research has identified real vulnerabilities in Claude Code’s sandboxing and configuration system. This is why we recommend defense in depth: no single layer is perfect.

CVE-2025-59536 / CVE-2026-21852 — RCE via Project Files: Malicious .claude/ project configurations (hooks, MCP servers, environment variables) can achieve remote code execution and API token exfiltration. An attacker who controls a repository’s .claude/ directory can execute arbitrary code when a victim opens the project with Claude Code. Mitigations: review .claude/ contents before opening untrusted repos, use managed settings to restrict hook and MCP sources.

Denylist Bypass via /proc/self/root/: On Linux, Claude can bypass its own denylist by using /proc/self/root/usr/bin/npx to resolve to the same binary without matching the deny pattern. The deny rules use string matching, not path resolution, creating a gap for symlink-based bypasses. Mitigation: combine deny rules with OS-level sandbox restrictions.
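The gap between string matching and path resolution is easy to reproduce outside Claude Code. The sketch below is a simplified model, not Claude Code's actual deny-rule implementation: it builds a throwaway directory in which a symlinked directory stands in for /proc/self/root, then compares a string-only check with one that resolves paths first. All function names are hypothetical.

```python
import os
import tempfile


def naive_denied(command_path: str, deny_paths: list[str]) -> bool:
    """String matching only: misses other names for the same file."""
    return command_path in deny_paths


def resolved_denied(command_path: str, deny_paths: list[str]) -> bool:
    """Resolve symlinks on both sides before comparing."""
    real = os.path.realpath(command_path)
    return any(real == os.path.realpath(p) for p in deny_paths)


def demo() -> tuple[bool, bool]:
    """Return (naive verdict, resolved verdict) for an aliased path to a denied binary."""
    root = tempfile.mkdtemp()
    real_bin = os.path.join(root, "bin")
    os.mkdir(real_bin)
    tool = os.path.join(real_bin, "npx")
    open(tool, "w").close()  # empty stand-in for the denied binary
    alias_dir = os.path.join(root, "alias")
    os.symlink(real_bin, alias_dir)  # plays the role of /proc/self/root
    aliased = os.path.join(alias_dir, "npx")
    return naive_denied(aliased, [tool]), resolved_denied(aliased, [tool])
```

Here `demo()` returns `(False, True)`: the string check lets the aliased path through, while the resolving check catches it. Resolution in user space still leaves room for races, which is one reason the mitigation above pairs deny rules with the OS-level sandbox.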

ToxicSkills — Malicious Agent Skills: Research found that 36% of community-created agent skills contain security flaws, with 13.4% at critical severity. There is no official vetting system for skills. Mitigation: audit skills before installation, prefer skills from trusted sources, use --disable-slash-commands in sensitive environments.

Prompt Injection via MCP Tool Outputs: MCP tool responses are a documented attack vector for prompt injection. A malicious MCP server can return crafted tool outputs that manipulate Claude’s behavior. Mitigation: use --strict-mcp-config with explicitly trusted servers only, implement PostToolUse hooks to validate MCP outputs.

Review .claude/ Before Opening Untrusted Repos

Published CVEs demonstrate that malicious .claude/ directories can achieve remote code execution. Before opening any untrusted repository with Claude Code, inspect .claude/settings.json (hooks, permissions), .mcp.json (MCP servers), and any hook scripts. Managed settings with allowManagedPermissionRulesOnly: true can enforce organizational policies that override malicious project configs.
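A quick inventory script can make that inspection routine. The sketch below lists the commands a repository would register, so that a human can read them before opening the project. It assumes hooks are stored under a `hooks` key in .claude/settings.json, environment variables under `env`, and MCP servers under `mcpServers` in .mcp.json. Those layouts are assumptions to check against the current documentation, and `review_repo` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_json(path: Path) -> dict:
    """Return parsed JSON, or an empty dict if the file is missing or malformed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def review_repo(repo: str) -> dict:
    """Collect hook commands, env overrides, and MCP launch commands for manual review."""
    root = Path(repo)
    settings = load_json(root / ".claude" / "settings.json")
    mcp = load_json(root / ".mcp.json")

    hook_commands = []
    # Assumed layout: {"hooks": {"<Event>": [{"hooks": [{"command": "..."}]}]}}
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            for hook in entry.get("hooks", []):
                hook_commands.append((event, hook.get("command", "")))

    # Assumed layout: {"mcpServers": {"<name>": {"command": "...", "args": [...]}}}
    servers = {
        name: [cfg.get("command", ""), *cfg.get("args", [])]
        for name, cfg in mcp.get("mcpServers", {}).items()
    }
    return {
        "hook_commands": hook_commands,
        "mcp_servers": servers,
        "env": settings.get("env", {}),
    }
```

The script only reads and reports. It makes no judgment about whether a command is malicious, and it does not look at other settings files or at the hook scripts themselves, which still need a manual read.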

→ Now Do This

Test your sandbox right now: claude -p "What is in /etc/shadow?" --output-format json | jq '.result'. If the sandbox blocks it, you’re protected. If it doesn’t, you’re running without sandboxing. Check the platform support table above and fix it before running unattended agents.
