Sandboxing

OS-level isolation, Docker containers, and the allowedTools gotcha

[Figure: Sandbox layers. Four nested isolation boundaries, from Docker container to allowedTools. Layer labels: Docker / VM, macOS Sandbox, Permission Mode, allowedTools.]

You create a “read-only” agent with --allowedTools "Read,Grep,Glob". It writes a file anyway. The --allowedTools flag is a preference, not an enforcement boundary — Claude fell back to Bash, which you never explicitly blocked. The OS sandbox is the layer that actually stops things at the kernel level.

Claude Code runs inside an OS-level sandbox that restricts filesystem and network access at the kernel layer. Even if a prompt injection tells Claude to read /etc/shadow or exfiltrate data over HTTP, the operating system blocks the operation before it reaches the filesystem or the network stack. This is not application-level filtering that Claude can reason around — it is enforcement by the kernel itself.

Platform Support

Sandbox Backends by Platform

| Platform | Backend | Mechanism | Status |
| --- | --- | --- | --- |
| macOS | Seatbelt (sandbox-exec) | Kernel-level sandbox profiles restricting syscalls, filesystem paths, and network | Fully supported |
| Linux | bubblewrap (bwrap) | Mount namespaces for filesystem isolation, network namespaces, process isolation | Fully supported |
| WSL2 | bubblewrap (bwrap) | Same as Linux; WSL2 runs a real Linux kernel with full namespace support | Fully supported |
| WSL1 | None | WSL1 translates syscalls to Windows NT; no Linux namespace support | Not supported |
| Native Windows | None | No Seatbelt or bubblewrap equivalent available | Not supported |
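
The table can be turned into a preflight check for scripts that launch unattended agents. What follows is a minimal sketch, not part of Claude Code: the function name sandbox_backend_for is hypothetical, and the WSL detection is a heuristic that assumes WSL2 kernels report a release string containing "WSL2" while WSL1 kernels report one containing "Microsoft". Custom kernels may not follow that convention.

```shell
#!/bin/sh
# Hypothetical preflight helper: maps `uname -s` / `uname -r` output to the
# sandbox backend named in the platform table. The WSL detection is a
# heuristic based on kernel release strings, not an official API.
sandbox_backend_for() {
  os_name=$1      # e.g. output of `uname -s`
  os_release=$2   # e.g. output of `uname -r`
  case "$os_name" in
    Darwin)
      echo "seatbelt" ;;
    Linux)
      case "$os_release" in
        *WSL2*)                  echo "bubblewrap" ;;   # WSL2: real Linux kernel
        *Microsoft*|*microsoft*) echo "unsupported" ;;  # WSL1: syscall translation
        *)                       echo "bubblewrap" ;;   # ordinary Linux
      esac ;;
    *)
      echo "unsupported" ;;                             # e.g. native Windows shells
  esac
}

# Report the backend expected on the current machine.
sandbox_backend_for "$(uname -s)" "$(uname -r)"
```

A pipeline could refuse to start an unattended run when the function prints "unsupported".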

Filesystem Isolation

By default, the sandbox allows read-write access to exactly three locations:

  • cwd — the working directory where you launched Claude
  • ~/.claude/ — Claude’s configuration and session data
  • System tmpdir — temporary files needed during execution

Everything else on the filesystem is blocked. If Claude attempts to read a file outside these directories, the OS kernel denies the syscall and returns a permission error. This holds true even if Claude uses Bash to attempt raw file reads — the sandbox sits below the shell.

Terminal window
# This fails -- /var/log/app is outside the sandbox
$ claude -p "Read /var/log/app/error.log" --output-format json

The response will contain a denial because /var/log/app is not within the allowed directory set.

Extending Access with --add-dir

The --add-dir flag punches a hole in the sandbox for a specific directory. You can use it multiple times to grant access to several paths:

Terminal window
# Grant access to two additional directories
$ claude -p "Analyze logs and check config" \
--add-dir /var/log/app \
--add-dir /etc/app-config \
--output-format json

Key behaviors of --add-dir:

  • Paths must be absolute — relative paths are rejected
  • The path must exist at invocation time
  • Added directories get read-write access, the same level as cwd
  • The sandbox boundary becomes: cwd + all --add-dir paths + ~/.claude/ + system tmpdir
Gotcha

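--add-dir grants full read-write access. If you need read-only access to an external directory, combine it with --disallowedTools "Write,Edit,Bash" to prevent modifications while still allowing reads. Bash belongs on the deny list because, as the allowedTools bypass proof later on this page shows, Claude can fall back to shell redirection when only the file-writing tools are unavailable.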
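
The first two --add-dir rules (absolute path, must exist at invocation time) are easy to check before the command runs. This is a minimal sketch of such a check; valid_add_dir is a hypothetical helper name, and the check mirrors the rules as stated here rather than the CLI's own validation.

```shell
#!/bin/sh
# Hypothetical wrapper helper: checks a path against the two --add-dir rules
# listed above (absolute, and existing at invocation time) before it is used.
valid_add_dir() {
  case "$1" in
    /*) ;;                                    # absolute path: keep checking
    *)  echo "not absolute: $1" >&2
        return 1 ;;
  esac
  if [ ! -d "$1" ]; then
    echo "does not exist: $1" >&2
    return 1
  fi
  return 0
}

# Example: only paths that pass both checks would be handed to --add-dir.
for dir in /tmp ./logs; do
  if valid_add_dir "$dir" 2>/dev/null; then
    echo "ok: $dir"
  else
    echo "rejected: $dir"
  fi
done
```

Failing early in the wrapper gives a clearer error than a rejected flag in the middle of a pipeline run.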

Network Isolation

Filesystem isolation alone is not enough. Without network restrictions, a compromised agent could read SSH keys from ~/.ssh/ (which may be inside the allowed set, for example when Claude is launched from your home directory) and exfiltrate them over HTTP.

The sandbox can block network access entirely, and you can further restrict it at the tool level:

Terminal window
# Air-gapped agent -- no web access, no Bash escape
$ claude -p "Analyze this code offline" \
--disallowedTools "WebFetch,WebSearch,Bash" \
--permission-mode bypassPermissions \
--output-format json

When web tools are blocked, Claude falls back to its training knowledge. The response comes from what the model already knows, not from live web data. Verify freshness accordingly.

Defense Layers

Safe unattended operation requires all three layers working together. No single mechanism is sufficient on its own:

Defense-in-Depth Layers

| Layer | Mechanism | What It Blocks | What It Misses |
| --- | --- | --- | --- |
| Sandbox | OS-level filesystem and network isolation | Access to paths outside allowed directories, unauthorized network calls | Anything within allowed directories is fair game |
| Tool restrictions | --allowedTools and --disallowedTools | Specific tool usage (Write, Edit, Bash, WebFetch) | MCP tools bypass built-in tool restrictions |
| Hooks | PreToolUse / PostToolUse event handlers | Custom rules: regex on file paths, command auditing, API call logging | Only as strong as the rules you write |

Consider a prompt-injection attack chain: an injected instruction tells Claude to cat ~/.ssh/id_rsa. The filesystem sandbox blocks the path if ~/.ssh/ is outside allowed directories. If the read somehow succeeds, the network sandbox blocks exfiltration. And even if both layers fail, a PreToolUse hook can match the path pattern and reject the operation. Each layer catches what the previous one might miss.
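
The hook step in that chain could look roughly like the sketch below. It rests on two assumptions: the hook receives the pending tool call as a JSON payload on stdin, and a non-zero exit code of 2 is the signal that blocks the call. The function name check_payload and the pattern list are illustrative only. Rather than parsing fields, the sketch scans the raw payload text, so it errs on the side of blocking.

```shell
#!/bin/sh
# Sketch of a PreToolUse hook body. Assumption: the pending tool call arrives
# as JSON text; this function receives that text as its first argument.
check_payload() {
  case "$1" in
    *.ssh/*|*/etc/shadow*|*id_rsa*)
      echo "blocked: payload references a sensitive path" >&2
      return 2 ;;    # exit code 2 = block the tool call
    *)
      return 0 ;;    # anything else is allowed through
  esac
}

# When installed as the hook script, the final lines would be:
#   check_payload "$(cat)"
#   exit $?

# Demo with two hand-written payloads (the field names are assumptions).
benign='{"tool_name":"Read","tool_input":{"file_path":"/work/README.md"}}'
attack='{"tool_name":"Bash","tool_input":{"command":"cat ~/.ssh/id_rsa"}}'
check_payload "$benign" && echo "benign payload allowed"
check_payload "$attack" 2>/dev/null || echo "attack payload blocked (exit $?)"
```

Matching on raw text is deliberately crude. It will also block harmless calls that merely mention one of the patterns, which is usually the right trade-off for a last line of defense.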

The --dangerously-skip-permissions flag skips all permission prompts, but hooks still fire. This makes hooks the last line of defense in fully automated pipelines:

Permission check flow with --dangerously-skip-permissions:
1. Tool requested --> permission check --> SKIPPED
2. PreToolUse hook --> STILL FIRES --> can block execution
3. Tool executes --> sandbox --> STILL ACTIVE
4. PostToolUse hook --> STILL FIRES --> can audit results
Try This

Test the sandbox yourself. Try to read a file outside your working directory:

claude -p "Read /etc/passwd" --output-format json | jq '.result'

The sandbox should block it. Now try claude -p "Read /etc/passwd" --add-dir /etc --output-format json | jq '.result'. What changed? This is the --add-dir flag punching a hole in the sandbox.

Proof: The allowedTools Bypass

The most dangerous misconception in Claude Code security is that --allowedTools alone creates a read-only agent. It does not. Here is the proof — a supposedly read-only agent successfully writing a file:

Terminal window
# WRONG: This is NOT read-only
$ claude -p "Write hello to /tmp/outside_test.txt" \
--allowedTools "Read,Grep,Glob" \
--permission-mode bypassPermissions \
--output-format json
allowedTools Bypass -- Write Succeeded (artifacts/13/readonly_write_blocked.json)
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 6800,
  "duration_api_ms": 6732,
  "num_turns": 2,
  "result": "Done. Wrote \"hello\" to `/tmp/outside_test.txt`.",
  "stop_reason": "end_turn",
  "session_id": "98023423-42d3-42ea-9ba4-14be27ac1400",
  "total_cost_usd": 0.027829,
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 1526,
    "cache_read_input_tokens": 29893,
    "output_tokens": 133,
    "server_tool_use": {
      "web_search_requests": 0,
      "web_fetch_requests": 0
    }
  },
  "permission_denials": []
}
  • result: The write succeeded -- a 'read-only' agent wrote a file
  • total_cost_usd: Cost confirms a tool was executed, not just a text response
  • permission_denials: Empty -- Bash was never explicitly blocked, so no denial was recorded

Claude fell back to Bash (which was not in --allowedTools but was not explicitly blocked either) and ran echo "hello" > /tmp/outside_test.txt. The --allowedTools flag is a preference, not an enforcement boundary. The --disallowedTools deny list is what actually blocks tools.

Security walkthrough: The allowedTools Bypass
Step 1: The Setup
The wrong command:
$ claude -p "Write hello..." \
--allowedTools "Read,Grep,Glob"
Security layers:
  • sandbox: active
  • allowedTools: active
  • disallowedTools: NOT SET

The correct pattern for a true read-only agent:

Terminal window
# RIGHT: Both allowedTools AND disallowedTools
$ claude -p "Analyze this codebase for security issues" \
--allowedTools "Read,Grep,Glob" \
--disallowedTools "Write,Edit,Bash,WebFetch,WebSearch" \
--permission-mode bypassPermissions \
--output-format json
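
Because the mistake is easy to repeat, a pipeline can lint its own arguments before launching the agent. The sketch below is one way to do that under stated assumptions: readonly_flags_ok is a hypothetical helper, it only inspects the argument list, and it treats Write, Edit, and Bash as the minimum deny set for a read-only agent, following the pattern shown above.

```shell
#!/bin/sh
# Hypothetical lint for pipeline scripts: given the arguments intended for
# `claude`, fail when an --allowedTools list is not paired with a
# --disallowedTools list that names Write, Edit, and Bash.
readonly_flags_ok() {
  has_allow=no
  deny_list=""
  prev=""
  for arg in "$@"; do
    if [ "$arg" = "--allowedTools" ]; then has_allow=yes; fi
    if [ "$prev" = "--disallowedTools" ]; then deny_list=$arg; fi
    prev=$arg
  done
  if [ "$has_allow" = no ]; then
    return 0                       # no allow list, nothing to pair
  fi
  for tool in Write Edit Bash; do
    case ",$deny_list," in
      *",$tool,"*) ;;              # tool is on the deny list
      *) echo "missing --disallowedTools entry: $tool" >&2
         return 1 ;;
    esac
  done
  return 0
}

# The paired form passes; the allow list on its own is rejected.
readonly_flags_ok -p "Analyze" --allowedTools "Read,Grep,Glob" \
  --disallowedTools "Write,Edit,Bash,WebFetch,WebSearch" && echo "paired: ok"
readonly_flags_ok -p "Analyze" --allowedTools "Read,Grep,Glob" 2>/dev/null \
  || echo "allow list alone: rejected"
```

The lint checks only the flags. It does not replace the sandbox or hooks, and it assumes the deny list is written without spaces after the commas.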

Air-Gapped Response

When web tools are blocked, Claude responds entirely from training knowledge. This payload shows what that looks like:

Air-Gapped Agent -- Training Knowledge Only (artifacts/13/airgapped_test.json)
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 24081,
  "duration_api_ms": 24046,
  "num_turns": 2,
  "result": "## Claude Code\n\nClaude Code is Anthropic's command-line tool and programmable agent runtime...",
  "stop_reason": "end_turn",
  "session_id": "121a5238-d1a2-4f7d-b56a-4e9c8ec673f6",
  "total_cost_usd": 0.117647,
  "usage": {
    "input_tokens": 4,
    "cache_creation_input_tokens": 13460,
    "cache_read_input_tokens": 12453,
    "output_tokens": 564,
    "server_tool_use": {
      "web_search_requests": 0,
      "web_fetch_requests": 0
    }
  },
  "permission_denials": []
}
  • duration_ms: 24 seconds -- longer than usual because Claude generated a detailed response from memory
  • result: Response from training knowledge, not live web data
  • usage: Higher cost reflects the longer generated output (564 tokens)
  • web_fetch_requests: Zero web requests confirms the air-gap held

Notice web_search_requests: 0 and web_fetch_requests: 0. Claude did not attempt any web access — it recognized the tools were unavailable and produced the answer from what it already knew.
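
The same check can be automated against a saved result file. The sketch below sums the two counters by scraping the text, which keeps it free of dependencies; web_requests_total is a hypothetical helper name. jq, which this page already uses, would be the more robust choice where it is installed, since text scraping can be fooled by a result string that happens to contain the same key names.

```shell
#!/bin/sh
# Hypothetical check: sums web_search_requests and web_fetch_requests in a
# saved result file. It splits the JSON text on commas and braces, then adds
# up the numbers that follow the two key names.
web_requests_total() {
  tr ',{}' '\n\n\n' < "$1" |
    awk -F: '/"web_(search|fetch)_requests"/ { total += $2 }
             END { print total + 0 }'
}

# Demo with a trimmed-down payload written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"usage": {"server_tool_use": {"web_search_requests": 0, "web_fetch_requests": 0}}}
EOF
if [ "$(web_requests_total "$sample")" -eq 0 ]; then
  echo "air gap held: no web requests recorded"
fi
rm -f "$sample"
```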

Gotcha

The sandbox is enforced at the OS kernel level. Prompt injection cannot escape it. Even if an attacker crafts a prompt that instructs Claude to cat /etc/shadow or curl https://evil.com, the kernel blocks the syscall before it executes. This is the single most important property of the sandboxing system.

Gotcha

WSL1 is not supported. WSL1 translates Linux syscalls to Windows NT kernel calls and does not provide the Linux namespace isolation that bubblewrap requires. If you are on Windows, use WSL2 (which runs a real Linux kernel) or a Docker container with a Linux image.

Note

Hook exit 2 survives --dangerously-skip-permissions. Experimentally confirmed: a PreToolUse hook that exits with code 2 blocks tool execution even when --dangerously-skip-permissions is active. Hooks are the only enforcement layer that cannot be bypassed by any flag. The security stack is: hooks (unbypassable) > sandbox (kernel-level) > permissions (flag-bypassable).

--allow-dangerously-skip-permissions vs --dangerously-skip-permissions

These flags look similar but serve fundamentally different purposes. One is a single-key bypass; the other is a two-key safety mechanism.

Permission Bypass Comparison

| Flag | Alone | With --permission-mode bypassPermissions | Use Case |
| --- | --- | --- | --- |
| --dangerously-skip-permissions | Bypass active immediately | N/A (already active) | Quick scripts, trusted environments |
| --allow-dangerously-skip-permissions | No bypass; capability enabled only | Bypass active (both keys required) | CI/CD with explicit activation |

The two-key pattern makes --allow-dangerously-skip-permissions safer for CI/CD pipelines:

Terminal window
# Single-key bypass (one flag does it all)
claude -p "Deploy" --dangerously-skip-permissions
# Two-key bypass (both flags required)
claude -p "Deploy" \
--allow-dangerously-skip-permissions \
--permission-mode bypassPermissions
# Removing EITHER flag blocks the bypass

Without --permission-mode bypassPermissions, the --allow-dangerously-skip-permissions flag does nothing — the default permission mode applies normally. This is intentional: it lets pipeline templates include the allow flag while requiring explicit activation in the specific job step.

Gotcha

--allow-dangerously-skip-permissions alone does NOT bypass permissions. Read that again. It enables the capability but does not activate it. You must pair it with --permission-mode bypassPermissions, a two-key launch sequence. This is intentional: it prevents accidental bypasses in scripts that set the allow flag broadly. If you're confused by the naming, you're not alone, but the two-key pattern is what makes it safe for CI/CD templates.

Tip

Settings.json deny rules survive even with bypass active. A "deny": ["Write"] rule removes the Write tool from Claude's toolset entirely: Claude never sees it, regardless of permission mode. Deny rules are enforced at the tool-loading stage, before permission checks occur.

Known Security Vulnerabilities

Published security research has identified real vulnerabilities in Claude Code’s sandboxing and configuration system:

CVE-2025-59536 / CVE-2026-21852 — RCE via Project Files: Malicious .claude/ project configurations (hooks, MCP servers, environment variables) can achieve remote code execution and API token exfiltration. An attacker who controls a repository’s .claude/ directory can execute arbitrary code when a victim opens the project with Claude Code. Mitigations: review .claude/ contents before opening untrusted repos, use managed settings to restrict hook and MCP sources.

Denylist Bypass via /proc/self/root/: On Linux, Claude can bypass its own denylist by using /proc/self/root/usr/bin/npx to resolve to the same binary without matching the deny pattern. The deny rules use string matching, not path resolution, creating a gap for symlink-based bypasses. Mitigation: combine deny rules with OS-level sandbox restrictions.
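
The gap between string matching and path resolution can be reproduced without touching any real binary. The sketch below is an illustration only, with hypothetical function names: it builds a throwaway directory in which a symlink named alias plays the role that /proc/self/root plays on Linux, then compares a text-based deny check with one that canonicalizes the path first. It resolves symlinks in directory components only; a production check would need full path resolution.

```shell
#!/bin/sh
# Resolve symlinks in the directory part of a path (POSIX: cd + pwd -P).
# A symlink in the final path component is not resolved by this sketch.
canonical_path() {
  dir=$(dirname "$1")
  base=$(basename "$1")
  resolved=$(CDPATH= cd -- "$dir" 2>/dev/null && pwd -P) || return 1
  if [ "$resolved" = "/" ]; then resolved=""; fi
  printf '%s/%s\n' "$resolved" "$base"
}

# Naive rule: compare the path text against the denied prefix.
denied_by_string() {
  case "$1" in
    "$2"*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Safer rule: canonicalize first; fail closed if the path cannot be resolved.
denied_by_resolved() {
  real=$(canonical_path "$1") || return 0
  denied_by_string "$real" "$2"
}

# Demo: "alias" is a symlink back to the root of the demo tree.
demo=$(mktemp -d)
mkdir "$demo/bin"
: > "$demo/bin/npx"
ln -s "$demo" "$demo/alias"
deny=$(canonical_path "$demo/bin/npx")

denied_by_string   "$demo/alias/bin/npx" "$deny" || echo "string match: missed"
denied_by_resolved "$demo/alias/bin/npx" "$deny" && echo "resolved match: denied"
rm -rf "$demo"
```

Both spellings name the same file, but only the resolved comparison recognizes it, which is why the mitigation pairs deny rules with OS-level restrictions.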

ToxicSkills — Malicious Agent Skills: Research found that 36% of community-created agent skills contain security flaws, with 13.4% at critical severity. There is no official vetting system for skills. Mitigation: audit skills before installation, prefer skills from trusted sources, use --disable-slash-commands in sensitive environments.

Prompt Injection via MCP Tool Outputs: MCP tool responses are a documented attack vector for prompt injection. A malicious MCP server can return crafted tool outputs that manipulate Claude’s behavior. Mitigation: use --strict-mcp-config with explicitly trusted servers only, implement PostToolUse hooks to validate MCP outputs.

Review .claude/ Before Opening Untrusted Repos

Published CVEs demonstrate that malicious .claude/ directories can achieve remote code execution. Before opening any untrusted repository with Claude Code, inspect .claude/settings.json (hooks, permissions), .mcp.json (MCP servers), and any hook scripts. Managed settings with allowManagedPermissionRulesOnly: true can enforce organizational policies that override malicious project configs.
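
A small helper can at least make sure nothing is overlooked during that inspection. The sketch below lists .mcp.json and every file under .claude/ for a given checkout so they can be read before an agent is started there. The name list_agent_config is hypothetical, and the helper only lists files; deciding whether a hook or server definition is safe remains a manual review.

```shell
#!/bin/sh
# Hypothetical pre-open check: lists the files named in the paragraph above
# (.mcp.json and everything under .claude/) for a repository path.
list_agent_config() {
  repo=$1
  if [ -f "$repo/.mcp.json" ]; then
    echo "$repo/.mcp.json"
  fi
  if [ -d "$repo/.claude" ]; then
    find "$repo/.claude" -type f | sort
  fi
}

# Example: review the current directory before launching an agent in it.
list_agent_config .
```

An empty listing means the repository ships no project-level agent configuration at those locations.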

Now Do This

Test your sandbox right now: claude -p "What is in /etc/shadow?" --output-format json | jq '.result'. If the sandbox blocks it, you're protected. If it doesn't, you're running without sandboxing: check the platform support table above and fix it before running unattended agents.