Anthropic's Containment Architecture: How They Actually Sandbox Claude

📝 TECHNIQUE

Anthropic's Containment Architecture: How They Actually Sandbox Claude

Anthropic's engineering team pulls back the curtain on how they contain Claude across Claude Code, computer use, and MCP — detailing the specific sandboxing patterns, permission boundaries, and isolation layers that keep agents from going rogue in production. This isn't a blog post about principles; it's the actual architecture. If you're shipping agents with file system or network access, this is the containment playbook you've been waiting for. (970 likes | 117 RTs) Read more →

The "Folder + Scripts + HTML" Pattern for Non-Technical Claude Code Users: Drop files in a directory, tell Claude Code to write scripts and output HTML. That's it — and 1.2K likes confirm this deceptively simple workflow is becoming the default pattern for non-engineers using coding agents. No IDE, no terminal knowledge, no framework choices. Just a folder and a prompt. (1,203 likes | 34 RTs) Read more →
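
For a sense of what the "scripts + HTML" half of this pattern can look like, here is a minimal sketch of the kind of script an agent might write. It assumes the folder holds plain `.txt` files; the function name `build_report` and the output name `report.html` are made up for illustration and do not come from the original post.

```python
import html
from pathlib import Path


def build_report(folder: str, output_name: str = "report.html") -> Path:
    """Summarize every .txt file in `folder` into one HTML page (hypothetical helper)."""
    source = Path(folder)
    rows = []
    for path in sorted(source.glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Use the first line of each file as its one-line summary.
        first_line = text.splitlines()[0] if text.strip() else ""
        # Escape file names and content so stray angle brackets cannot break the page.
        rows.append(
            f"<li><b>{html.escape(path.name)}</b>: {html.escape(first_line)}</li>"
        )
    page = (
        "<!doctype html>\n<html><body><h1>Files</h1><ul>\n"
        + "\n".join(rows)
        + "\n</ul></body></html>\n"
    )
    out = source / output_name
    out.write_text(page, encoding="utf-8")
    return out
```

Calling `build_report("my-folder")` writes `my-folder/report.html`, which opens in any browser. No server or framework is involved.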

💡 INSIGHT

Microsoft Copilot Cowork Can Exfiltrate Your Files

Simon Willison documents a concrete attack: Microsoft Copilot Cowork can be used to exfiltrate files from your system — not theoretically, not in a lab, but in the shipping product. While everyone debates agent safety in the abstract, here's a real agent with real file access leaking real data. Pair this with Anthropic's containment post above and the gap between "knowing you need sandboxing" and "actually having it" becomes painfully clear. (via Simon Willison) Read more →

Paul Graham: AI-Written Emails Feel Like Being Lied To: "I have never knowingly finished reading an email signed by a human but written by AI." PG's observation landed at 2K likes because it names something everyone feels — as AI writing tools proliferate, the authenticity signal is becoming the scarcest commodity in communication. If your outbound comms are AI-generated but human-signed, your recipients can tell, and they stop reading. (2,012 likes | 68 RTs) Read more →

Chris Olah Responds to the Pope's AI Encyclical: Anthropic co-founder Chris Olah offers a direct response to Pope Leo XIV's "Magnifica humanitas" — the first time a frontier lab co-founder has formally engaged with the Vatican's AI framework. He bridges Anthropic's mechanistic interpretability research with the encyclical's call for "intelligible AI," arguing they're pursuing the same goal from different starting points. Read more →

Stack Overflow's Forum Is Dead, But the Company Pivoted: Stack Overflow's Q&A traffic cratered post-AI, but the company didn't die. It pivoted to enterprise data licensing and API access for AI training. That makes it the first major knowledge platform to complete its AI-era transformation, and a template for every content platform watching its traffic evaporate. (137 likes | 198 RTs) Read more →

Simon Willison on "The Pressure" to Adopt AI Tools: The most consistent chronicler of the AI developer experience writes about the compounding pressure developers feel to adopt AI tools — and when the person who tries everything says the pace is unsustainable, it's worth reading. Not a rejection of AI tooling, but an honest accounting of the cognitive cost. Read more →

🔬 RESEARCH

Mollick: We Have Zero Rigorous Productivity Data on Autonomous Coding Agents

Ethan Mollick drops the most uncomfortable observation in AI right now: every productivity study on coding tools predates the Claude Code/Codex-era autonomous agents that shipped after December 2025. We're making billion-dollar tooling bets on vibes and anecdotes, with no rigorous measurement of whether these autonomous agents actually make teams more productive. The most important gap in AI research right now isn't a missing model; it's a missing study. (674 likes | 44 RTs) Read more →

MIT Tech Review: The AI Jobs Hysteria Doesn't Match the Data: Despite headlines about layoffs at Coinbase, Meta, and Cisco, MIT Technology Review finds scant evidence of large-scale AI-driven job displacement. The gap between narrative and data is widening — and that gap matters for policy decisions being made right now. (via MIT Technology Review) Read more →

🔧 TOOL

Claude Code's /goal: A Second Model Checks When You're Actually Done: The agent doing the work shouldn't decide it's finished. /goal adds a separate model as a completion checker after every turn — useful when the finish line is concrete (tests passing, build clean, backlog empty). A small feature that addresses one of the most common agent failure modes: declaring victory too early. (205 likes | 11 RTs) Read more →
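
The general pattern, a worker that acts and a separate checker that decides when to stop, can be sketched in a few lines. This is not how /goal is implemented. `work_turn` and `goal_met` are hypothetical callables standing in for the working model and the checking model, and `max_turns` is an assumed safety cap.

```python
from typing import Callable


def run_until_goal(
    work_turn: Callable[[str, list[str]], str],
    goal_met: Callable[[str, list[str]], bool],
    goal: str,
    max_turns: int = 10,
) -> tuple[bool, list[str]]:
    """Run the worker turn by turn; only the checker can declare the goal met."""
    transcript: list[str] = []
    for _ in range(max_turns):
        # The worker sees the goal and everything done so far.
        transcript.append(work_turn(goal, transcript))
        # The checker, not the worker, decides whether to stop.
        if goal_met(goal, transcript):
            return True, transcript
    # Turn budget exhausted without the checker signing off.
    return False, transcript
```

The point of the design is that the stop condition lives outside the worker: even if the worker's output claims success, the loop continues until the independent check agrees or the turn budget runs out.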

Expo Ships a Public MCP Server for React Native Developers: Expo launches a public MCP server that connects AI coding assistants to Expo docs, build logs, TestFlight crash reports, and simulator control. This is MCP moving from protocol spec to daily developer tooling — if you ship React Native, connect your coding agent today. (259 likes | 17 RTs) Read more →

GPT-5.5 + Codex Takes On Enterprise Document Parsing at Databricks: OpenAI showcases Codex paired with GPT-5.5 handling messy enterprise document parsing for Databricks — the kind of unglamorous but high-value agentic work that actually drives adoption. Not flashy demos, just reliably turning unstructured customer documents into structured data. (287 likes | 15 RTs) Read more →

🧠 LAUNCH

NVIDIA PiD: 4x Super-Resolution Straight from Model Latents: NVIDIA releases PiD, which upscales generated images 4x by working directly in pixel space from model latents — no separate upscaling pipeline needed. Weights are already on HuggingFace. If you're running any image generation workflow, this collapses your post-processing stack into a single step. (245 likes | 117 downloads) Read more →

🏗️ BUILD

Six Real Products Built by Non-Engineers Using Claude

Claude's official account showcases six projects built entirely by non-engineers — from illustrated home repair manuals to business dashboards to custom data tools. The 7.2K likes signal this is resonating far beyond the developer audience: the "why not?" era of building has arrived, and the barrier to entry is a folder and a prompt. Steal the patterns. (7,244 likes | 242 RTs) Read more →

ADHD: Tree-of-Thought with Pruning, Built on Claude Agent SDK: A Claude Agent SDK skill that fans out parallel divergent thoughts under different cognitive frames, scores them, prunes dead ends, and deepens survivors. Tree-of-thought has been theory for two years — this makes it a drop-in skill for creative and interdisciplinary agent work. (177 likes | 5 RTs) Read more →
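
The fan-out, score, prune, deepen loop described here is close to a beam search over thoughts. The sketch below shows that general technique, not the ADHD skill's own code. `expand` and `score` are hypothetical callables (in practice they would be model calls), and `beam_width` and `depth` are assumed parameters.

```python
from typing import Callable


def tree_of_thought(
    root: str,
    expand: Callable[[str], list[str]],
    score: Callable[[str], float],
    beam_width: int = 2,
    depth: int = 3,
) -> str:
    """Return the best thought found by repeated fan-out and pruning."""
    frontier = [root]
    for _ in range(depth):
        # Fan out: every surviving thought proposes follow-up thoughts.
        candidates = [child for thought in frontier for child in expand(thought)]
        if not candidates:
            break
        # Score and prune: keep only the top `beam_width` candidates.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    # Deepened survivors compete for the final answer.
    return max(frontier, key=score)
```

A wider beam keeps more divergent branches alive at the cost of more model calls per level; a narrower one prunes harder and converges sooner.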

🎓 MODEL LITERACY

Agent Sandboxing vs. Capability Isolation: Today's Anthropic containment post and Copilot exfiltration story show two sides of the same coin. Sandboxing restricts what an agent can reach: file systems, networks, and databases are walled off. Capability isolation restricts what an agent can do with the access it has: read but not write, list but not delete, query but not exfiltrate. Many agent frameworks implement only one. Sandboxing alone fails when the agent legitimately needs file access (like a coding agent), because everything inside the wall is then fair game. Capability isolation alone fails when the threat is data leaving the environment entirely, because a read permission with no boundary around it is enough to leak. Understanding the difference is critical as agents get file system and network access by default, and today's stories show that getting it wrong has real consequences.
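
To make the distinction concrete, here is a minimal sketch that applies both checks to a file request. It is illustrative only and not taken from Anthropic's post; `AgentFilePolicy`, `Capability`, and `PolicyError` are hypothetical names, and a real deployment would enforce these limits at the OS or container level rather than in application code.

```python
from enum import Flag, auto
from pathlib import Path


class Capability(Flag):
    READ = auto()
    WRITE = auto()
    DELETE = auto()


class PolicyError(PermissionError):
    """Raised when a request violates the sandbox or the capability set."""


class AgentFilePolicy:
    """Hypothetical policy combining a sandbox root with a capability set."""

    def __init__(self, root: str, allowed: Capability) -> None:
        self.root = Path(root).resolve()
        self.allowed = allowed

    def check(self, path: str, needed: Capability) -> Path:
        # Sandboxing: where may the agent reach? Resolve ".." and absolute
        # paths first, then require the result to stay under the root.
        target = (self.root / path).resolve()
        if not target.is_relative_to(self.root):  # Python 3.9+
            raise PolicyError(f"{path!r} escapes the sandbox root")
        # Capability isolation: what may the agent do once it is there?
        if needed not in self.allowed:
            raise PolicyError(f"{needed.name} is not granted")
        return target
```

With `AgentFilePolicy("workspace", Capability.READ)`, a read of `notes.txt` passes both checks, a read of `../secrets.txt` fails the sandbox check, and a write to `notes.txt` fails the capability check. Dropping either check reopens one of the two failure modes described above.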

⚡ QUICK LINKS

Codex CLI 0.134.0: History search with previews, unified --profile, MCP OAuth support. (201 likes | 11 RTs) Link
Microsoft Lens: New text-to-image model dropped on HuggingFace. (101 likes | 673 downloads) Link
DeepSWE: A contamination-free benchmark for long-horizon coding agents — the SWE-bench antidote. (21 likes | 4 RTs) Link
Rezonant Alter: Point at your live product, voice a change, ship the spec straight to a coding agent. (166 likes | 165 RTs) Link
Anthropic TypeScript SDK v0.98.1: Fixes directory prefix bug in skill version uploads. Link

🎯 PICK OF THE DAY

Anthropic publishing its containment architecture on the same day Copilot gets caught exfiltrating files isn't coincidence; it's a signal. The gap between knowing how to sandbox agents and actually shipping sandboxed agents is now the defining security challenge of the agentic era. Anthropic's post details real containment patterns (permission boundaries, isolation layers, MCP sandboxing), while Willison's Copilot Cowork writeup shows what happens when a major vendor ships an agent without them. Microsoft didn't lack the engineering talent to build proper containment; they lacked the prioritization. And that's the scarier finding. As agents graduate from "autocomplete with extra steps" to autonomous systems with file access, network calls, and tool use, the security surface isn't just growing; it's growing faster than most teams' ability to audit it. Today gave us the playbook and the cautionary tale in a single news cycle. Read both. Read more →

Until next time ✌️