Opus 4.7 Gets a Fast Lane — 2.5x Speed Boost Hits API and Claude Code

🧠 LAUNCH

Opus 4.7 Gets a Fast Lane — 2.5x Speed Boost Hits API and Claude Code

Claude Opus 4.7 now ships with a fast mode that delivers 2.5x faster output, available on the API with a speed: fast parameter and in Claude Code via /fast. At $30/$150 per 1M input/output tokens, pricing matches Opus 4.6 fast mode, and the speed gain matters most in latency-sensitive agentic workflows where you're waiting on multi-step tool chains. If you're building anything that chains more than three agent calls, the time saved on each step adds up across the chain and changes your UX math. (2,300 likes | 93 RTs) Read more →
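To make the "UX math" concrete, here is a rough back-of-the-envelope sketch. Only the 2.5x multiplier comes from the announcement; the baseline tokens-per-second figure, the step count, the tokens per step, and the per-step overhead are all hypothetical numbers chosen for illustration.

```python
def chain_latency_s(steps, output_tokens_per_step, tokens_per_s, overhead_s_per_step):
    """Wall-clock time for a sequential agent chain, where each step waits on the previous one."""
    generation_s = output_tokens_per_step / tokens_per_s
    return steps * (generation_s + overhead_s_per_step)

# Hypothetical workload: 5 chained calls, 800 output tokens each, plus 2 s per step
# of time that fast mode does not touch (tool execution, network, prompt processing).
BASE_TOKENS_PER_S = 50.0  # assumed baseline, not a published figure
SPEEDUP = 2.5             # the output-speed multiplier quoted above

standard = chain_latency_s(5, 800, BASE_TOKENS_PER_S, 2.0)
fast = chain_latency_s(5, 800, BASE_TOKENS_PER_S * SPEEDUP, 2.0)
print(f"standard: {standard:.0f}s, fast: {fast:.0f}s, end-to-end speedup: {standard / fast:.2f}x")
```

Under these assumptions the chain drops from about 90 seconds to about 42, an end-to-end speedup closer to 2.1x than 2.5x, because only the generation portion of each step gets faster. The more of your wall-clock time goes to tool execution rather than token output, the smaller the visible gain.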

Opus 4.7 Fast Mode API Docs Drop — Waitlist Required. Official documentation is live: set speed: fast with the beta header, same rate limits as before. API access requires joining the waitlist — Claude Code users get it immediately. Read more →

Anthropic Makes Its First Vertical Play — Claude for Legal. Anthropic targets the $1T+ global legal industry with a dedicated Claude offering. This isn't just "use Claude for contracts" — it's purpose-built tooling for an industry that lives and dies by precision language and long-document reasoning. The vertical play signals Anthropic is done waiting for enterprises to figure out use cases on their own. Read more →

🔧 TOOL

OpenAI Ships Symphony — Every Task Gets Its Own Codex Agent

OpenAI Symphony automatically spawns a dedicated Codex agent for each open task in your workflow. Instead of one agent context-switching across your project, every task gets its own persistent agent with its own state. This is a direct shot at Claude Code's agent view, arriving the same week Anthropic ships fast mode — the multi-agent IDE war is officially a two-front battle. (663 likes | 53 RTs) Read more →

claude agents — The Terminal Control Plane You Didn't Know You Had. Run claude agents from your project root to get a dashboard across all active Claude Code sessions. Hit left-arrow from any CLI session to register it. If you're juggling multiple agents across branches or tasks, this is the workflow upgrade you missed in the changelog. (522 likes | 35 RTs) Read more →

This MCP Server Cuts Claude Code Tool Calls by 94% with a Local Knowledge Graph. An open-source MCP server indexes your codebase into a local knowledge graph, letting the agent query structure instead of scanning files one at a time. Supports 19+ languages, runs fully local, no API keys. If your Claude Code sessions are burning through tool calls on large repos, this directly attacks the token overhead problem. (132 likes | 10 RTs) Read more →
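For a feel of what "query structure instead of scanning files" can mean, here is a minimal sketch using Python's standard ast module. It is not the server's implementation: it handles only Python source, keeps the index in plain dictionaries rather than a real graph store, and the names build_index and callers_of are made up for this example.

```python
import ast
from collections import defaultdict

def build_index(sources):
    """Map each function/class name to where it is defined and which names it calls.

    `sources` is {filename: source_text}; a real indexer would walk the repo on disk.
    """
    defined_in = {}
    calls = defaultdict(set)
    for filename, text in sources.items():
        tree = ast.parse(text)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                defined_in[node.name] = (filename, node.lineno)
                # Record direct calls by simple name made anywhere inside this definition.
                for inner in ast.walk(node):
                    if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                        calls[node.name].add(inner.func.id)
    return defined_in, calls

def callers_of(name, calls):
    """Answer 'who calls X?' from the index, without re-reading any file."""
    return sorted(caller for caller, callees in calls.items() if name in callees)

# Tiny in-memory "repo" so the example is self-contained.
sources = {
    "db.py": "def connect():\n    return object()\n",
    "app.py": "def load():\n    return connect()\n\ndef main():\n    load()\n    connect()\n",
}
defined_in, calls = build_index(sources)
print(defined_in["connect"])         # ('db.py', 1)
print(callers_of("connect", calls))  # ['load', 'main']
```

Once the index exists, a question like "who calls connect?" is a dictionary lookup rather than a series of file reads, which is the kind of saving the tool-call reduction points at.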

🔬 RESEARCH

DeepMind Reimagines the Mouse Pointer — Gestures, Speech, and Gemini Take Over

Google DeepMind just published experimental demos replacing the 50-year-old mouse pointer with motion, speech, and natural gestures directing Gemini on-screen. Forget clicking buttons — you point, speak, gesture, and the model interprets intent in real-time. This isn't a UX polish; it's a prototype for what computing looks like when the interaction layer is an AI, not a cursor. Landing right before Google I/O, the timing is deliberate. (4,414 likes | 505 RTs) Read more →

GPT 5.5 Breaks ProgramBench — And the Two Settings Picked Different Languages. GPT 5.5 becomes the first model to solve a ProgramBench task, with high and xhigh settings each independently choosing different programming languages to crack the same problem. The benchmark ceiling matters, but the approach diversity is the real signal — frontier models are developing genuine strategic preferences, not just following patterns. (954 likes | 90 RTs) Read more →

Sakana's KAME Takes a Different Path to Real-Time Voice — Tandem Architecture. KAME uses a tandem architecture to enhance knowledge in real-time speech-to-speech conversation, splitting the work between a fast acoustic model and a slower reasoning model. While OpenAI and xAI bet on single-model voice agents, Sakana is betting that two specialized models beat one generalist. Voice AI architectures are diverging fast. (730 likes | 145 RTs) Read more →

Thinking Machines Kills the Voice Activity Detector with Native Interaction Models. TML-Interaction-Small (276B-A12B) advances the state of the art in real-time voice by handling turn-taking natively, with no traditional VAD pipeline at all. If you're building voice agents and still wiring up silence-detection heuristics, this paper argues you're solving the wrong problem. (551 likes) Read more →

📝 TECHNIQUE

Boris Cherny Books Flights End-to-End with Claude Cowork + Opus 4.7. The creator of Claude Code reports that Opus 4.7 crossed a usability threshold — he booked flights end-to-end using Claude Cowork without intervention. Not a demo, not a cherry-picked screenshot, an actual booking. The gap between "impressive demo" and "I'd trust it with my credit card" just closed for at least one power user. (2,789 likes | 71 RTs) Read more →

Your Claude Code Sessions Might Be Eating 30GB of RAM. Simon Willison discovered Claude Code processes consuming ~30GB on his Mac, with the single largest process hitting 7.8GB. If you're running multiple sessions and wondering why your machine is thermal-throttling, check Activity Monitor before blaming your browser. (551 likes) Read more →

💡 INSIGHT

Code with Claude SF — Anthropic's Developer Ecosystem Gets Its Flagship Moment

Anthropic held Code with Claude SF, its first major developer event: part roadmap reveal, part community builder, part talent magnet. The recap signals Anthropic is investing heavily in developer relations and ecosystem tooling, not just model capability. When a lab starts throwing developer conferences, it's building a platform, not just shipping an API. Read more →

Mollick: ChatGPT Quietly Killed Study Mode, and That's a Problem. Ethan Mollick flags that OpenAI silently removed Study Mode from ChatGPT — the feature that made AI tutoring actually work. Research shows AI in pure assistant mode hurts learning; Study Mode was the fix. Claude and Gemini still have their equivalents. If you're using AI for education, this is your cue to switch tools. (495 likes | 36 RTs) Read more →

The ASI Heuristic: Watch the Consulting Teams, Not the Benchmarks. Mollick drops a sharp observation: as long as AI labs need "forward deployed engineering" teams to make their AI useful for customers, the ASI timeline is further out than the marketing suggests. When those teams get disbanded, that's your signal. A useful filter for separating capability claims from deployment reality. (804 likes | 72 RTs) Read more →

🏗️ BUILD

Anthropic's Security Team Built Their Threat Detection Platform with Claude Code. Anthropic dogfoods Claude Code for internal cybersecurity — a concrete case study of agentic coding tools applied to security operations. The architecture patterns are practical and transferable if you're building detection pipelines. Read more →

Needle: Gemini's Tool Calling Distilled into a 26M Parameter Model. A 26M parameter model that replicates Gemini's tool-calling capability — 1000x smaller than the original. If tool-calling can be distilled this aggressively, the cost structure for agent architectures changes fundamentally. Run it on a Raspberry Pi, embed it in a CLI, skip the API call entirely. (242 likes | 87 RTs) Read more →

🎓 MODEL LITERACY

Speculative Decoding: Opus 4.7 fast mode promises 2.5x speed at the same quality tier, and speculative decoding is a key technique for speedups of this kind. The idea: a smaller, faster "draft" model proposes several candidate tokens, then the full model checks all of them in a single parallel forward pass. Every guess the full model agrees with is accepted; at the first disagreement, the full model's own token is used and the remaining guesses are thrown away. When the draft model guesses right (which it does most of the time for predictable tokens), you get multiple tokens for the cost of one full-model forward pass. It's like having an intern write the first draft and a senior editor approve whole paragraphs at once instead of writing word by word. Because the full model has the final say on every token, output quality matches what it would have produced on its own, which is why "fast mode" isn't just a marketing label.
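Here is a toy sketch of that draft-then-verify loop. Nothing in it reflects how any production model is actually served: target_next and draft_next are hypothetical stand-ins (simple arithmetic rules, not neural networks), and the sketch uses greedy decoding with exact-match acceptance. Real systems that sample tokens use a probabilistic accept/reject rule instead, and verify all k positions in one batched forward pass rather than in a Python loop.

```python
def target_next(ctx):
    """Stand-in for the full model's greedy next token (hypothetical toy rule)."""
    return (ctx[-1] * 3 + len(ctx)) % 7

def draft_next(ctx):
    """Stand-in for the cheap draft model: agrees with the target except when len(ctx) % 3 == 0."""
    tok = target_next(ctx)
    return tok if len(ctx) % 3 else (tok + 1) % 7

def greedy_decode(prompt, n_new):
    """Baseline: one full-model pass per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target_next(out))
    return out

def speculative_decode(prompt, n_new, k=4):
    """Draft k tokens cheaply, then let the target accept the longest correct prefix."""
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < n_new:
        # 1) The draft model proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2) The target checks every position; counted as ONE pass because a real
        #    system scores all k positions in a single batched forward pass.
        target_passes += 1
        accepted = []
        for i in range(k):
            expected = target_next(out + draft[:i])
            if draft[i] == expected:
                accepted.append(draft[i])
            else:
                accepted.append(expected)  # target's token replaces the first wrong guess
                break                      # later draft tokens were built on a wrong prefix
        out.extend(accepted)
    return out[:len(prompt) + n_new], target_passes

fast_out, passes = speculative_decode([1], 12)
print(fast_out == greedy_decode([1], 12))  # True: identical output to the full model alone
print(passes)                              # 4 target passes instead of 12
```

The output is token-for-token what the full model would have produced alone, since every accepted token was checked against it. The speedup comes entirely from how often the draft model guesses right, so it varies with how predictable the text is.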

⚡ QUICK LINKS

Daybreak: Altman frames OpenAI's cybersecurity vertical — frontier models meet Codex for defense. (2,385 likes | 167 RTs) Link
Gemini Omni: Rumored for Google I/O — advanced video model with editing and world understanding. (262 likes) Link
Code with Claude Hardware: Tiny computer giveaways spawned delightful creative builds. (1,996 likes | 122 RTs) Link
Claude Code v2.1.140: Agent color palette, subtype matching fixes, /goal hang resolved. Link
LangChain 1.3.0: Ships v3 stream_events for agents — update your stream handlers. Link

🎯 PICK OF THE DAY

DeepMind's AI pointer isn't a UX experiment; it's a platform play. Those experimental demos replacing the mouse cursor with gestures, speech, and Gemini-powered intent recognition look like a research showcase, but the timing (days before Google I/O, alongside Gemini Omni rumors) tells a different story. DeepMind is betting that the next computing paradigm will bypass chat entirely. Think about it: every major AI lab is building chat interfaces, but DeepMind just showed a future where you don't type prompts at all. You point at your screen and speak naturally. Whoever owns the interaction layer owns the platform, and Google already controls Android and Chrome, the world's most popular browser. If Google can replace the cursor with an AI-mediated interaction model, every app becomes a Gemini surface without developers lifting a finger. The 50-year-old mouse pointer was the last piece of computing infrastructure nobody thought to disrupt. Now someone has. Read more →

Until next time ✌️