
24 items covered

Anthropic ships Fable 5 and Mythos 5 — the biggest capability jump since Opus 4.5

🧠 LAUNCH

Anthropic ships Fable 5 and Mythos 5 — the biggest capability jump since Opus 4.5.

Claude Fable 5 lands as Anthropic's new frontier model — Mythos-class intelligence made safe for general use, with Mythos 5 sitting above it as the raw research powerhouse. SOTA on nearly every benchmark that matters, with particular dominance on extended reasoning and agentic coding tasks. The gap between Fable and what came before it isn't incremental — it's the kind of jump that changes what you can delegate to a model. Available now on claude.ai and via API. Read more →

Gemini 3.5 Live Translate ships real-time speech-to-speech in 70+ languages.

Google just made universal translation feel real. Gemini 3.5 Live Translate streams translations while still listening — no awkward pauses, no turn-taking, just fluid cross-language conversation. Supporting 70+ languages across AI Studio, Google Translate, and Meet, this is the most ambitious production deployment of streaming speech-to-speech translation we've seen. If you build anything multilingual, this just became your baseline. (1,787 likes | 233 RTs) Read more →

Gemma 4 12B drops as a unified, encoder-free multimodal model. One architecture, four modalities — text, images, audio, and video — with no separate encoders bolted on. Google already has QAT GGUF weights available, so you can run this locally today. At 12B parameters, this is the multimodal model that actually fits on your hardware. Read more →

North Mini Code enters the ring. Cohere ships its first developer-focused coding model, already trending on HuggingFace. More competition in the small-coding-model space is exactly what cost-sensitive deployments need — benchmark it against your current stack before dismissing it. (153 likes | 1.8K downloads) Read more →


🔧 TOOL

Claude Code unlocks nested subagents — agents spawning agents, five levels deep.

Claude Code now supports nested subagents capped at depth 5, and this is a bigger deal than it sounds. Instead of one agent trying to hold an entire complex task in context, it can spawn specialized sub-agents that each focus on a piece — better context management, better results, and a genuine architecture unlock for agentic workflows that outgrew single-agent loops. (4,981 likes | 250 RTs) Read more →
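To make the depth cap concrete, here is a minimal sketch of a depth-limited agent tree. This is not Claude Code's implementation or API: the Agent class, its spawn method, and the choice to count the root as level 0 are all assumptions for illustration. Only the cap of 5 comes from the item above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MAX_DEPTH = 5  # the cap reported above; everything else here is hypothetical


@dataclass
class Agent:
    task: str
    depth: int = 0  # assumption: the root agent sits at level 0
    children: list[Agent] = field(default_factory=list)

    def spawn(self, subtask: str) -> Agent:
        """Create a child one level deeper, refusing to exceed the cap."""
        if self.depth >= MAX_DEPTH:
            raise RuntimeError(
                f"depth cap {MAX_DEPTH} reached; handle {subtask!r} inline"
            )
        child = Agent(task=subtask, depth=self.depth + 1)
        self.children.append(child)
        return child


def deepest_level(root: Agent) -> int:
    """Return the depth of the deepest agent in the tree."""
    return max([root.depth] + [deepest_level(c) for c in root.children])
```

The design point is that an agent at the cap has to finish its piece itself rather than delegating further, so task decomposition needs to bottom out within five levels.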

Claude Managed Agents adds cron scheduling and secret vaults. Two critical missing pieces for production agent deployments just landed — schedule recurring agent sessions on a cron cadence, and inject secrets securely via environment variable vaults instead of hardcoding them. If you've been running agents manually on a timer, migrate now. Read more →
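On the agent side, "don't hardcode secrets" usually reduces to reading them from the environment and failing loudly when they're absent. A minimal sketch, assuming the vault exposes each secret as an ordinary environment variable; the variable name below is hypothetical and nothing here is specific to Claude Managed Agents:

```python
import os


def load_secret(name: str) -> str:
    """Read a secret from the environment instead of hardcoding it."""
    value = os.environ.get(name)
    if not value:
        # Fail at startup rather than midway through a scheduled run.
        raise RuntimeError(f"missing required secret: {name}")
    return value


# Hypothetical usage inside an agent entry point:
# api_key = load_secret("BILLING_API_KEY")
```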

OpenAI Responses API now returns image search results. Web search through the Responses API surfaces images alongside text — products, places, diagrams, visual references, all programmatically accessible. If you're building anything that needs to show users what something looks like, this just got a lot easier. (1,486 likes | 84 RTs) Read more →


📝 TECHNIQUE

Four tips for getting the most out of Fable 5. Alex Albert, Anthropic's head of developer relations, shares concrete guidance: give Fable bigger tasks than you'd give previous models, default to xhigh/high effort levels, and — critically — rewrite your CLAUDE.md files because old instructions anchor the model to prior behaviors. That last one is subtle and important: your existing prompts may be actively holding Fable back. (916 likes | 42 RTs) Read more →

Fable 5's tokenizer produces ~30% more tokens for the same text. Fable uses the Opus 4.7 tokenizer, which means your existing prompts will cost roughly 30% more in tokens — same input, bigger bill. Use the token counting API with model: 'claude-fable-5' before you get surprised by your invoice. This is the unglamorous launch-day detail that actually affects your budget. Read more →

Self-verification loops are the key ingredient for long-running Fable sessions. Boris Cherny explains the pattern: Fable's capability means it can run autonomously for much longer stretches, but longer sessions without verification accumulate errors. The fix is building self-check loops into your agent workflows — let the model periodically verify its own work before continuing. With Fable, the model is good enough to catch its own mistakes. (1,151 likes | 80 RTs) Read more →
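One way to picture the pattern: run a few steps, verify, and roll back to the last verified state if the check fails. The sketch below is a generic checkpoint loop, not Boris Cherny's code or any Anthropic API. The step and verify callables stand in for model calls, and the parameter names are assumptions.

```python
import copy
from typing import Callable, List


def run_with_checkpoints(
    steps: List[Callable[[dict], None]],
    verify: Callable[[dict], bool],
    check_every: int = 3,
    max_retries: int = 2,
) -> dict:
    """Run steps in order, verifying the state every `check_every` steps.

    On a failed check, restore the last verified snapshot and redo the
    steps since then, up to `max_retries` times per checkpoint.
    """
    state: dict = {}
    snapshot = copy.deepcopy(state)
    start = 0      # index of the first step after the last verified checkpoint
    retries = 0
    i = 0
    while i < len(steps):
        steps[i](state)
        i += 1
        at_checkpoint = (i - start) == check_every or i == len(steps)
        if not at_checkpoint:
            continue
        if verify(state):
            snapshot = copy.deepcopy(state)
            start = i
            retries = 0
        elif retries < max_retries:
            state = copy.deepcopy(snapshot)  # discard unverified work
            i = start
            retries += 1
        else:
            raise RuntimeError(f"verification failed after step {i}")
    return state
```

The trade-off sits in check_every: checking more often costs more verification calls, while checking less often means more work is thrown away when a check fails.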


🔬 RESEARCH

Multi-agent Mythos teams write large programs 3x faster than solo agents. Anthropic shares concrete data: teams of Claude Mythos agents coordinating on large codebases achieve 3x the speed of a single agent working alone. The multi-agent coordination thesis now has numbers behind it. If you're still running one agent at a time on complex projects, you're leaving performance on the table. (25 likes | 1 RT) Read more →

Internal evals: Fable matches GPT-5.5 on 98% of coding tasks, shines on the hardest 2%. Internal evaluations show Fable 5 performing on par with GPT-5.5 and Opus 4.8 on roughly 98% of coding tasks, at roughly 2x the cost. The premium buys you quality on the hardest 2%, where other models struggle. The practical move: implement model routing and send only your toughest problems to Fable. (324 likes | 15 RTs) Read more →
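A rough sketch of what that routing could look like, plus the blended-cost arithmetic. The difficulty score (a 0 to 1 percentile you would have to compute yourself), the threshold, and the "cheaper-model" name are hypothetical. The 2x relative cost and the 2% share are the item's approximate figures, not measured prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str
    relative_cost: float  # cost relative to the default model


DEFAULT = Route("cheaper-model", 1.0)    # placeholder name
PREMIUM = Route("claude-fable-5", 2.0)   # roughly 2x, per the item above


def route(difficulty: float, threshold: float = 0.98) -> Route:
    """Send tasks at or above the difficulty threshold to the premium model.

    `difficulty` is assumed to be a percentile in [0, 1]; how it is
    estimated is left open here.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")
    return PREMIUM if difficulty >= threshold else DEFAULT


def blended_cost(premium_share: float) -> float:
    """Average relative cost when `premium_share` of tasks go premium."""
    return (
        (1 - premium_share) * DEFAULT.relative_cost
        + premium_share * PREMIUM.relative_cost
    )
```

Under these assumptions, routing 2% of traffic to the premium model gives 0.98 × 1.0 + 0.02 × 2.0 = 1.02, about a 2% cost increase instead of doubling the bill.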

FrontierCode benchmark shows over half of SWE-Bench solutions fail real code review. Latent Space digs into FrontierCode, a benchmark that evaluates code quality rather than just "does it pass tests." The finding is damning: over 50% of SWE-Bench solutions that technically pass would be rejected in a real code review. If you're choosing models based on SWE-Bench alone, you're optimizing for the wrong thing. Read more →

DeepMind lays out its European robotics investment strategy. Google DeepMind announces a dedicated push into Europe's robotics ecosystem — research partnerships, investment, and infrastructure at a time when VLA-JEPA and embodied AI are advancing rapidly. The bet: Europe's manufacturing density makes it the right testbed for real-world robotics deployment. Read more →


💡 INSIGHT

Karpathy calls Fable 5 a major-version-bump step change.

Andrej Karpathy weighs in with the day's highest-engagement independent take: Fable 5 represents a step change of the same order as Opus 4.5 last November. He specifically highlights its performance on long problem-solving sessions tackling very difficult tasks, the kind of sustained reasoning that separates tool-like models from genuinely capable collaborators. When Karpathy says "major version bump," the industry listens. (14,840 likes | 1,347 RTs) Read more →

Anthropic's Claude Code lead says "a third era quietly started today." Felix Rieseberg, who built Claude Code and Cowork, frames Fable 5 as the beginning of a new phase where models shift from tools you direct to collaborators you delegate to. Coming from the person who architects Anthropic's flagship developer tools, this isn't marketing — it's a product vision statement about how human-AI interaction fundamentally changes when the model is good enough to sustain autonomous work. (4,983 likes | 331 RTs) Read more →

Apple withholds AI-powered Siri from the EU after being denied a regulatory exemption. Apple decided not to roll out its AI-enhanced Siri in the EU after the European Commission denied an exemption request. The EU AI Act now has its highest-profile casualty, and every company planning an AI product launch in Europe just got a concrete reminder that compliance isn't optional. (343 likes | 575 RTs) Read more →


🏗️ BUILD

One dev replaced an OCR pipeline with GPT-5.5 to translate 23K Chinese research papers. A single developer ditched a complex OCR-to-translation pipeline and pointed GPT-5.5 at 23,000+ ChinaRxiv papers instead — getting more complete, higher-quality English translations with less code. It's a clean example of LLMs collapsing multi-step traditional pipelines into a single inference call, and the translated papers are now freely available. (729 likes | 58 RTs) Read more →


🎓 MODEL LITERACY

Tokenizer Divergence: When you switch model families, your costs can change dramatically even if your prompts don't. Fable 5 uses the Opus 4.7 tokenizer, which produces roughly 30% more tokens for the same input text compared to pre-Opus-4.7 models. At an unchanged per-token price, the exact same prompt therefore costs roughly 30% more, not because the model is slower or greedier, but because it splits your text into more tokens. Different model families slice text into tokens using different vocabularies, and there's no universal standard. Before switching providers or upgrading models, always re-estimate costs using the provider's token counting API with the specific model ID, because token counts measured on one tokenizer do not carry over to another.
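The arithmetic is simple enough to sanity-check before migrating. A minimal sketch: the 30% default is the rough figure quoted above, the price is whatever your provider charges, and the function names are made up for illustration. In practice the token counts would come from the provider's token counting API, which this sketch does not call.

```python
def projected_cost(
    old_tokens: int,
    price_per_million: float,
    inflation: float = 0.30,  # rough figure from the item above, not a measurement
) -> float:
    """Estimate cost after a tokenizer change, at an unchanged per-token price."""
    new_tokens = old_tokens * (1 + inflation)
    return new_tokens / 1_000_000 * price_per_million


def measured_inflation(old_count: int, new_count: int) -> float:
    """Fractional token increase for one prompt counted under both tokenizers."""
    if old_count <= 0:
        raise ValueError("old_count must be positive")
    return new_count / old_count - 1
```

The quoted 30% is an average, so running measured_inflation over a sample of your own prompts is the safer input to projected_cost. Code, non-English text, and structured data can tokenize quite differently from plain English prose.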


⚡ QUICK LINKS

  • Claude Code v2.1.170: Ships Fable 5 access, fixes VS Code transcript saving, adds nested subagents. Link
  • Codex CLI 0.139.0: Web search now works in code mode, richer MCP tool schemas. (152 likes | 10 RTs) Link
  • Fable 5 refusal concerns: Viral HN post claims Fable may refuse to help competitors build competing products — test your use case. (314 likes | 141 RTs) Link
  • Flourish raises $500M: Bezos backs brain-inspired AI startup targeting continuous learning on 50 watts. (139 likes | 14 RTs) Link
  • KAN on FPGAs: Kolmogorov-Arnold Networks implemented on FPGAs for ultrafast edge inference. (139 likes | 15 RTs) Link
  • Agent-built 3D gallery: An agent chained two HuggingFace Spaces to build a 3D Paris gallery — the composability pattern matters more than the demo. Link

🎯 PICK OF THE DAY

A third era quietly started today — and it rewires how we build software. Felix Rieseberg's framing isn't hype. When models sustain multi-hour autonomous sessions with self-verification loops, the developer relationship fundamentally shifts from "use a tool" to "delegate to a collaborator." That's not a productivity improvement — it's an architectural change. Today's Fable 5 launch, combined with nested subagents in Claude Code and the multi-agent speed data from Mythos, paints a coherent picture: the model layer is now good enough that the bottleneck moves from "can the AI do it" to "can we design systems that let it work independently." That means rethinking how we decompose tasks, how we define verification criteria, and how we structure codebases for agent navigability — not just how we write prompts. The teams that internalize this shift first will build fundamentally different software than teams still treating AI as autocomplete with extra steps. (4,983 likes | 331 RTs) Read more →


Until next time ✌️