OpenAI Ships Lockdown Mode — First Platform-Level Prompt Injection Defense
🧠 LAUNCH
OpenAI Ships Lockdown Mode — First Platform-Level Prompt Injection Defense
The first major platform-level defense against prompt injection is live. Lockdown Mode restricts what data the model can access and exfiltrate during a session — think of it as a sandbox for the conversation itself. Not foolproof (no prompt injection defense is), but it's a meaningful layer for enterprise deployments handling PII or proprietary data. If you're running ChatGPT in any environment where a user could paste untrusted text, enable it now. Read more →
💡 INSIGHT
25+ Open-Weight Models in One Week — LeCun Calls It the Most Insane Drop Ever
Yann LeCun amplified what many of us felt: the past seven days saw 25+ notable open-weight model releases. This isn't a blip — it's the new cadence. The competitive moat for closed providers is shrinking by the week, and if you haven't reviewed the full list, you're almost certainly missing models relevant to your stack. (2,590 likes | 381 RTs) Read more →
Jane Street Now Designs in Claude Code More Than Figma
Jane Street — a firm that optimizes every microsecond of its trading systems — publicly says Claude Code has replaced Figma as their primary design tool. This isn't a solo developer testimonial; it's a quant trading firm that ruthlessly eliminates inefficiency telling you the design→code round-trip is dead. Their workflow patterns are worth stealing. (246 likes | 224 RTs) Read more →
Mythos Pricing Rumored at $400/M Tokens — GPT 5.6 and Gemini 3.5 Become the Practical Frontier: Fresh pricing intelligence says Mythos may cost $400/M tokens — up from earlier $70/M leaks. If true, Mythos is a specialized capability play, not a general-purpose model. For most workloads, GPT 5.6 and Gemini 3.5 are your practical frontier. (579 likes | 15 RTs) Read more →
Mollick: Implementation Is Now Cheap — Unique Ideas Are the Scarce Resource: Ethan Mollick frames the new AI economics: implementation costs are collapsing, making unique ideas — not execution ability — the scarce input. If you have a backlog of "too hard to build" projects, dust them off. (575 likes | 28 RTs) Read more →
WWDC 2026 Preview: Siri's Overhaul and Apple Intelligence Finally Ship: Apple's AI strategy has been the slowest-moving among the big players — WWDC 2026 is where they try to close the gap. Expect Siri's long-awaited overhaul and new on-device Apple Intelligence features that could matter for iOS developers. Read more →
📝 TECHNIQUE
Multi-Agent Lite Swarms: Let Opus Plan, Let Cheap Models Execute — 10x Cost Reduction
The cost arbitrage pattern is becoming standard: let Opus 4.8 or GPT 5.5 handle planning and routing, then dispatch execution to cheap models like Deepseek Flash or Gemma. The claim is 10x cost reduction with comparable task completion. The insight isn't new, but the tooling to implement it cleanly finally exists — if you're running heavy agentic loops and paying frontier prices for every step, you're overspending. (292 likes | 17 RTs) Read more →
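The planner/executor split described above can be sketched roughly as follows. This is a minimal illustration, not any vendor's API: `plan_with_frontier_model` and `execute_with_cheap_model` are hypothetical stubs standing in for real model calls, and the per-token prices and token counts are made-up placeholders chosen only to show the arithmetic.

```python
# Hypothetical prices in USD per million tokens; placeholders, not real rates.
FRONTIER_PRICE_PER_M = 15.00
CHEAP_PRICE_PER_M = 0.50
TOKENS_PER_CALL = 2_000  # assumed average tokens consumed per model call


def plan_with_frontier_model(goal):
    """Stub planner: a real system would ask a frontier model to split the goal."""
    return [f"{goal}: step {i}" for i in range(1, 21)]


def execute_with_cheap_model(subtask):
    """Stub executor: a real system would send the subtask to a cheap model."""
    return f"done({subtask})"


def call_cost(calls, price_per_m):
    """Cost in USD of `calls` model calls at the given per-million-token price."""
    return calls * TOKENS_PER_CALL * price_per_m / 1_000_000


def run_swarm(goal):
    """One frontier call to plan, then one cheap call per subtask."""
    subtasks = plan_with_frontier_model(goal)
    results = [execute_with_cheap_model(task) for task in subtasks]
    hybrid = call_cost(1, FRONTIER_PRICE_PER_M) + call_cost(len(subtasks), CHEAP_PRICE_PER_M)
    # Baseline: the frontier model plans and also executes every subtask.
    frontier_only = call_cost(1 + len(subtasks), FRONTIER_PRICE_PER_M)
    return results, hybrid, frontier_only


results, hybrid, frontier_only = run_swarm("refactor billing module")
print(f"hybrid ${hybrid:.2f} vs frontier-only ${frontier_only:.2f}")
```

With these placeholder numbers the hybrid run comes out roughly an order of magnitude cheaper. The real ratio depends on actual prices, token counts, and how often cheap-model output has to be retried or escalated back to the planner.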
OpenAI Publishes Dozens of Real-World Automation Workflows: Not vaporware demos — concrete patterns showing how teams automate actual work with the API. Browse these before building your next automation; someone's probably solved your exact use case already. (520 likes | 40 RTs) Read more →
Codex's Five Primitives: Automations, Worktrees, Skills, Plugins, Sub-agents: A clean breakdown of Codex's core architecture. The key insight: markdown + linear state tracking is all you need. If your agent framework is more complex than these five concepts, you're probably overengineering it. (491 likes | 23 RTs) Read more →
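To make "markdown + linear state tracking" concrete, here is a minimal sketch of an agent loop whose only state is a markdown checklist. It is an illustration of the general idea, not Codex's actual implementation; the helper names `next_task` and `mark_done` and the sample plan are hypothetical.

```python
import re

# Matches markdown task-list items such as "- [ ] write tests" or "- [x] done".
TASK_PATTERN = re.compile(r"^- \[( |x)\] (.+)$")


def next_task(markdown):
    """Return the first unchecked task, or None when everything is done."""
    for line in markdown.splitlines():
        match = TASK_PATTERN.match(line)
        if match and match.group(1) == " ":
            return match.group(2)
    return None


def mark_done(markdown, task):
    """Return the markdown with the given unchecked task ticked off."""
    return markdown.replace(f"- [ ] {task}", f"- [x] {task}", 1)


state = """# Plan
- [x] read the issue
- [ ] write a failing test
- [ ] fix the bug
"""

# Linear loop: take the next open task, do it, record it, repeat.
completed = []
while (task := next_task(state)) is not None:
    completed.append(task)  # a real agent would do the work here
    state = mark_done(state, task)

print(state)
```

Because the whole state is one human-readable file, a crashed or interrupted run can resume by re-reading the checklist and picking up at the first unchecked item.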
🔬 RESEARCH
Sakana AI Launches RSI Lab — The First Lab Chartered for Recursive Self-Improvement: Sakana AI's RSI Lab is now actively recruiting and publishing research directions. This is the first lab explicitly dedicated to open-ended self-improving AI — a research direction most labs pursue quietly but none have branded as their core mission. (184 likes | 26 RTs) Read more →
Reports: Anthropic's Mythos Found Zero-Days in Every Major Browser and OS: Unconfirmed reports claim Anthropic's internal Mythos model discovered previously unknown vulnerabilities across all major browsers and operating systems, and that Anthropic withheld release to allow patching. If accurate, this would be the first known case of an AI model performing frontier-level vulnerability research at scale. (36 likes | 2 RTs) Read more →
🔧 TOOL
Pentest-AI: 205 Security Tools Wrapped in One MCP Server: An MCP server wrapping 205 offensive security tools with 17 specialist agents covering the OWASP Top 10 — no API key needed on the MCP path. Turns any Claude Code session into a full pen-testing workstation. (62 likes | 7 RTs) Read more →
Anthropic Python SDK v0.107.1 Fixes Foundry Auth: If you're using Anthropic's managed Foundry deployment, update now: this patch fixes API key auth that could silently break your pipeline. To upgrade, run pip install --upgrade anthropic. Read more →
MicroPython Compiled to WASM — Run Python in Browser Sandboxes: Simon Willison ships MicroPython compiled to WebAssembly — execute Python in the browser with no server round-trips. A clean solution for AI tools that need to safely run user-provided code client-side. Read more →
🏗️ BUILD
Unsloth Ships Gemma 4 12B QAT in GGUF — Local Multimodal Fast Path: The community made Google's quantization-aware checkpoints runnable on llama.cpp and Ollama within days of release. ollama run gemma4:12b and you've got local multimodal inference that actually preserves quality at 4-bit — because QAT models were trained expecting quantization. (120 likes | 85.8K downloads) Read more →
Baoyu-Design: Run Claude Design Locally as a Cursor/Claude Code Skill: An open-source tool that brings Claude Design capabilities into your IDE — produce UI mockups, prototypes, and wireframes as self-contained HTML without needing claude.ai. Works as a local skill in both Cursor and Claude Code. (249 likes | 17 RTs) Read more →
Claude Agent + Obsidian: A Brain That Learns on Its Own via MCP: A practical MCP implementation connecting AI agents to Obsidian vaults — agents pull context, do work, and write learnings back. The "AI with persistent memory via markdown files" pattern is proving surprisingly effective for knowledge workers. (171 likes | 22 RTs) Read more →
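The "persistent memory via markdown files" pattern can be sketched with plain file I/O, since an Obsidian vault is a folder of markdown files. The snippet below is a rough illustration only: it does not use MCP or any Obsidian API, and `write_learning`, `read_context`, and the note name are hypothetical.

```python
import tempfile
from datetime import date
from pathlib import Path


def write_learning(vault, note_name, learning):
    """Append one dated bullet to a markdown note, creating the note if needed."""
    note = Path(vault) / f"{note_name}.md"
    if not note.exists():
        note.write_text(f"# {note_name}\n\n", encoding="utf-8")
    with note.open("a", encoding="utf-8") as handle:
        handle.write(f"- {date.today().isoformat()}: {learning}\n")
    return note


def read_context(vault, note_name):
    """Return the stored bullets for a note, oldest first."""
    note = Path(vault) / f"{note_name}.md"
    if not note.exists():
        return []
    lines = note.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


# A temporary directory stands in for a real vault folder.
with tempfile.TemporaryDirectory() as vault:
    write_learning(vault, "deploys", "staging needs the VPN")
    write_learning(vault, "deploys", "run migrations before restart")
    context = read_context(vault, "deploys")

print(context)
```

In a real setup an MCP server would expose operations like these as tools, so the agent reads the note before starting work and appends to it when it finishes.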
🎓 MODEL LITERACY
Quantization-Aware Training (QAT): When you shrink a model from 16-bit to 4-bit precision after training (post-training quantization), you're forcing weights into bins they were never optimized for — quality degrades unpredictably. QAT flips this: the model trains knowing it will be quantized, learning weight distributions that survive precision reduction gracefully. Google's Gemma 4 QAT checkpoints are already running locally via Unsloth's GGUF conversions, and the quality gap versus full-precision is remarkably small. This is why the open-weight flood matters practically: QAT makes 12B-parameter models genuinely useful on laptops, not just technically runnable.
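To see what "forcing weights into bins" means, here is a minimal sketch of symmetric 4-bit quantization in plain Python. The function name `fake_quantize`, the per-tensor scaling scheme, and the sample weights are illustrative assumptions, not the scheme used by Gemma or Unsloth.

```python
def fake_quantize(weights, bits=4):
    """Snap each weight to the nearest level of a signed integer grid,
    then map it back to a float (often called "fake quantization")."""
    qmax = 2 ** (bits - 1) - 1      # 7 for 4-bit
    qmin = -(2 ** (bits - 1))       # -8 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = []
    for w in weights:
        level = round(w / scale)             # nearest integer bin
        level = max(qmin, min(qmax, level))  # clamp to the 4-bit range
        quantized.append(level * scale)
    return quantized, scale


# Made-up weights for illustration.
weights = [0.91, -0.42, 0.07, 0.33, -0.88, 0.15, -0.05, 0.60]
quantized, scale = fake_quantize(weights)

for original, snapped in zip(weights, quantized):
    print(f"{original:+.2f} -> {snapped:+.3f} (error {abs(original - snapped):.3f})")
```

Post-training quantization applies a step like this once, after the weights are final. QAT commonly runs it inside the forward pass during training, with gradients passed through the rounding step, so the optimizer settles on weights that already sit close to the grid.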
⚡ QUICK LINKS
- 436 Upvotes Demand Claude Desktop for Linux: The developer community has spoken — loudly. (436 likes | 247 RTs) Link
- Claude Code v2.1.168: Two patch releases in one day signal active maintenance. Link
- Anthropic TS SDK Fixes Bedrock Middleware Execution Order: Middleware now runs before request signing — silently broken custom auth flows are fixed. Link
- Ollama v0.30.7 Adds Native Zod JSON Schema Support: Structured output from local models just got easier. Link
- VibeOS: An AI-Native Operating System Concept Hits HN: Early-stage, but the thesis that AI should be a system-level primitive — not an app bolt-on — is gaining traction. (19 likes | 15 RTs) Link
🎯 PICK OF THE DAY
Jane Street designs in Claude Code more than Figma now. When a firm that optimizes every microsecond of its trading systems voluntarily replaces a visual tool with a text-based AI, it signals that the design→code boundary is collapsing. Jane Street isn't a startup chasing vibes — they're a quantitative trading firm where inefficiency costs real money, and they've concluded that prompting Claude is faster than pushing pixels in Figma. The implication is stark: "design" is becoming a prompting skill, not a visual-tool skill. The firms that recognize this will ship 10x faster than those still round-tripping through mockup tools, getting stakeholder approval on static screenshots, then translating approved mocks into code. The mockup step was always a translation layer — and AI just made it unnecessary. Read more →
Until next time ✌️