NewsletterBlogLearnCompareTopicsGlossary
TECHNIQUELAUNCHRESEARCHTOOLINSIGHTBUILD

23 items covered

Claude Code Weekly Limits Jump 50% Through July 13

🧠 LAUNCH

Claude Code Weekly Limits Jump 50% Through July 13

Your Claude Code sessions just got a lot longer. Anthropic is raising weekly usage limits by 50% across all paid tiers, effective now through July 13. If you've been rationing your Claude Code usage or hitting walls during deep coding sessions, that friction is gone β€” for the next two months, at least. The timing isn't accidental: this lands the same day OpenAI starts offering free Codex trials. (12,664 likes | 1,091 RTs) Read more β†’

Anthropic Bundles Agent SDK Credits Into Every Paid Plan Starting June 15

Starting June 15, every paid Claude plan includes a monthly Claude Agent SDK credit covering scripts, claude -p calls, and third-party apps like OpenClaw. This is Anthropic subsidizing its own ecosystem β€” making it cheaper for developers to build agent workflows on Claude instead of treating the SDK as an extra cost center. If you've been running agent scripts and wincing at the bill, plan your usage around the June 15 rollout. (757 likes | 15 RTs) Read more β†’

NVIDIA Drops AnyFlow β€” Flexible-Step Video Diffusion on HuggingFace. NVIDIA's first any-step video diffusion model lets you trade quality for speed at inference time by adjusting step counts on the fly. Available now on HuggingFace β€” a serious open-weight option for text-to-video that doesn't lock you into a fixed compute budget per frame. (203 likes | 30 RTs) Read more β†’

holaOS Beta Ships the First Agent-Native Operating System. holaOS just launched Beta 0.1 β€” a purpose-built OS designed from the ground up for AI agents, not humans using AI tools. It's an early and ambitious bet: instead of bolting agent capabilities onto Linux or macOS, holaOS treats agent processes as first-class citizens with native scheduling, permissions, and resource management. (301 likes | 150 RTs) Read more β†’


πŸ”§ TOOL

Claude Code Desktop Now Defaults to Remote Control On. The Claude Code desktop app now enables remote control by default β€” meaning you can monitor and interact with your coding sessions from another device without toggling settings. A small UX change that removes a step everyone was doing manually anyway. (388 likes | 20 RTs) Read more β†’

Claude Code v2.1.141: Desktop Notifications, Summarize-Up-To-Here, 61 Changes. The latest Claude Code release packs 61 changes including a terminalSequence hook for desktop notifications, clearer auto-mode permissions, and a "Summarize up to here" command for managing long conversation contexts. The notification hook alone is worth the update if you run long agent sessions in the background. Read more β†’

Anthropic Python SDK v0.102.0 Adds Cache Diagnostics and Managed Agent Types. The Anthropic Python SDK now ships BetaManagedAgentsSearchResultBlock types and a cache diagnostics beta. If you're building managed agents or trying to debug why your prompt caching hit rate is lower than expected, these are the APIs you've been waiting for. Read more β†’


πŸ“ TECHNIQUE

Anthropic Publishes the Official Playbook for Computer Use With Claude. If you're building browser automation agents, stop guessing. Anthropic just released a comprehensive best-practices guide covering screenshot resolution, action confirmation patterns, error recovery, and the specific prompting strategies that actually work for computer use tasks. This is the reference doc the internal teams use β€” now it's yours. Read more β†’


πŸ”¬ RESEARCH

UK AISI Confirms Mythos Preview Solved Both Cyber Ranges β€” Including the "Impossible" One

The UK AI Safety Institute just confirmed that Mythos Preview is the first model to solve both their cyber evaluation ranges end-to-end β€” including "Cooling Tower," which no model had ever completed. This isn't a benchmark score on a sanitized test set; it's a real-world cyber range designed by government security researchers to be unsolvable by current AI. The capability jump is concrete and measurable. (714 likes | 25 RTs) Read more β†’

The Information: Mythos Shows "Notable Capability Jumps" in Finding Novel Vulnerabilities. Independent reporting from The Information adds context: Mythos isn't just solving known challenges β€” it's showing "notable capability jumps" at discovering previously unknown vulnerabilities. Anthropic hasn't released Mythos widely, opting for a government-approval path that's generating both praise and strategic questions. Read more β†’

LeCun: Reliable Agents Are Impossible Without World Models. Yann LeCun doubles down on his core thesis: LLMs cannot predict the consequences of their actions, which makes truly reliable agentic systems architecturally impossible without world models. It's a direct challenge to every team shipping "autonomous agents" built on pure language models β€” and the timing, amid an agent-infrastructure funding surge, is pointed. (1,443 likes | 194 RTs) Read more β†’

Latent Space Declares "The End of Finetuning." Latent Space argues that finetuning is becoming obsolete as context engineering, prompt caching, and in-context learning close the performance gap. The thesis: why spend weeks training a custom model when you can get 90% of the way there with a well-engineered prompt and cached context? Provocative, but the economics are increasingly hard to argue with. Read more β†’


πŸ’‘ INSIGHT

Sam Altman Offers Two Free Months of Codex to Poach Claude Code Users

Sam Altman is playing hardball: OpenAI is offering two months of free Codex access to any company willing to switch from competing coding tools. The subtext is obvious β€” Claude Code's momentum has OpenAI worried enough to buy market share directly. When the CEO personally pitches a free trial on X, that's not marketing, that's a competitive emergency response. (12,959 likes | 525 RTs) Read more β†’

Microsoft Testifies It Spent More Than $100 Billion on OpenAI. A Microsoft executive testified in court that total OpenAI spending will exceed $100 billion by the end of this fiscal year β€” the $13B investment plus massive infrastructure costs. The scale of the bet is now on the public record, and it puts the Codex giveaway in perspective: when you've spent $100B, two free months is a rounding error. Read more β†’

Mollick: Can Anthropic's Cautious Mythos Strategy Survive If Rivals Ship Without Guardrails? Ethan Mollick asks the uncomfortable question: if Google and OpenAI release equivalent cyber-capable models without government approval gates, Anthropic's responsible-release approach becomes a competitive disadvantage. The policy dilemma is real β€” being the responsible actor only works if responsibility is rewarded. (456 likes | 17 RTs) Read more β†’

Modal Raises at $4.5B as Agent Infrastructure Revenue Surges. Modal's valuation jumps 80% in months as GPU rental and agent infrastructure revenue takes off. The picks-and-shovels thesis for AI agents is generating real revenue now, not projected revenue β€” and the infrastructure layer is consolidating fast. Read more β†’

Ex-DeepMind Researcher Raises $650M for Recursive AI at $4.65B. Tim RocktΓ€schel (ex-DeepMind, UCL professor) just raised $650M for Recursive, a safety-focused AI lab building systems that "experiment on how to safely improve themselves." The team pulls from Meta, OpenAI, and Google. The safety-first startup category is attracting serious capital β€” $650M serious. (143 likes | 16 RTs) Read more β†’


πŸ—οΈ BUILD

Run Qwen3-35B as a 24/7 AI Researcher on Your Laptop β€” For Free. Qwen3-35B with mixture-of-experts runs locally via llama.cpp at 4-bit quantization β€” giving you a capable research agent that costs nothing to operate. The setup guide from HuggingFace walks through the full pipeline from download to always-on local research assistant. The gap between cloud-only and run-it-yourself AI keeps shrinking. (713 likes | 79 RTs) Read more β†’


πŸŽ“ MODEL LITERACY

Context Engineering vs. Finetuning: Finetuning means retraining a model's weights on your specific data β€” it's expensive, slow, and requires ML expertise. Context engineering is the alternative: instead of changing the model, you change what you feed it β€” crafting prompts, caching repeated context, and structuring in-context examples so the model performs your task without any weight updates. Today's "End of Finetuning" thesis argues that prompt caching (which stores and reuses processed context at a fraction of the cost) and in-context learning (where models learn patterns from examples in the prompt itself) are closing the quality gap with finetuned models. This matters right now because Anthropic's new bundled SDK credits and Claude Code's higher limits both encourage heavier context usage β€” making context engineering cheaper and more practical than ever.


⚑ QUICK LINKS

  • HuggingFace Transformers v5.8.1: Patches DeepSeek V4 integration β€” fixes continuous batching and weight conversion bugs. Link
  • Sakana's KAME: Fast speech model replies instantly while a reasoning model injects knowledge in parallel. (180 likes | 25 RTs) Link
  • Open-Source Is Ruthlessly Out-Innovating the Trillion-Dollar Labs: DeepSeek V4's SSD-based KV cache, TurboQuant compression, Kimi K2 memory optimization β€” the cost advantage is compounding. (256 likes | 31 RTs) Link
  • Mollick: Where Is Google in the AI IDE Race?: Gemini is conspicuously absent from the local AI app race Claude Cowork and Codex are defining. (386 likes | 14 RTs) Link
  • Awesome AI Engineering: 200+ curated resources from Anthropic, OpenAI, Meta, Uber, Shopify, Netflix and more. (334 likes | 53 RTs) Link

🎯 PICK OF THE DAY

Mythos solving both UK AISI cyber ranges isn't just a capability milestone β€” it's a policy stress test. The "Cooling Tower" range was designed to be unsolvable by current AI, and Mythos Preview cracked it anyway. That's impressive engineering, but the real story is what happens next. Anthropic chose a government-approval release path for Mythos β€” sharing capabilities with safety institutes before public deployment. It's the responsible thing to do. But as Mollick points out, if Google or OpenAI ship equivalent cyber capabilities without those guardrails, Anthropic's caution becomes a competitive handicap. The responsible-release strategy only works in a world where responsibility is rewarded, or at minimum, where it doesn't get you lapped. This is the first real-world test of whether "build the most capable model but release it carefully" can survive market pressure. Every AI lab watching this is calculating the same thing: how much market share is safety worth? Read more β†’


Until next time ✌️