
25 items covered

Gemini 3.5 lands: Google's sharpest model yet goes all-in on agents

🧠 LAUNCH

Gemini 3.5 lands: Google's sharpest model yet goes all-in on agents.

Gemini 3.5 is Google's new frontier model family, and the pitch is clear: agents and coding. Flash is positioned as the strongest model yet for developer workflows, going head-to-head with Claude and GPT on the tasks that actually matter to builders. Early benchmarks look competitive, but the real test is whether it holds up in production agentic loops where context management and tool use matter more than leaderboard scores. Read more →

Gemini Omni: Google's first create-anything-from-anything model.

Gemini Omni is Google's play for the multimodal creation crown: a single model designed to take any input and generate any output, starting with video. It combines Gemini's reasoning with generative media systems, which means you can go from text to video, image to 3D, or audio to animation without stitching together separate pipelines. This is Google saying the future of AI isn't just understanding; it's creating. (6,246 likes | 879 RTs) Read more →

Google I/O 2026: the full AI product carpet bomb. Gemini 3.5, Gemini Omni, Project Genie, Chrome DevTools for agents, Workspace AI upgrades: the full I/O collection is the largest single-day AI product drop Google has ever made. If you build on any Google platform, block an hour to scan the list. Read more →

Project Genie turns Street View into playable 3D worlds. Google combined its world-simulation model Genie with Street View data to generate interactive 3D environments of real places, now available to AI Ultra subscribers globally. It's a tech demo that hints at something much bigger: AI-generated spatial computing content from existing mapping data. Read more →

OpenAI launches compute futures: reserve your GPU capacity in advance. OpenAI now lets enterprise customers pre-reserve compute capacity through Guaranteed Capacity contracts, which are essentially forward contracts for AI inference. This is OpenAI treating compute like a commodity market, and it's a direct response to the rate-limiting pain that's been pushing customers to multi-provider setups. (1,701 likes | 133 RTs) Read more →


🔧 TOOL

Claude agents break out of the cloud: self-hosted sandboxes hit public beta.

Claude Managed Agents just removed the top enterprise blocker for agentic AI. Self-hosted sandboxes (public beta) let you run Claude agents inside your own infrastructure perimeter: your VPC, your compliance boundary, your data. MCP tunnels (research preview) go further, connecting agents to private-network MCP servers without exposing them publicly. If "data can't leave our network" has been your team's reason for not deploying agents, that reason just evaporated. (6,377 likes | 497 RTs) Read more →

OpenAI adopts Google's SynthID: rivals cooperate on AI image provenance. OpenAI is adding C2PA Content Credentials plus Google's SynthID watermarking to all AI-generated images, with a public verification tool. Two companies that compete on everything else just agreed on how to label synthetic media, and that's a significant industry moment for AI trust infrastructure. (3,020 likes | 258 RTs) Read more →

Chrome DevTools gets an agent mode for AI-assisted debugging. Your coding agent can now interact with the browser directly: testing code, emulating users, and catching bugs using Chrome DevTools capabilities before shipping. This closes a major gap in agent-driven web development where agents could write code but couldn't see what it looked like in a browser. (55 likes | 8 RTs) Read more →

Gemini for Science: purpose-built AI tools for researchers. Google launched a new suite of science-specific AI tools, with Gemini capabilities built specifically for research workflows and aimed at expanding the scale and precision of scientific exploration. It's part of the broader I/O push to position Gemini as the default AI layer across every domain. (823 likes | 143 RTs) Read more →


🔬 RESEARCH

DeepMind's AI discovers genetic switches that reverse cellular aging. DeepMind's Co-Scientist AI tool helped biologists find novel genetic factors that successfully rejuvenate human cells, not in simulation but in actual experiments. This is one of the most concrete AI-for-science results published to date, moving from benchmark bragging to genuine biological discovery that could reshape aging research. Read more →

Cloudflare stress-tests Anthropic's Mythos across 50 real repos. Cloudflare's security team spent weeks running Mythos against their own production codebase, the first major independent, real-world security audit outside Anthropic. Results validate last week's 11x vulnerability discovery claims, which is significant because Cloudflare's repos are battle-hardened code, not CTF challenges. (3,785 likes | 671 RTs) Read more →

HuggingFace's Carbon: a DNA model 275x faster than anything else. Carbon is fast enough to process an entire human genome efficiently. That kind of speedup doesn't just accelerate existing workflows; it enables entirely new genomics applications that were previously computationally impossible. (768 likes | 121 RTs) Read more →


💡 INSIGHT

Karpathy joins Anthropic: the talent war's biggest move this year.

Andrej Karpathy (former Tesla AI lead, OpenAI founding team member, YouTube educator with millions of followers) is joining Anthropic. This isn't just a hiring announcement; it's a seismic signal about where the frontier research momentum is shifting. Karpathy could have gone anywhere (or stayed independent with his massive audience), and he chose Anthropic. For a company already shipping at an aggressive pace, adding someone who's built AI systems at Tesla-scale and shaped research at OpenAI is the kind of move that compounds. Watch what he ships. (115,179 likes | 8,757 RTs) Read more →

KPMG deploys Claude to 276,000 employees: Anthropic's Big Four sweep continues. KPMG integrating Claude across its entire workforce is one of the largest single-company AI rollouts ever announced. Combined with PwC's expanded partnership, Anthropic now has two of the Big Four consulting firms deploying Claude at scale. The pattern: large enterprises are picking one frontier provider and going deep, not hedging across three. Read more →

Anthropic acquires Stainless: vertically integrating the developer surface. Stainless already powers Anthropic's SDKs and MCP servers; this acquisition formalizes vertical integration of the developer experience layer. When your SDK provider is also your model provider, the API surface gets tighter and iteration gets faster. Read more →

Mistral acquires Emmi AI as European AI consolidation accelerates. Mistral AI acquiring Emmi AI signals the same vertical integration playbook hitting Europe: building a full-stack AI platform, not just models. The timing, days after Anthropic's Stainless acquisition, suggests the industry has collectively decided that models alone aren't the moat. (144 likes | 34 RTs) Read more →

PwC goes deeper on Claude for deals, tech builds, and enterprise ops. PwC is expanding its Claude deployment to build technology, execute deals, and reinvent enterprise functions, reinforcing the Big Four pattern alongside KPMG. Two down, two to go. Read more →


📐 TECHNIQUE

The unreasonable effectiveness of HTML as Claude Code's output format. Anthropic's own team discovered that HTML is a surprisingly effective output format for Claude Code workflows, better than markdown or plain text for many agent tasks. The key insight: HTML's structure gives the model natural guardrails for layout and formatting that reduce hallucinated formatting artifacts. Try it on your next agent output pipeline. Read more →
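If you do try HTML in an agent pipeline, one side benefit of explicit structure is that it can be checked mechanically before anything downstream consumes it. Here is a minimal sketch of that idea using Python's standard `html.parser`. The `TagBalanceChecker` class and `check_html` helper are hypothetical names for illustration, not anything from Anthropic's write-up, and the check is deliberately strict: it assumes every non-void tag is closed explicitly, even though real HTML lets you omit some closing tags.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open tags and records mismatches."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.errors = []  # human-readable problems found so far

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def check_html(fragment: str) -> list:
    """Return a list of structural problems; an empty list means balanced."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    unclosed = [f"unclosed <{tag}>" for tag in reversed(checker.stack)]
    return checker.errors + unclosed


if __name__ == "__main__":
    print(check_html("<ul><li>done</li><li>todo</li></ul>"))
    print(check_html("<div><p>report</div>"))
```

A pipeline could reject or retry any agent output where `check_html` returns a non-empty list. Loosen the strictness if your outputs rely on optional closing tags.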

TLA+ meets LLMs: formal verification enters the AI coding era. A practical guide to using AI to write and verify TLA+ formal specifications. As AI-generated code scales, formal verification becomes more important, not less, and LLMs turn out to be surprisingly good at writing specs once you know how to prompt them. If you have concurrent systems, this is worth your afternoon. (103 likes | 25 RTs) Read more →


🎓 MODEL LITERACY

Multi-Token Prediction (MTP): Standard language models generate text one token at a time: predict the next token, append it, repeat. Multi-Token Prediction flips this by training the model to predict several tokens ahead simultaneously, then verifying them in parallel during inference. The result is dramatically faster generation without sacrificing quality, because the model is already "thinking ahead" during training and drafted tokens are kept only when verification accepts them. Today's llama.cpp speed gains for Qwen3.6 come from MTP support, and as more models adopt it natively, understanding MTP helps you evaluate local-model performance claims. A model with MTP isn't cheating the benchmarks; it's genuinely doing less redundant work per output token.
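To make the draft-then-verify loop concrete, here is a toy sketch. Everything in it is a stand-in: `next_token` plays the full model, `draft_tokens` plays the multi-token head (with deliberate occasional misses), and neither reflects how llama.cpp or Qwen3.6 actually implement MTP. Verification is written as a plain loop here; the assumption is that a real runtime would check all drafted positions in a single batched forward pass, which is why each loop iteration counts as one pass.

```python
from typing import List, Tuple


def next_token(context: List[int]) -> int:
    """Toy stand-in for the full model: a deterministic next token."""
    return (sum(context) * 31 + len(context)) % 50


def draft_tokens(context: List[int], k: int) -> List[int]:
    """Toy stand-in for an MTP head: guesses k tokens ahead in one go."""
    ctx = list(context)
    guesses = []
    for _ in range(k):
        tok = next_token(ctx)
        # Inject an occasional wrong guess so verification has work to do.
        if len(ctx) % 5 == 4:
            tok = (tok + 1) % 50
        guesses.append(tok)
        ctx.append(tok)
    return guesses


def generate_baseline(prompt: List[int], n_new: int) -> List[int]:
    """Ordinary decoding: one model pass per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out))
    return out


def generate_with_mtp(prompt: List[int], n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Draft k tokens, verify them, keep the agreeing prefix."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_new:
        draft = draft_tokens(out, k)
        passes += 1  # assumed: one batched pass verifies every drafted position
        ctx = list(out)
        for tok in draft:
            expected = next_token(ctx)
            ctx.append(expected)  # the verified token is always kept
            if tok != expected:
                break  # first miss: keep the correction, drop later drafts
        out = ctx
    return out[: len(prompt) + n_new], passes


if __name__ == "__main__":
    prompt = [1, 2, 3]
    fast, passes = generate_with_mtp(prompt, 20)
    slow = generate_baseline(prompt, 20)
    print("same output:", fast == slow)
    print("verification passes:", passes, "vs baseline passes:", 20)
```

The point to notice is that the MTP path produces exactly the same tokens as the baseline, because every token is still checked against the full model. The only thing that changes is how many sequential passes it takes to get there.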


⚡ QUICK LINKS

  • llama.cpp + MTP: Multi-Token Prediction support for Qwen3.6 delivers significant local inference speed gains. (1,160 likes | 178 RTs) Link
  • Claude Code v2.1.145: JSON session listing, OTEL trace improvements, status line upgrades. Link
  • Ettin Reranker: HuggingFace launches a new reranker family for RAG pipelines. Link
  • Gemini 3.5 Flash pricing: Simon Willison's analysis finds it costs 3x its predecessor but still undercuts GPT-5.5. Link
  • ByteDance Lance: Another any-to-any multimodal model enters the ring. (301 likes | 171 downloads) Link
  • Google Antigravity 2.0: New Google DeepMind product announced at I/O. Link

🎯 PICK OF THE DAY

DeepMind's cell rejuvenation result is AI-for-science graduating from benchmarks to biology. Forget the model launches for a second: the most important thing that happened this week might be a biology paper. DeepMind's Co-Scientist AI tool helped researchers discover novel genetic factors that actually reverse cellular aging in human cells. Not predicted, not simulated, but experimentally validated. This is the moment AI-for-science graduates from benchmark theater to actual biological discovery, and the implications ripple far beyond one paper. It reframes how we should fund and evaluate AI research: not by how well a model scores on academic benchmarks, but by whether it surfaces insights that human researchers wouldn't have found alone. The aging research community has been chasing reprogramming factors for over a decade, and AI just accelerated that search by orders of magnitude. If this result replicates and extends, it changes the cost-benefit calculus for every pharma company evaluating AI research partnerships. Read more →


Until next time ✌️