TTY-changelog #045
Claude Opus 4.8 dropped, ESMFold2 opened protein design, China biotech protectionism split the room, and humanoid robotics got dismissed as hype.
👉 Article originally posted on TTY
TTY Party
Short version: Please join us on the 17th of June, starting at 6pm, in Paris to celebrate. Location TBD. Long version here.
Autonomous Agents
🗺️ PEEK caches long-context agent maps – PEEK built and cached a compact context map summarizing large, recurring corpora like codebases or document sets, so LLM agents could quickly locate relevant parts, avoid re-reading everything, and answer long-context queries more accurately and cheaply.
It beat strong baselines by 6.3 to 34 percent on long-context reasoning and aggregation tasks.
It used 93 to 145 fewer iterations and up to 5.8 times lower cost than the prior state of the art.
The gains held across different language models and agent architectures, including OpenAI Codex.
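The caching idea can be pictured with a toy sketch. This is not PEEK's implementation: the `ContextMap` class, the `first_line` stand-in summarizer, and the word-overlap ranking are all assumptions made for illustration. It only shows the general pattern of summarizing a recurring corpus once, reusing the cached summaries, and consulting them to decide which documents are worth opening.

```python
import hashlib


class ContextMap:
    """Toy cached map from document name to a short summary (hypothetical)."""

    def __init__(self, summarize):
        self._summarize = summarize  # stand-in for an LLM summarizer
        self._cache = {}             # name -> (content hash, summary)
        self.summarize_calls = 0

    def build(self, corpus):
        """Summarize only documents that are new or changed since the last build."""
        for name, text in corpus.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            cached = self._cache.get(name)
            if cached is None or cached[0] != digest:
                self._cache[name] = (digest, self._summarize(text))
                self.summarize_calls += 1

    def locate(self, query, top_k=2):
        """Rank documents by word overlap between the query and each summary."""
        words = set(query.lower().split())
        scored = []
        for name, (_, summary) in self._cache.items():
            overlap = len(words & set(summary.lower().split()))
            if overlap:
                scored.append((overlap, name))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for _, name in scored[:top_k]]


def first_line(text):
    # Assumption: a real system would call a model here; we keep the first line.
    return text.splitlines()[0] if text else ""


corpus = {
    "auth.py": "login and token refresh helpers\n...",
    "billing.py": "invoice generation and payment retries\n...",
}
cmap = ContextMap(first_line)
cmap.build(corpus)
cmap.build(corpus)  # unchanged corpus: served from the cache, no new summaries
print(cmap.locate("where is token refresh handled"))
```

In this sketch the second `build` call does no summarization work, which is the property that makes a map worth caching for corpora that are queried repeatedly.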
Biotech, Health, and Chemistry
🇨🇳 China competition divided biotech leaders – A widely shared Maraganore tweet argued US biotech leaders complaining about Chinese rivals should compete on merit instead of pushing protectionist policy. The thread weighed whose interests the anti-China framing actually serves. (article mentioned in the tweet)
Community take: Felix Raimundo saw the anti-China push as losing US biotechs trying to protect themselves, wrapping “save high-paying jobs” and “security risk” in patient-safety language. Americans, he said, love “merit and free-market competition until they face serious competition and lose.” Maraganore, the former Alnylam CEO who pioneered siRNA drugs, was essentially saying: innovative drugs do not fear copycats. The real fight, Felix added, was not about safety, since Chinese drugs still go through FDA approval and US trials. It was about China potentially blocking innovative drugs during trade disputes, and the US biotech sector hollowing out the way electronics did when manufacturing moved to Shenzhen. Ihab Bendidi agreed many US biotechs were complaining for the wrong reasons but felt Chinese regulatory shortcuts were genuinely dangerous. He preferred labeling drugs by risk and letting the market decide. He also noted US tech plays the same game, asking for rules on competitors “because only they can be trusted with AI.” Felix closed by pointing to an old industry trick: piling on compliance costs that giants can absorb but that block startups from entering at all.
🔬 Reading foundation models in biology – Great take from Julien Duquesne. Sparse autoencoders trained on protein embeddings recovered thousands of interpretable features per layer, far beyond what raw neurons exposed. Applied to single-cell models, the method validated concepts causally through counterfactual gene perturbations, not correlation.
🧬 ESMFold2 advanced protein design – An open scientific engine claimed state-of-the-art results on protein interactions and antibodies, with validated miniprotein binders across five cancer and immunology targets. It shipped with an atlas of 6.8 billion proteins and 1.1 billion predicted structures.
A world model of protein biology emerged purely from language modeling over billions of sequences.
Mechanistic interpretability tools borrowed from LLM research exposed how the model represents biological concepts.
A simple gradient-based search over the model surfaced high-affinity binders.
🧪 AlphaFold’s out-of-distribution blind spots – A widely shared interview argued that structure prediction is not biological inference. The hardest cases sit out of distribution: quantum protein fluctuations, post-translational modifications, and conformational ensembles the training data never captured.
Community take: Khalil Ouardini flagged the podcast as articulating the importance of uncertainty quantification in out-of-domain settings for foundation models, with AlphaFold on proteins subject to quantum fluctuations as a sharp example.
Image, Video & 3D
🖼️ PiD bridges latent models – PiD turns compressed image data into high-resolution pictures faster by decoding and upscaling in a single step instead of two separate ones. It can take 512x512 images to 2048x2048 in under 1 second with good quality.
Cyber
🔧 Build narrow dependencies yourself – Redis’s creator endorsed building narrow-scope dependencies in-house, extending a fork-and-trim argument. The 2026 twist: an LLM can merge security patches and test dependencies for flaws upstream maintainers may ignore. Julien Danjou’s companion essay framed the real question as which dependencies to drop, not whether to build.
Community take: Amine Saboni kicked off the thread asking whether forking dependencies works for smaller teams that cannot afford to rebuild and maintain their own tooling. Pierre Chapuis agreed he always tried to minimize dependencies and pointed at Julien Danjou’s piece. Robert Hommes suggested artifact scanning platforms like Cloudsmith as a middle ground. Amine pushed back: the proposed mitigation was to remove supply chain links, not scan them, so where do you set the threshold? Robert conceded the tradeoff between functionality and security. Pierre later flagged a small HPSv2 PR as a clear case for a rewrite.
🛡️ Multi-agent system found real zero-days – A multi-agent LLM system built on Google’s OSS-Fuzz automated vulnerability discovery with reproducible verification. A control-flow abstraction localized bugs at the right granularity, while dual-layer fuzzing and analysis tools reasoned across functions.
It hit a 90 percent detection rate, finding 36 of 40 bugs at the AIxCC 2025 final.
In the wild it surfaced 29 zero-days across 12 open-source projects, all confirmed and fixed.
Every reported vulnerability was fuzzer-reproducible, cutting the false positives that plague LLM detectors.
Infrastructure
🛡️ Edgee added fallback models – Sacha’s Edgee launched its Fallback Models feature on Product Hunt. When Claude cannot run a request (outage, plan limit, or programmatic credit cap from June 15), the layer routes it automatically to Kimi K2.6, GLM, Qwen, Gemma, or to a Bedrock, Vertex, or Azure account. Works with copilot, codex, and opencode.
Community take: Kevin Kuipers asked how Edgee evaluated routing quality. Sacha Morard said they ran SWE benchmarks to verify no drift from the compressor and planned the same for fallback models, while admitting “human intuition” remains the most common indicator of effectiveness in practice.
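The routing behavior described above follows a common fallback-chain pattern, sketched below. This is not Edgee's code or API: `ProviderUnavailable`, `complete_with_fallback`, and the two stand-in providers are hypothetical names chosen for illustration.

```python
class ProviderUnavailable(Exception):
    """Raised by a provider that cannot serve the request (outage, limit, cap)."""


def complete_with_fallback(prompt, providers):
    """Try each (name, call) pair in order and return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def primary(prompt):
    # Hypothetical stand-in simulating a plan limit on the primary model.
    raise ProviderUnavailable("plan limit reached")


def backup(prompt):
    # Hypothetical stand-in for a fallback model.
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "hello", [("primary", primary), ("backup", backup)]
)
print(used, reply)
```

Returning the name of the provider that actually answered keeps the substitution visible to the caller, which matters for the quality question Kevin raised: drift can only be measured if you know which model served each request.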
⚡ Inference engine prioritized decode speed – An inference preview reached 3,000 output tokens per second per request on eight AMD MI300X GPUs and 2,100 on eight NVIDIA H200s, in FP16 with no speculative decoding. Its claim: agents run sequential loops, so decode speed per request beats aggregate throughput.
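A back-of-the-envelope sketch of why per-request decode speed matters for sequential loops. The two decode rates come from the item above; the step count and tokens per step are assumptions picked only to make the arithmetic concrete, and prefill and tool-call latency are ignored.

```python
def loop_decode_seconds(steps, tokens_per_step, tokens_per_second):
    """Decode time for a sequential agent loop, ignoring prefill and tool latency."""
    return steps * tokens_per_step / tokens_per_second


STEPS = 50             # assumption: number of sequential agent turns
TOKENS_PER_STEP = 600  # assumption: output tokens generated per turn

for label, rate in [("8x MI300X", 3000), ("8x H200", 2100)]:
    seconds = loop_decode_seconds(STEPS, TOKENS_PER_STEP, rate)
    print(f"{label}: {seconds:.1f} s of decode for the whole loop")
```

Because each step waits for the previous one, batching more requests onto the same hardware raises aggregate throughput without shortening this number; only a faster per-request decode rate does.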
Language Models
🤖 Claude Opus 4.8 is out – Anthropic upgraded its flagship at the same price, with sharper agentic judgment and user-controllable effort on claude.ai. The headline change was reliability: the model was roughly four times less likely to let code flaws slip past unflagged.
On the Super-Agent benchmark it was the only model to finish every case end-to-end, allegedly matching GPT-5.5 on cost.
Claude Code gained a dynamic workflows mode aimed at very large-scale problems.
Fast mode claims to run at 2.5 times the speed for three times less cost than prior models.
📱 Quantized diffusion runs on phones – Heavily compressed image models brought diffusion to laptops and phones. The 1-bit variant shrank to 0.93GB and the ternary variant to 1.21GB, both staying competitive on composition and prompt fidelity. A companion iOS app ran generation fully on-device.
MLOps
💡 Fine-tuning versus long-context personalization – A revived community thread argued that adding experts (MLPs) or LoRA adapters is more effective for LLM personalization than feeding 1M tokens per prompt, especially as models get larger. At $0.10 per million tokens, 2000 requests a week at 1M tokens each cost $200, versus roughly $5 for the adapter route, for the same outcome.
Community take: Glenn Sonna endorsed the argument. Kevin Kuipers pushed back that fine-tuning investment can become obsolete when the model is replaced months later (for objective reasons, or just perception), unless the use case is static enough for a LoRA. Jeremie Kalfon countered that internal representations may not change much between Opus 4.6 and 4.8, so the same fine-tuning neurons could potentially be reused, with distillation or per-layer optimization keeping personalization viable across model upgrades. He added that in-context learning is short-term memory and fine-tuning is long-term, and that he was tired of agents that only possess short-term memory. A $195 per week per user spread, he noted, is not limited value.
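The cost comparison above can be checked with a few lines of arithmetic. The price, request volume, and prompt size come from the item; the $5 weekly figure for the fine-tuned route is taken from the thread as given rather than derived here.

```python
PRICE_PER_MILLION_TOKENS = 0.10  # dollars, from the item above
REQUESTS_PER_WEEK = 2000         # from the item above
LONG_CONTEXT_TOKENS = 1_000_000  # tokens per prompt, from the item above
FINE_TUNED_WEEKLY_COST = 5.0     # dollars, the thread's figure, taken as given


def weekly_cost(tokens_per_request, requests, price_per_million):
    """Weekly input-token spend in dollars."""
    return tokens_per_request / 1_000_000 * price_per_million * requests


long_context = weekly_cost(
    LONG_CONTEXT_TOKENS, REQUESTS_PER_WEEK, PRICE_PER_MILLION_TOKENS
)
spread = long_context - FINE_TUNED_WEEKLY_COST
print(f"long-context: ${long_context:.2f} per week")
print(f"spread: ${spread:.2f} per week")
```

This reproduces the $200 long-context figure and the $195 weekly spread Jeremie cited.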
Programming
🕸️ Graph indexing tools for codebases – Two graph-based codebase indexing tools, codegraph and GitNexus, drew attention for high star counts. Both expose pre-indexed knowledge graphs to coding agents like Claude Code, Codex, Cursor, and Gemini, claiming fewer tokens and tool calls than text-based retrieval.
Community take: Antoine Sueur asked whether anyone in the community had real success with these tools, flagging that high star counts had not translated into evidence of positive impact. Youssef Tharwat (currently building Noodlbox - codebase context, built from code) confirmed graph-based indexing does work and said codegraph looked good without having tried either, but noted the analyses on both projects were still shallow. Kevin Kuipers also flagged that Leonard Scheidemantel at Sequa is building something similar.
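For readers unfamiliar with the approach, here is a minimal sketch of what a pre-indexed code graph lets an agent do. It is not how codegraph or GitNexus work internally; it uses only Python's standard `ast` module on a made-up `SOURCE` snippet, and the helper names are hypothetical. The point is that "who calls `load`?" becomes a graph lookup instead of a text search over every file.

```python
import ast
from collections import defaultdict

# Made-up source used only for illustration.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def main():
    print(parse("notes.txt"))
'''


def build_call_graph(source):
    """Map each top-level function name to the set of names it calls directly."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # register the function even if it calls nothing
            for inner in ast.walk(node):
                # Only plain-name calls like load(...); method calls are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return graph


def callers_of(graph, target):
    """Return the functions that call `target` directly, sorted by name."""
    return sorted(name for name, callees in graph.items() if target in callees)


graph = build_call_graph(SOURCE)
print(callers_of(graph, "load"))
print(callers_of(graph, "parse"))
```

A real indexer would also need to resolve methods, imports, and cross-file references, which is where the depth of analysis Youssef mentioned starts to matter.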
🌐 Built-in AI shipped in Chrome – A Google I/O writeup walked through Chrome’s on-device AI APIs, which skip cloud cost and latency while keeping data local. The Summarizer API generated headlines and meta descriptions, and the Prompt API used JSON Schema for tagging and comment moderation.
Community take: Raymond Rutjes saw the on-device APIs as a way to augment features without paying the cost-tax of cloud inference.
Robotics, World AI
🦾 Open robotics simulation rebuilt fast – An open-source simulation platform rebuilt its stack for physical AI with a penetration-free contact solver, unified rigid and deformable physics, and a path-traced renderer. A GPU compiler forked from Taichi gave 10x faster launch and up to 4.6x faster runtime compared with the first release.
🤖 Humanoid robotics dismissed as hype – Arnaud Thiercelin reported that across recent conversations with peers and tier-1 SV VCs, the consistent characterization of humanoid robotics was “useless hype” with no real-world impact, though interest in specialized robotics solutions had risen as a side effect.
Community take: Arnaud argued generic robots will always lose to specialized ones. Humans, he said, are the apex multi-role machine and trying to replicate us is futile, while specialized machines are where the real potential sits. Kevin Kuipers partly agreed but pushed back that humanoids carry the advantage of fitting tools and environments designed for humans (stairs, appliances, vehicles), making the path harder but the reward 100x bigger.
Other topics
📚 Anti-Agent redesigned with new features – Another cool update from Louis Manhes: his Anti-Agent shipped a complete UI redesign, mobile-first throughout. Three learning blocks now live inline in any page: flashcards, Socratic dialogues, and new graded exercises that work for languages or code. Public sharing is enabled, with example curricula on human evolution, beginner French, and Rust.
TTY Lunch
Each week, TTY Lunch brings together exceptional builders around the table. Today’s lineup included some of the best talent in deep tech right now with Aram Adamyan, Ihab Bendidi (Recursion), Julien Duquesne (Scienta), Khalil Ouardini (ex-OWKIN), Victor Schmidt (Entalpic), Victoria Latynina (Sanofi), and of course Willy Braun to drive the insightful discussion.
New members
🇫🇷 Clément Castellon – Co-founder @ Candide · Research lab building infrastructure to make the AI research cycle measurably faster than what currently exists. Research Engineer at Sorbonne, ENS Ulm, and ERC ModERN by day, building Candide with co-founder Arindam Biswas in parallel. ICML 2026 workshop paper on Ramanujan-graph sparse initialization shipping now. Background across physics, math, CS, and digital humanities. Chess, Dune Imperium, and an unending search for music that fuses orchestral and electronic without sounding like a film trailer. Special power: peaked top 90 worldwide on GeoGuessr and does OSINT on the side, give him a photo and he can usually tell you where it was taken. 📍 Paris, France
Contributors This Week
Félix Raimundo, Arnaud Thiercelin, Jeremie Kalfon, Ihab Bendidi, Pierre Chapuis, Quentin Dubois, Amine Saboni, Robert Hommes, Antoine Sueur, Glenn Sonna, Louis Manhes, Sacha Morard, Tejas Chopra, Victoria Latynina, Youssef Tharwat, Benoit Kohler, Clément Castellon, Julien Seveno-Piltant, Khalil Ouardini, Louis Choquel, Maziyar Panahi, Raymond Rutjes







