TTY-changelog #049
Sakana shipped Marlin and the Fugu orchestrator, OpenAI previewed GPT-5.6 Sol and a custom inference chip, world models advanced, and Anthropic kept poaching Google talent.
👉 Article originally posted on TTY
Events
🇫🇷 ContextCon MCP conference in Paris (December 1-2) – International enterprise summit on MCP and context engineering across six dedicated tracks. 🎤 Apply to speak
🇺🇸 Computer Use Agents Hackathon in San Francisco (July 11) – A hackathon for agent builders, hosted by H Company in partnership with NVIDIA and Anthropic.
🇫🇷 GenerationAI conference in Paris (December 1-2) – Builder focused GenAI and agent experience conference with 150 plus speakers.
Autonomous Agents
🔭 Sakana Marlin autonomous research assistant – Sakana launched its first commercial product, an autonomous research assistant pitched as a virtual chief strategy officer. Given a topic, it can work unattended for up to eight hours, forming hypotheses, gathering and verifying information, and mapping causal relationships into a roughly hundred page strategy report with executive summary slides.
Community take: Victoire Cachoux ran a full test on the maturity of AI virtual cell models in drug discovery, letting Marlin set its own research plan and generate a report plus slides for 10,000 yen (about 55 euros; access is gated to Japanese payment cards). Ihab Bendidi, who works in the field, found the plan genuinely interesting and even agreed with how Marlin widened the definition of virtual cell models beyond the current hype. He raved about slide graphics he cannot get close to in Claude or ChatGPT, noted it cited his own lab’s papers, and said the information was accurate on public knowledge while missing the unwritten details practitioners hold implicitly. His main critique was twofold: model selection leaned on online presence rather than true state of the art, so it skipped strong work that shipped quietly, and it lagged a few months to a year on research focused content while staying very up to date on commercial topics. His verdict: “If it covers a field I know nothing about, I would gladly pay the price. If it is a field I already know, I would not.”
🌐 Qwen-AgentWorld language world model – Alibaba’s Qwen introduced a native language world model simulating seven agent environments, from MCP and terminal to web, OS, and Android, in a single model with environment modeling as the day one objective. It claimed to beat Opus 4.8 and GPT-5.4 on AgentWorldBench.
Biotech, Health, and Chemistry
🧬 John Jumper leaves DeepMind for Anthropic – The Nobel chemistry laureate and AlphaFold co-creator left Google DeepMind after nearly nine years to join Anthropic. The news landed one day after a Gemini co-lead departed for OpenAI. The hire deepened Anthropic’s life sciences push.
Image, Video & 3D
🎬 ByteDance Seedance 2.5 video model – ByteDance unveiled an upgraded Seedance 2.0 and Seedance 2.5. The headline 2.5 generates 30 second clips in one pass with native 4K, up to 50 reference inputs, and 3D white model support, while the upgraded 2.0 also gained native 4K. It launched alongside a licensed film IP platform with revenue sharing.
Community take w/ Nancy Wang: “It does look amazing. The cost is forecasted to be quite high though, anywhere from 5 to 15 dollars per video. Video generation is still one of these areas where I am willing to suffer some quality loss from open source models because of cost. At 15 dollars per video, it gets hard to iterate.”
Cyber
🛡️ GPT-5.5-Cyber tops CyberGym benchmark – OpenAI released GPT-5.5-Cyber, claiming state of the art on CyberGym and pitching it as a defensive tool built with the US government and the security ecosystem. It also pointed to Patch The Planet and Codex Security as ways to solve security problems, not just find them.
Infrastructure
🌶️ OpenAI Broadcom Jalapeño inference chip – OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first inference chip built from scratch for large model serving. Early tests showed performance per watt well above current hardware. The chip was designed to tape out in nine months, with gigawatt scale deployment from late 2026.
Language Models
🐡 Sakana Fugu multi-agent orchestrator model – Sakana introduced Fugu, an orchestrator model that calls a pool of other models to build agentic scaffolds on the fly. It claimed state of the art against public models on SWE-Bench Pro and Terminal Bench, shipping as Fugu and the higher quality Fugu Ultra.
Fugu Ultra was pitched as matching Fable and Mythos while sidestepping export control risk, with a 500 user beta on automated data science and security tasks.
A critique argued the base model is essentially a router that lost ground to Opus on SWE-Bench Pro, with no reported output token counts or cost.
☀️ OpenAI previews GPT-5.6 Sol model – OpenAI began a limited preview of the GPT-5.6 family (Sol, Terra, and Luna), with gains in coding, biology, and cybersecurity plus a new max reasoning effort and an ultra multi agent mode. Access started small at the government’s request before a wider rollout.
📱 Liquid LFM2.5-230M runs on phones – Liquid AI introduced its smallest model yet at 230M parameters, built to run fast on CPUs, NPUs, and GPUs for agentic tasks on phones, robots, and home devices. Trained on 19T tokens with a 32K context, it often beat models more than twice its size on tool use.
Programming
🌊 Poolside Laguna XS.2 coding models – Poolside detailed its first Laguna models, the M.1 flagship and the smaller open weight XS.2 under Apache 2.0, both agentic coding models built for long horizon work. They now serve at 256K context and are free for a limited time via the API and OpenRouter.
⚔️ GLM-5.2 versus Claude Opus benchmark – A head to head pitted open weight GLM-5.2 against Claude Opus 4.8 on a one shot WebGL platformer build. Opus finished faster with a cleaner game and could check its own visuals, while GLM-5.2 cost a fraction and stays available as a downloadable open model.
🌉 Skybridge framework for MCP apps – Skybridge is an open source, full stack React framework for building MCP apps that run the same across assistants like Claude and ChatGPT. This entry points to its Product Hunt page.
Robotic, World AI
🤖 Air powered 3D printed robot – A YouTube deep dive into microfluidic soft robotics walked through building a fully 3D printed walking robot whose control system runs on air rather than wires, starting from a transparent printed chip that behaves like a circuit board for air.
Other topics
💡 Why papers should become data graphs – An essay by community member Jeremy Kalfon argued the scientific paper should stop being the atomic unit of trust, since it bundles data, code, models, tests, and proofs that fail in different ways. It proposed treating papers as human readable views over a graph of data, tools, results, and certificates.
📑 Mistral OCR 4 structures documents – Mistral released OCR 4, which structures documents with bounding boxes, block classification, and per region confidence scores across 170 languages. In blind tests on 600 plus real world documents, it was preferred over every other system tested, and it topped OlmOCRBench.
🏃 More Google researchers join Anthropic – Two more senior researchers who worked on Gemini left Google for Anthropic, extending a run of departures that already included a Gemini co-lead heading to OpenAI and a Nobel laureate joining Anthropic.
Contributors This Week
Victoire Cachoux, Fabien Niel, Julien Seveno-Piltant, Ihab Bendidi, Nancy Wang, Gabriel Olympie, Glenn Sonna, Charly Poly, Félix Raimundo, Quentin Dubois, Robert Hommes, Arnaud Thiercelin, Jules Belveze, Justin Halsall, Pierre Chapuis, Quentin Churet, Raymond Rutjes






