Your Dose of Reg.exe, Week 10

AI & Deep Tech roundup: AlphaEarth global mapping, vLLM determinism, OpenAI on hallucinations, new agents & speech models, GPU export risks, coding tools, Raidium’s 3D imaging FM.

Sep 13, 2025

Reg.exe is a global closed community of 260+ engineers, founders, and researchers interested in AI innovation, from San Francisco to Tokyo. Each week, we share the highlights of our discussions in a newsletter. If you’d like to join, write to join@welovesota.com

Events

🇫🇷 Inference & vLLM meetup in Paris - Technical meetup covering scaling inference to multiple GPUs, structured generation, and transformers serve with experts from Exxa, .txt and Hugging Face. 📆 September 15 (6:30PM-10PM CET) 👉 Registration
🇫🇷 AI Hackathon Paris - Weekend Hackathon for building cutting-edge AI projects. 📆 September 27-28 👉 Registration

Knowledge

🌍 AlphaEarth Foundations - Google DeepMind's new AI model integrating petabytes of Earth observation data for unprecedented global mapping. (🙏 Fabien Niel)
- Unified Global Mapping with AI: AlphaEarth Foundations is a new AI model that fuses petabytes of diverse satellite Earth observation data.
- Efficiency and Accuracy Breakthrough: The model summarizes vast spatial data into compact embeddings while achieving significantly higher accuracy and learning efficiency.
- Real-World Impact and Accessibility: Their annual embeddings are publicly released as a Satellite Embedding dataset on Google Earth Engine.

🧠 Deep Tech: Clearing Up Misconceptions - Willy Braun's analysis rethinking capital, timeline, risk, and returns at the frontier of deep tech. (🙏 Willy Braun @ Galion.exe)
- Deep tech is the original heart of venture capital: Investing in scientific and engineering breakthroughs has always driven value creation and national power in various technological waves.
- Common misconceptions about deep tech, debunked: ❌ Deep tech is a recent hype; ❌ Deep tech companies burn too much capital and fail more often; ❌ Deep tech takes too long from revenue to exit; ❌ Returns aren’t there.

Computer Vision

🌅 Finegrain's Flux Kontext LoRA - Finegrain has trained and open-sourced a product placement LoRA for inserting objects into images. (🙏 Pierre Chapuis @ Finegrain)
- 💪 Weights available here
- 🛝 Play with it in the demo space

💥 ComicScene154 Dataset - Carefully annotated dataset for scene-level narrative arcs in comics analysis, combining text and imagery for computational narrative analysis. (🙏 Ivan Yamshchikov @ Pleias)

Language Models

🤯 Breaking LLM Nondeterminism - Deep dive into defeating nondeterminism in LLM inference for reproducible results. (🙏 Kevin Kuipers @ Reg.exe)
- Setting temperature to 0 is not enough to produce deterministic outputs.
- Two culprits: the floating-point non-associativity and the lack of batch invariance in inference kernels.
- The authors defeated nondeterminism by rewriting and patching inference kernels to be batch-invariant.
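
The two culprits above can be illustrated without any GPU. The sketch below is a toy, not the authors' kernels: `sequential_sum` and `chunked_sum` are hypothetical helpers, and the chunk size stands in for how a reduction might be split differently depending on batch size. It uses explicit loops rather than the built-in `sum`, which may apply compensated summation on recent Python versions.

```python
# Toy illustration (not the authors' code): floating-point addition is not
# associative, so the way a reduction is split changes the rounded result.

def sequential_sum(values):
    """Add values left to right with plain float addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunk_size):
    """Sum each chunk first, then sum the partial results.

    chunk_size is a stand-in for a kernel choosing a different
    reduction strategy for a different batch size (an assumption
    made for illustration only).
    """
    partials = [
        sequential_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return sequential_sum(partials)


# 1) Non-associativity: same three numbers, different grouping.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)

# 2) Same data, different split: the small terms are absorbed or kept
#    depending on where the chunk boundaries fall.
values = [1.0, 1e16, -1e16, 1.0]
print(chunked_sum(values, 2), chunked_sum(values, 4))
```

A batch-invariant kernel, in this picture, is one that always uses the same split for a given row regardless of what else is in the batch.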

💊 Why Language Models Hallucinate - OpenAI paper framing hallucinations as a feature, not a bug: they occur because models are rewarded for guessing over admitting uncertainty. (🙏 Jules Belveze @ Dust)
- Hallucinations come from incentives, not architecture. Models are trained and benchmarked in ways that reward confident guessing.
- Benchmarks are the root cause. Most evaluations score an "I don't know" no better than a wrong answer, so guessing maximizes the expected score and pushes models toward overconfidence rather than honesty.
- The fix is incentive alignment, not new tech. Adjust benchmarks to reward calibrated answers and penalize overconfident mistakes.
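
The incentive argument above comes down to simple expected-value arithmetic. Here is a minimal sketch, assuming a scoring scheme with 1 point for a correct answer, 0 for abstaining, and a configurable penalty for a wrong answer; the function names and the penalty values are illustrative, not taken from the paper.

```python
# Illustrative scoring model (assumed, not the paper's exact scheme):
# correct = +1, abstain = 0, wrong = -wrong_penalty.

ABSTAIN_SCORE = 0.0


def expected_score(p_correct, wrong_penalty):
    """Expected score of answering when the model is right with probability p_correct."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_action(p_correct, wrong_penalty):
    """Pick the action with the higher expected score."""
    if expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "guess"
    return "abstain"


# A model that is only 20% sure of its answer:
print(best_action(0.2, wrong_penalty=0.0))  # binary grading, no penalty
print(best_action(0.2, wrong_penalty=1.0))  # wrong answers cost a point
```

With no penalty, guessing beats abstaining at any nonzero confidence; once wrong answers cost something, abstaining becomes the better choice below a break-even confidence of `wrong_penalty / (1 + wrong_penalty)`.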

🧠 Nous Research Hermes 4 - Reasoning-focused models based on Llama 3.1, including a 405B version, known for breaking alignment and for strong RefusalBench performance. (🙏 Pierre Chapuis)

Autonomous Agents

⚖️ Evaluatorq, Prompt Evaluator - Orq.ai has released this open-source framework for doing prompt evaluation the right way.
- ✅ Run multiple evaluation jobs concurrently with automatic error handling
- ✅ Type-safe from data ingestion to result scoring
- ✅ Optional Orq platform integration for visualization & historical comparisons
- 👉 Repository available here

🤖 Open-SWE by LangChain - An open-source, cloud-based asynchronous coding agent built using LangGraph, to autonomously manage complex software development workflows (🙏 Taekmin Kim @ Bg.app)
- 🛟 Taekmin is seeking feedback from Open-SWE users. If you have used it, 👉 DM him!

Audio and Speech

🔥 PyannoteAI Precision-2 - New flagship speaker diarization model with improved accuracy, speaker counting, timestamps, and confidence scores. (🙏 Hervé Bredin @ PyannoteAI)

🗣️ AssemblyAI’s Universal - Production-ready speech-to-text served through a simple managed API endpoint that you can invoke in just 5 lines of code.
The latest update also supports 99 languages, with automatic language detection for all supported languages, and speaker diarization for 95 of them. (🙏 Kevin Kuipers)
- 🛝 Play with it here.

Also:

🔔 WaveBlender Physics Engine - A mind-blowing engine that generates sound from physical laws and material properties rather than AI.

Cyber

🔬 Cipher Attacks in LLMs - Introduction of CiFR benchmark for testing safeguards against cipher-enabled harmful fine-tunes with 99%+ detection rates. (🙏 Kevin Kuipers)

🔬 Security risks with LLMs - Paper revealing how attackers can hide instructions in emails or calendar invites that assistants might execute when reading them. (🙏 Thibaut Bayer @ SensCritique)

Also:

🐤 DuckDB npm Packages Compromised - Popular packages compromised by the same attackers that hit the debug and chalk packages.
💣 Meta WhatsApp Security Failures - A lawsuit by WhatsApp's former security boss alleges that 1,500 engineers had unrestricted access to user data without audit trails.

Infrastructure

🇺🇸 U.S. GAIN AI Act Amendment - The proposed U.S. GAIN AI Act Amendment (S.Amdt.3505) would block exports of advanced GPUs (>4,800 TPP) to Europe. This would restrict access to chips like NVIDIA’s H100, A100, and future generations, effectively pushing European players to rely on U.S.-based cloud providers. (🙏 Christophe Lesur @ CloudTemple)
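
For readers unfamiliar with the TPP figure quoted above, here is a rough sketch of the arithmetic. It assumes the U.S. export-control definition of Total Processing Performance (2 × MAC throughput in TOPS × bit length of the operation), which reduces to the vendor's dense TFLOPS figure times the bit length, since vendors already count a multiply-accumulate as two operations. The throughput numbers are approximate published dense FP16 figures and should be checked against vendor datasheets.

```python
# Rough TPP estimate. Assumptions: TPP = dense TFLOPS x bit length, and the
# TFLOPS values below are approximate vendor figures, not authoritative data.

TPP_THRESHOLD = 4800  # threshold quoted in the item above


def tpp(dense_tflops, bit_length):
    """Total Processing Performance from a dense throughput figure."""
    return dense_tflops * bit_length


# Approximate dense FP16 tensor throughput in TFLOPS (assumed values).
chips = {
    "NVIDIA A100": 312,
    "NVIDIA H100 SXM": 989,
}

for name, tflops in chips.items():
    score = tpp(tflops, bit_length=16)
    status = "above" if score > TPP_THRESHOLD else "at or below"
    print(f"{name}: TPP ~ {score} ({status} the {TPP_THRESHOLD} threshold)")
```

Under these assumptions even the older A100 lands just above the threshold, which is consistent with the item's claim that both A100 and H100 would be affected.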

Programming

👩‍💻 AI Coding at a Crossroads - Thoughts on the reshaping of AI coding, from the overcrowded ~~code editor~~ VS Code–fork market to (controversial) benchmarks. (🙏 Kevin Kuipers @ Reg.exe)
- Lightweight IDEs are popping up each month in a still small but crowded market. The vast majority are either VS Code forks or VS Code extensions, making it hard to stand out.
- SWE benchmarks remain imperfect, though the research literature on improving them offers valuable (unexploited) insights, especially when it comes to mitigating overfitting.

🤖 Cocode Open-Sourced - Tool for updating docs and generating release changelogs, turning 4-hour documentation updates into 30-second commands. (🙏 Louis Choquel @ Pipelex)
- 📹 Demo in video

Also:

Claude Code Architecture Analysis - Deep dive into the master agent loop design, revealing simplified architectures using regex over embeddings for better performance.
Vercel's Vibe Coding Platform - Open-source platform built with AI SDK, featuring a GPT-5 agent loop.
Zed ACP Integration - Claude Code now available in Zed editor through Agent Client Protocol.
Sonoma Models - New stealth models (Dusk Alpha and Sky Alpha) available with a 2M context window; speculation suggests xAI involvement.

Health

🩻 Raidium's Curia Foundation Model - Raidium released a SoTA foundation model for 3D imaging in precision radiology, trained on 200M+ CT & MRI slices. (🙏 Willy Braun)
- Trained on the entire cross-sectional imaging output of a major hospital.
- Outperforms or matches radiologists and recent foundation models on a 19-task benchmark.
- Brings a deep, transferable understanding of complex anatomy, unlocking applications across clinical practice and research.

New Members in The Community

Gilles Walbrou (DataDome) - CTO at DataDome (AI-Powered Fraud Protection), 12+ years at Dassault Systemes building 3D experiences and SaaS foundations.
Guillaume Daix (Datadog) - Senior Engineering Manager at Datadog, 5+ years in startups, currently working on internal Datadog usage.
Gabriel Duciel (Arcade AI) - AI Engineer at Arcade (Figma + Cursor for 3D Games), created multiple MRR-generating side projects.
Taekmin Kim (Bg.app) - Building autonomous coding agents running in background, previously ML Engineer at Meta and TikTok focused on GPU optimization.
Julien Kilo (Cooking some AI) - Co-Founder & CTO building an AI RAG company, 10 years in startup ecosystem, passionate about web and AI.
Jean du Terrail (.Omics) - Founding AI scientist at .Omics building foundation models on plant genomics.

Article originally posted on WeLoveSota.com
