The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

LESS Is More: Mutual-Stability Sampling for Diffusion Language Models

Diffusion large language models (dLLMs) offer a promising alternative to autoregressive decoding by iteratively refining masked sequences, enabling parallel token updates and bidirectional conditioning. Their practical efficiency, however, is limited by sampling procedures that execute a fixed number of reverse denoising steps selected before decoding, spending computation on already-stable positions and sometimes committing unstable ones too early. We present LESS, a training-free, model-agnostic adaptive sampler that treats token commitment as an online stopping problem. LE...

Amr Mohamed·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Speaking the Language of Science: Toward a General-Purpose Generative Foundation Model for the Natural Sciences

In this report, we present LOGOS (Language Of Generative Objects in Science), a scientific generative language model that unifies heterogeneous tasks across the natural sciences within a single autoregressive framework based on a shared scientific grammar. It encodes diverse scientific objects and their spatial interactions as token sequences over a common vocabulary. By representing spatial contact and constraint patterns as discrete tokens, the model captures complex structural interactions in a purely sequential manner, without relying on explicit coordinates or geometric neural networks. ...

Mingyang Li·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Binary Tracking for Spatial QA and Navigation with Open Vision-Language Models

This work addresses spatial question answering for service robots traversing long egocentric routes. Given a query such as "where can I find a dry cleaner on the way back home?", the system returns a metric coordinate that downstream navigation components can act on. Prior Spatial Question Answering approaches leverage retrieval-augmented agents built on closed-source models such as GPT-4o for path exploration. However, robots operating in the real world often cannot reliably depend on online closed-source models due to network instability, communication latency, and deployment cost. It creat...

Dongbin Na·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Factorized Neural Operators Decompose Dynamic and Persistent Responses

Physical systems often exhibit heterogeneous mechanisms, where rapidly evolving dynamics coexist with persistent structures. Capturing such multiscale physical behavior remains challenging for existing neural operators, which typically rely on a single dominant inductive bias and therefore couple distinct physical responses into a shared representation. We introduce the Unified Green's Function Framework across domains and propose the Factorized Neural Operators (FaNO), which decompose spectral representations into equivariant dynamic responses and invariant persistent responses, leading to bet...

Hao Tang·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Fantastic Pretraining Optimizers and Where to Find Them II: Hyperball Optimization

Matrix-based optimizers such as Muon can substantially speed up language model pretraining, but their gains over AdamW are observed to shrink as model size and data scale grow when using standard constant decoupled weight decay. We propose Hyperball, a simple optimizer wrapper that addresses this issue. Given a base optimizer such as Adam or Muon, Hyperball sets the Frobenius norms of weight matrices and their corresponding optimizer updates to fixed constants. On Qwen3-style models up to 1.2B parameters, Muon Hyperball achieves 20-30% token-equivalent speedup over weight decay baselines. Hy...

Kaiyue Wen·7 days ago
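
The excerpt above says only that Hyperball fixes the Frobenius norms of the weight matrices and of the optimizer updates. As a rough illustration of that one sentence, here is a minimal sketch in plain Python. The function names, the order of operations (rescale the update, take the step, rescale the weights), and the `lr` parameter are assumptions made for this example, not details from the paper.

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def rescale_to_norm(matrix, target_norm, eps=1e-12):
    """Scale a matrix so its Frobenius norm equals target_norm."""
    scale = target_norm / max(frobenius_norm(matrix), eps)
    return [[x * scale for x in row] for row in matrix]


def hyperball_style_step(weights, base_update, weight_norm, update_norm, lr):
    """One hypothetical wrapper step around a base optimizer's update.

    Assumed order (not taken from the paper):
      1. rescale the base update to a fixed Frobenius norm,
      2. apply it with learning rate lr,
      3. rescale the weights back to a fixed Frobenius norm.
    """
    update = rescale_to_norm(base_update, update_norm)
    stepped = [
        [w - lr * u for w, u in zip(w_row, u_row)]
        for w_row, u_row in zip(weights, update)
    ]
    return rescale_to_norm(stepped, weight_norm)


if __name__ == "__main__":
    w = [[3.0, 4.0], [0.0, 0.0]]   # Frobenius norm 5
    u = [[0.0, 0.0], [0.0, 2.0]]   # stand-in for an Adam or Muon update
    w_next = hyperball_style_step(w, u, weight_norm=5.0, update_norm=1.0, lr=0.5)
    print(frobenius_norm(w_next))
```

In this sketch the weight norm never drifts, so no weight-decay term appears; whether the actual method works this way should be checked against the paper.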

arXiv (cs.AI/CL/LG)· ACADEMIA

Semantic Flip: Synthetic OOD Generation for Robust Refusal in Embodied Question Answering and Spatial Localization

Detecting unanswerable user queries remains essential for the reliable deployment of real-world embodied agents. However, modern vision-language models (VLMs) often generate overly confident answers even when the available visual memory cannot support the query. Such overconfidence poses various task-dependent risks. In Embodied Question Answering, the agent may provide misleading information to the user; in spatial reasoning for navigation, it may select an arbitrary coordinate and physically guide the user there. Despite these high stakes, only a few prior studies directly address when and how an...

Dongbin Na·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Contrastive-Difference CKA Reveals Concept-Specific Structural Alignment Across Language Model Architectures

Do different LLM architectures encode high-level concepts in structurally compatible ways? We systematically characterize a geometric-functional universality dissociation: across multiple concept domains and architectural families, moderate geometric convergence coexists with near-perfect functional transfer. Using contrastive-difference CKA (CKA_Delta), a training-free diagnostic that computes kernel alignment on per-sample contrastive differences, we isolate concept-specific convergence from generic similarity, achieving significant discrimination where standard CKA cannot. The dissociati...

Xueping Gao·7 days ago
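
The excerpt above describes CKA_Delta as kernel alignment computed on per-sample contrastive differences. The sketch below illustrates that idea with standard linear CKA in plain Python. The choice of a linear kernel, the way contrast pairs are formed (a "positive" and a "negative" representation per sample), and all function names are assumptions for this example; the paper's exact definition may differ.

```python
import math


def gram(rows):
    """Linear-kernel Gram matrix: pairwise dot products between samples."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def center(K):
    """Double-center a square Gram matrix (subtract row and column means)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + total for j in range(n)]
        for i in range(n)
    ]


def hsic(K, L):
    """Unnormalized HSIC: elementwise inner product of centered Gram matrices."""
    return sum(k * l for k_row, l_row in zip(K, L) for k, l in zip(k_row, l_row))


def linear_cka(X, Y):
    """Linear CKA between two sets of per-sample representations."""
    K, L = center(gram(X)), center(gram(Y))
    return hsic(K, L) / math.sqrt(hsic(K, K) * hsic(L, L))


def contrastive_differences(pos, neg):
    """Per-sample difference between paired representations (assumed pairing)."""
    return [[p - q for p, q in zip(p_row, n_row)] for p_row, n_row in zip(pos, neg)]


def cka_delta(pos_a, neg_a, pos_b, neg_b):
    """CKA on contrastive differences from two models, A and B."""
    return linear_cka(
        contrastive_differences(pos_a, neg_a),
        contrastive_differences(pos_b, neg_b),
    )
```

Because linear CKA is invariant to isotropic rescaling, two models whose contrastive differences differ only by a constant factor score 1 under this sketch, even if their raw representations share a large common component.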

TechCrunch AI· PRESS

Cybersecurity vets protest ‘dangerous’ US government ban on Anthropic’s most powerful models

A group made up of dozens of cybersecurity experts urged the White House to remove export control restrictions on Anthropic’s models Fable and Mythos, arguing that the order is going to limit the ability of cybersecurity defenders to secure their software and products.

Lorenzo Franceschi-Bicchierai·7 days ago

Google AI (Gemma)· FRONTIER

We’re strengthening our presence in Alabama through new investments and community support.

Google has announced a $1.5 billion investment for 2026 and 2027 to expand its data center campus in Jackson County, Alabama. Operating since 2019 on a repurposed former…

Google AI (Gemma)·7 days ago

Simon Willison· ANALYST

"They screwed us": Personality clashes sent Anthropic's models offline

Lots of "source familiar with the administration's thinking" and "source close to Anthropic" in this Axios piece, which is the best collection of behind-the-scenes gossip I've seen about the US government export control Mythos/Fable story so far. Logan Graham, Dave Orr and blog favorite Nicholas Carlini are supposedly meeting with the Commerce Department today in D.C. Good luck to them! This closing note doesn't give me much optimism that we'll be getting Fable back any time soon: The bottom line: One option is to m...

Simon Willison·7 days ago

TechCrunch AI· PRESS

Salesforce acquires AI customer service platform Fin for $3.6 billion

Salesforce says it wants to use Fin's team and technology to improve Agentforce, its existing enterprise platform that businesses can use to build custom AI agents that automate tasks.

Amanda Silberling·7 days ago

The Verge AI· PRESS

Skydio CEO Adam Bry on why Silicon Valley shouldn’t draw red lines for drone use

Today, I’m talking with Adam Bry, who is CEO of Skydio, the leading US maker of autonomous drones. Before we recorded this episode, I actually got to remotely operate one of Skydio’s drones in the Bay Area from Adam’s laptop in our podcast studio in New York and fly an indoor drone around our office. You can check out the full video of that on our YouTube channel. Beyond flying drones around the country, Adam and I talked about why Skydio is so focused on the enterprise market — I asked him a lot about working with police and military, but you’ll hear him say a lot of Skydio’s customers are u...

Nilay Patel·7 days ago

TechCrunch AI· PRESS

Sarvam becomes India’s newest AI unicorn with $234 million funding round led by HCLTech

Indian IT services company HCLTech is investing $150 million in the Bengaluru startup.

Jagmeet Singh·7 days ago

TechCrunch AI· PRESS

As AI agents become employees, NewCore emerges with $66M to give them identities

NewCore argues the next challenge in enterprise security will be managing AI agents, not people.

Jagmeet Singh·7 days ago

NVIDIA Dev Blog· INFRA

Pretrained to Imagine, Fine-Tuned to Act: The Rise of World-Action Models

Quick glossary for readers new to VLA/WAM terminology. VLA (Vision-Language-Action model): a robot policy that starts from a pretrained VLM backbone and adapts it to generate actions from visual observations and language instructions. Large-scale VLM pretraining is a core part of the recipe. See Pi-0 and GR00T N1. WAM (World-Action Model): a policy that starts from a pretrained world-model or video…

Moritz Reuss·7 days ago

TechCrunch AI· PRESS

A satellite just learned to find things on its own — here’s what that means

In April, for the first time ever, an Earth observation satellite found what it was looking for, all on its own.

Tim Fernholz·7 days ago

Import AI· ANALYST

Import AI 461: "Alignment is not on track"; FrontierCode; and synthetic research interns

Where are your agents right now?

Jack Clark·7 days ago

Stratechery· ANALYST

Anthropic’s Safety Superpower

Anthropic's belief in its own commitment to safety gives the company license to aggressively favor its business and even challenge the U.S. government.

Ben Thompson·8 days ago

TechCrunch AI· PRESS

The AI layoff wave is becoming a powder keg

What makes this combustible: at the very moment that tens of thousands of workers are being shown the door, a small cohort of AI insiders is becoming wealthy on a scale that's hard to comprehend.

Connie Loizos·8 days ago

Simon Willison· ANALYST

Quoting Julia Evans

[...] Instead, I picture a specific person and I just write for them. Often this person is "me, but 3 years ago" or a good friend. — Julia Evans, write for 1 person

Simon Willison·8 days ago

Cohere· FRONTIER

Cohere triples UK footprint with new London office to support R&D growth

The new office places Cohere at the centre of London’s AI growth story with a growing global research hub based out of the city.

Cohere·8 days ago

Cohere· FRONTIER

Modernizing FOI Systems with AI Agents

Modernize FOI request systems with AI. Learn how AI agents can help public-sector teams reduce manual work, improve responsiveness, and preserve oversight.

Cohere·8 days ago

Simon Willison· ANALYST

Why AI hasn’t replaced software engineers, and won’t

Arvind Narayanan and Sayash Kapoor take on the question of AI job losses through the lens of a profession that is uniquely suited to AI disruption: software engineering. In this essay, we argue that there is enough evidence to reject the narrative that once AI capabilities reach a certain threshold, it will cause mass layoffs. Given that this is true even in a sector with very few regulatory barriers, most other professions are likely to be even more cushioned. The first good news is that the data still doesn't support the idea that AI is ...

Simon Willison·8 days ago

The Verge AI· PRESS

China may have accessed Mythos

According to a new report from Semafor, the White House's decision to impose export restrictions on Anthropic's Mythos was driven in part by fears that it had been accessed by a group linked to China. If the Chinese government actually had access to Mythos 5 or Fable 5, it would present a serious national security risk. The government could also attempt to reverse engineer the model through distillation, a method in which a "student" AI is trained on a more advanced model to replicate its behavior. The White House has not confirmed this report, and a post on X by Trump advisor David Sacks did...

Terrence O’Brien·8 days ago

OpenAI· FRONTIER

Introducing the OpenAI Partner Network

OpenAI launches the Partner Network, investing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation.

OpenAI·8 days ago

TechCrunch AI· PRESS

As AI companies race to go public, who else is along for the ride?

Startups are trying to "ride that SpaceX IPO wave."

Anthony Ha·8 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TrustedARI: Towards Trust-Native Agentic Routing Infrastructure for Agentic AI

AI agents increasingly access external models, tools, and services through Agentic Routing Infrastructure (ARI) to manage the overhead of heterogeneous interfaces and fragmented subscriptions. Yet, the architecture of ARI introduces fundamental trust risks: it obtains plaintext access to agent queries and service responses, while leaving agents unable to verify that their queries are routed to intended service providers or that requests and responses remain untampered. To address this problem, we present TrustedARI, the first trust-native agentic routing infrastructure for agentic AI. Archite...

Qi Li·8 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Truth Stays in the Family: Enhancing Contextual Grounding via Inherited Truthful Heads in Model Lineages

Recent advances in large language models (LLMs) have produced many specialized multimodal LLMs (MLLMs) that share common foundational LLMs, forming distinct model lineages. It remains unclear whether a fundamental behavioral link exists between the foundational LLMs and downstream variants. We investigate this question by quantifying head-level context-truthfulness scores. Across diverse LLM and MLLM lineages, including Vicuna-, Qwen2.5-, LLaMA2-, and Mistral-based models, we find that Truth Scores are strongly preserved within model families, even after instruction tuning or multimodal adapt...

Miso Choi·8 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SACE: Concept Erasure at the Semantic Singularity in Visual Autoregressive Models

The rapid progress of visual autoregressive (VAR) models has unlocked a transformative frontier for high-fidelity text-to-image synthesis, while heightening concerns over the safety alignment of generated content. Naive application of existing erasure techniques to VAR models causes catastrophic semantic collapse and visual artifacts, since they are predominantly designed for the homogeneous denoising steps of diffusion models. To address this foundational challenge, we first propose the Semantic Singularity Axiom, which posits that any target semantic concept embedded within a prompt is defi...

Siya Yang·8 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On Defining Erasure Harms for NLP

The deployment of NLP systems has raised concerns about harms they might produce, including representational harms. Recent literature has begun to conceptualize and measure one such harm, the harm of erasure. Nevertheless, the field lacks a clear and cohesive conceptual foundation for identifying and measuring erasure. Existing conceptualizations of erasure are often broad, making it difficult to identify what is needed to establish and measure erasure, or else specific to particular settings, facilitating measurement for those settings but potentially challenging to adapt to other sett...

Yu Lu Liu·8 days ago
