The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Function-Space Priors for Bayesian Neural ODEs with Application to Vessel Trajectory Prediction

Vessel trajectory prediction from Automatic Identification System (AIS) data is essential for maritime situational awareness, yet it remains challenging due to irregular sampling, missing reports, and complex dynamics. Beyond accurate point forecasts, maritime applications also demand well-calibrated uncertainty estimates for reliable decision-making. Bayesian Neural Ordinary Differential Equations (ODEs) offer a principled framework for continuous-time trajectory modeling with uncertainty quantification by placing a prior over the neural vector field parameters. However, the commonly used is...

Jaeyeong Lee·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading

Reliable rubric grading requires more than accurate score prediction. Each judgement must be grounded in the mark scheme and evidence from the student answer. Existing credit-assignment and intervention methods, primarily designed for self-contained reasoning tasks such as mathematics reasoning, struggle in this setting because they do not identify where grading reasoning goes wrong or how the model's belief about the final mark changes during reasoning. We propose Evidence-Diagnosed Intervention Training (EDIT), a two-phase framework for training more rubric-faithful LLM graders. First, EDIT...

Zhihao Wu·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

"Chi nas dal soch el sent de legn" -- Auditing Text Corpora for Lombard

Several of the world's languages are still under-resourced in terms of Natural Language Processing (NLP) tools. This is mostly due to the lack of high-quality datasets to train, develop, and evaluate systems and models for several tasks, such as Machine Translation (MT). We conduct a manual audit of the parallel and monolingual corpora available for Lombard, an under-resourced language continuum from Italy. Our analysis reveals that the perceived abundance of web-scraped data is an illusion, with massive datasets plagued by severe language misidentification, boilerplate text, and non-linguist...

Edoardo Signoroni·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Performance Evaluation of GraphCast for Medium-Range Weather Forecasting over Brazil

The paradigm of global weather forecasting is rapidly shifting with the emergence of Machine Learning Weather Prediction (MLWP) models. While these data-driven architectures demonstrate remarkable global skill, regional benchmarks in the Global South remain scarce, leaving their efficacy in complex, highly convective environments largely unverified. This study evaluates the performance of GraphCast operational against the deterministic ECMWF IFS HRES as a baseline across four distinct Brazilian climatic sub-regions. Utilizing a scalable, cloud-native pipeline and the WeatherBench-X framework fo...

Wolfgang R. Rowell·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Attack Detection using Time Series Foundation Models

This paper addresses the problem of attack detection in cyber-physical systems without any knowledge of the plant model or its structure. A remotely located plant transmits sensor measurements to an operator over a network that is assumed to be under attack. We consider two classes of attacks: model-free replay attacks and model-based stealthy attacks. For the latter, we derive closed-form expressions for the optimal stealthy attack policy against a $\chi^2$ detector, for both linear and nonlinear systems. We then propose a model-structure-free detector based on TimesFM, a time-series foundation...

Sribalaji C. Anand·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Boosting Brain-to-Image Decoding with TRIBE v2 Data Augmentation

Brain decoding is limited by the availability of labeled neural data, and remains challenging in low-data regimes. To address this issue, we investigate whether and when brain decoding can be boosted by augmenting small fMRI datasets with synthetic data generated by a pretrained model of fMRI responses to stimuli. We use TRIBE v2, a large encoding model pretrained on more than 1000 hours of fMRI responses to video, audio and language. For each dataset, we evaluate systematic grids that show how the performance of image decoders varies with the amount of synthetic data used for training. Our r...

Yohann Benchetrit·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Equivariant Neural Belief Propagation

Probabilistic inference over spatially embedded variables requires beliefs that respect $SE(3)$ symmetry, yet existing equivariant networks produce only scalars and vectors -- not the rank-2 precision tensors needed for anisotropic uncertainty, and single-component messages collapse multi-modal energy landscapes to physically meaningless averages. We introduce Equivariant Neural Belief Propagation (ENBP), a factor-graph framework whose messages are equivariant Gaussian mixture models with sufficient statistics that transform exactly under $SE(3)$. Rank-2 precision matrices are synthesised via...

Zehua Cheng·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Symmetric Divergence and Normalized Similarity: A Unified Topological Framework for Representation Analysis

Topological Data Analysis (TDA) offers a principled, intrinsic lens for comparing neural representations. However, existing paired topological divergences (e.g., RTD) are limited by heuristic asymmetry and, more critically, unbounded scores that depend on sample size, hindering reliable cross-scenario benchmarking. To address these challenges, we develop a unified topological toolkit serving two complementary needs: fine-grained structural diagnosis and robust, standardized evaluation. First, we complete the RTD framework by introducing Symmetric Representation Topology Divergence (SRTD) and ...

Yan Wang·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TokenMizer: Graph-Structured Session Memory for Long-Horizon LLM Context Management

Large language model (LLM) deployments for long-horizon tasks face a fundamental constraint: context windows are finite while productive work sessions are not. When history exceeds the Maximum Effective Context Window (MECW), critical structured information - architectural decisions, task transitions, file histories - is silently discarded. Existing mitigations treat history as flat text, destroying the relational structure that makes sessions resumable. We present TokenMizer, an open-source proxy system that models LLM session history as a typed knowledge graph. The schema defines 14 node ty...

Shweta Mishra·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Bridging Domain Expertise and Generalization for Performance Estimation

Performance estimation under distribution shift aims to predict how a model behaves on an unlabeled test set whose distribution differs from the training data, a scenario that requires reliable indicators that can faithfully reflect model behavior without ground-truth labels. Existing approaches rely solely on the outputs of the given model whose biases are amplified once the distribution shifts, weakening the correlation with the true performance. Motivated by this limitation, we propose Fused Reference Alignment Prediction (FRAP), which leverages the complementary strengths of an external f...

Shuxuan Li·20 days ago

TechCrunch AI· PRESS

Is Silicon Valley ready to put robots in people’s homes? Hello Robot is.

The California startup released the fourth generation of its home assistance robot, Stretch.

Tim Fernholz·20 days ago

The Verge AI· PRESS

TSMC struggles to keep up with AI demand: ‘We can only support so much’

Taiwan Semiconductor Manufacturing Co. - the world's biggest semiconductor-maker - is struggling to meet demands from American customers even with its factory buildout in the US, according to reports from Reuters and Bloomberg. "Customer demand is so high, and we can only support so much," TSMC CEO C.C. Wei said after a shareholder meeting on Thursday, Reuters reports. "We are doing our best to ensure TSMC does not become a bottleneck." The surge in AI use has already put constraints on the memory industry, with the widespread shortage of RAM and NAND Flash memory expected to last for years. ...

Emma Roth·20 days ago

Ars Technica AI· PRESS

How some data center operators are tackling their water use problems

Hyperscalers have come under scrutiny for their impact on water quality and availability.

Molly Taft, wired.com·20 days ago

TechCrunch AI· PRESS

Apple touts $1.4 trillion in App Store billings and sales, 90% without a commission

Apple's App Store generated $1.4 trillion in sales, up from $1.3 trillion last year, with $149 billion in sales for digital goods.

Sarah Perez·20 days ago

The Verge AI· PRESS

Elon Musk is steamrolling Wall Street to become a trillionaire

Today on Decoder, I’m talking to Ryan Mac, a technology reporter at The New York Times and coauthor of the excellent book Character Limit: How Elon Musk Destroyed Twitter, which came out in 2024. I can’t recommend it enough. I wanted to have Ryan on the show because we’re on the cusp of the SpaceX IPO, which promises to be one of the most consequential public offerings in history for a variety of reasons: its biggest-ever size, of course, at nearly $2 trillion, but also because all kinds of rules that keep our markets fair are being bent, if not outright broken, along the way. I also...

Nilay Patel·20 days ago

NVIDIA Dev Blog· INFRA

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents

Single-turn chatbots are evolving into long-running agents that can reason, maintain context, use tools, and run efficiently across many turns to complete complex workflows. However, these multi-agent workflows cause token counts to grow quickly. Agents plan, call tools, invoke sub-agents, receive information, and then pass history, outputs, and reasoning steps back into the model…

Chris Alexiuk·20 days ago

Hugging Face· INFRA

How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent

Hugging Face·20 days ago

The Verge AI· PRESS

Let us filter AI slop, you cowards

It's almost impossible to avoid seeing AI-generated content online, but it doesn't have to be this way. YouTube, Instagram, TikTok, and more have ramped up content authentication efforts over the last year, with many now automatically applying labels to distinguish AI-generated images, videos, and music from those made by real, human creators. That's all very well and good if we're just stumbling across labeled content at random, but you know what would be better? Le...

Jess Weatherbed·20 days ago

Hugging Face· INFRA

EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios

Hugging Face·20 days ago

The Verge AI· PRESS

AI leaders call for tougher protections against AI-aided bioweapons

Some of the AI industry's biggest rivals have put their many, many grievances aside for a common cause: making it harder for people to use their technology to develop biological weapons. In an open letter to US lawmakers, tech leaders are pressing Congress to enact rules closing what they say is an alarming biosecurity gap that could help trigger a global pandemic. Anthropic's Dario Amodei, OpenAI's Sam Altman, and Microsoft's Mustafa Suleyman are among the signatories urging US lawmakers to require companies selling synthetic DNA and RNA - genetic material that can be ordered online and assembl...

Robert Hart·20 days ago

OpenAI· FRONTIER

How Endava is redesigning software delivery around AI agents

Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native culture across the enterprise.

OpenAI·20 days ago

Hugging Face· INFRA

Task-Seeded Synthetic Q&A Generation for Nemotron Pretraining

Hugging Face·20 days ago

MIT Tech Review· PRESS

How courts are coping with a flood of AI-generated lawsuits

Most days in her chambers, Judge Maritza Braswell, a federal magistrate judge in Colorado, sifts through stacks of documents written by people without a lawyer. Many of them can’t afford to hire a lawyer, and others have cases too weak or too small to interest one. She reads each one carefully, mindful of how daunting…

Michelle Kim·20 days ago

Stratechery· ANALYST

An Interview with Microsoft CEO Satya Nadella About Finding Core Competencies

An interview with Microsoft CEO Satya Nadella about figuring out Microsoft's role in AI, the relationship with OpenAI, Capex, Software, and a potential new agentic platform.

Ben Thompson·20 days ago

The Verge AI· PRESS

Amazon develops a warehouse robot workers can speak to

Amazon has announced a new version of its fully autonomous warehouse robot, Proteus, that can interact using language instead of code. The expanded capabilities come as part of a growing pivot toward automation as the e-commerce giant replaces its human workers with robots. Amazon says the AI-powered upgrade means its human employees can assign the robot tasks in the same way they'd communicate with colleagues. Previously, workers would need to use specialized software to direct the flo...

Robert Hart·20 days ago

OpenAI· FRONTIER

Dreaming: Better memory for a more helpful ChatGPT

ChatGPT introduces a new memory system to better remember preferences, keeping context fresh and relevant across conversations.

OpenAI·20 days ago

Latent Space· ANALYST

[AINews] Reve 2 and Ideogram 4: Layouts in Imagegen

a quiet day.

Latent Space·20 days ago

Hugging Face· INFRA

Designing the hf CLI as an agent-optimized way to work with the Hub

Hugging Face·21 days ago

OpenAI· FRONTIER

Biodefense in the Intelligence Age

An action plan for AI-powered biological resilience

OpenAI·21 days ago

TechCrunch AI· PRESS

Lovable signs multi-year deal with Google Cloud to up usage 5x, source says

Lovable and Google signed an expanded multi-year deal that involves a 5x expansion of Lovable's footprint on Google Cloud, and expanded access to Anthropic Claude.

Julie Bort·21 days ago
