The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Claude OpenAI Anthropic Gemini Mistral Cursor

The algebra of Krom logic programs

This paper investigates the algebraic structure of Krom logic programs, consisting only of facts and rules with at most one body atom. We show that sequential composition endows the class of Krom programs with a natural monoid structure and that this structure admits rich algebraic extensions to Krom seminearrings, Krom quemirings, Krom-Conway seminearrings, and Krom-Conway omegaseminearrings. Furthermore, we establish explicit generating sets and canonical decompositions, study the associated ${}^ω$-operator, characterize the Kleene star in graph-theoretic terms, and relate finite Krom monoi...

Christian Antić·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

How to Score Experts for One-Shot MoE Expert Pruning: A Unified Formulation and Selection Principle

Mixture-of-Experts (MoE) language models reduce per-token computation through sparse expert activation, yet deployment still requires storing the full expert pool, making one-shot expert pruning a practical approach for reducing memory usage. Although effective, existing criteria are largely heuristic, and no single criterion is universally optimal. Thus, establishing a principle for selecting pruning criteria suited to different deployment objectives remains an important yet largely underexplored problem in one-shot expert pruning. To this end, we introduce a unified formulation for one-shot...

Zongfang Liu·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond English: Uncovering the Multilingual Gap in Vision-Language-Action Models

Vision-Language-Action models have recently demonstrated promising capabilities in learning generalist robot policies from large-scale multimodal data. However, most existing VLA systems are trained and evaluated primarily with English instructions, leaving their ability to understand and execute instructions in other languages largely unexplored. While the underlying large language models often possess multilingual capabilities, it remains unclear whether these multilingual capabilities transfer to VLAs during training. In this work, we present the first systematic study of multilingual inst...

Hanyang Chen·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Odds Law: The Decomposition Algebra On How Intelligence Organizes Itself to Solve Difficult Problems Reliably

We ask a structural question: given unreliable elementary problem-solvers, what organizations of them solve hard problems reliably, and what are the limits? We develop a $decomposition~algebra$: elementary solvers are morphisms in a stochastic category, and four combinators (sequential composition, parallel ensembling, verification gating, and recursive reduction) generate the space of compound solvers. We equip this algebra with two homomorphisms, a $reliability$ valuation into the ordered monoid $([0,1],\le)$ and a $cost$ valuation into a commutative semiring, and we derive the composition ...

Hidayet Aksu·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AI-Driven Framework for Adaptive Water Network Management with Proof-of-Concept Implementation: Addressing Non-Revenue Water in Jordan

Jordan faces severe water scarcity with 50\% of water produced is lost to leakage, theft and metering issues also known as non-revenue water (NRW). Traditional reactive approaches have proven insufficient for sustained NRW reduction. This paper proposes an intelligent framework integrating EPANET hydraulic modeling, digital twin technology, SCADA systems, and large language model (LLM)-based AI agents for continuous network monitoring and adaptive decision-making. The system combines real-time data streams with physics-based simulation to detect anomalies, employing retrieval-augmented genera...

Mohammed Fasha·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Schattor: Schatten-family methods for deep learning optimization

Modern deep learning optimization features heterogeneous parameter structures, noisy gradients, and highly nonconvex landscapes, posing significant challenges for both algorithm design and theoretical analysis. Motivated by the limitations of SGD and the success of adaptive optimizers, we propose {\it Schattor}, a family of adaptive first-order methods based on Schatten norms. Schattor unifies SGD and the recently proposed matrix-variate adaptive optimizer Muon within a single Schatten-norm-based framework. We establish dimension-free stationarity guarantees for methods in the Schattor family...

Bohao Ma·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Robust Transformer-Based One-Step Stock Index Forecasting via Shifted Data Augmentation

Transformers have shown remarkable success in sequence modeling, yet their direct application to financial time series remains challenging due to noisy signals, short-memory dynamics, and distributional shifts. This paper proposes a modified Transformer architecture for one-step stock index forecasting, combined with advanced learning-rate scheduling and a novel Shifted Data Augmentation (SDA) technique. We evaluate the proposed framework on two benchmark stock index datasets, VN30 and S&P 500. Experimental results demonstrate that cosine annealing with warmup consistently improves forecastin...

Tien Thanh Thach·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Generator Replay Degrades: Projected Rehearsal Orchestration for Heterogeneous Federated Class-Incremental Learning

Federated class-incremental learning (FCIL) becomes substantially harder when clients observe different label subsets, progress through tasks at different stages, and provide uneven supervision for the same semantic concepts. Existing FCIL methods often preserve old knowledge through input-space synthesis, but they can be fragile under heterogeneous task streams and difficult to transfer across modalities. To alleviate such issues, we propose PRO, a framework that replaces synthetic input replay with projected rehearsal orchestration. To remove external pretraining, we evaluate all methods un...

Thinh T. H. Nguyen·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MAF: Multimodal Adaptive Few-shot Prompting for Sentiment Analysis with MLLMs

Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in understanding complex multimodal content. However, their performance in sentiment analysis exhibits acute sensitivity to prompt design, rendering static, uniformly applied prompts inherently suboptimal for capturing the nuanced multimodal cues that vary across inputs. To address this limitation, we propose a Multimodal Adaptive Few-Shot Prompting (MAF) framework, which dynamically retrieves and integrates query-relevant demonstrations to elicit the sentiment reasoning capabilities of MLLMs in a context-sensi...

Hangling Xie·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Multi-Fidelity SINDy: Sparse Discovery of Nonlinear Dynamical Systems with Fidelity-Weighted Measurements

Data from simulations and experiments are rarely noise-free and often exhibit heterogeneous levels of fidelity. Measurement uncertainty may vary across repeated observations, sensing devices, or even within a single experiment. This work addresses the problem of discovering nonlinear dynamical systems from such inhomogeneous data. We extend the Sparse Identification of Nonlinear Dynamical Systems (SINDy) framework to account for variable noise levels by combining Ensemble SINDy and Weak SINDy within a weighted regression formulation derived from generalized least squares. A statistical justif...

Filippo Zacchei·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Multi-agent Framework for Time-Sensitive Complementary Collaboration in Minecraft

We present TickingCollabBench, a Minecraft-based multi-agent benchmark for a novel class of time-sensitive complementary collaboration tasks. Our benchmark reflects four core characteristics of real-world collaboration: agent heterogeneity, mandatory collaboration, dynamic environments, and strict real-time constraints with failure risks. To enable this, we develop the TickingCollab framework, which supports the generation of diverse dynamic environments and abstracts Minecraft's primitive APIs to enable declarative YAML task specifications for composing these events. Building on this, we des...

Juheon Yi·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ReQAT: Achieving Full-Precision Reasoning Accuracy with 4-bit Floating-Point Quantization-Aware Training

Large Reasoning Models (LRMs) achieve strong problem-solving through long chain-of-thought, but their deployment is constrained by the high cost of full-precision inference and growing KV cache footprints. Microscaled FP4 formats enable efficient FP4 deployment; however, fully quantizing weights, activations, and KV caches (W4A4KV4) causes severe reasoning degradation that existing PTQ and QAT fail to recover. We identify that FP4 failures concentrate on low-entropy tokens--precise symbolic commitments such as digits and operators--where quantization noise inflates sampling errors that cascad...

Janghwan Lee·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Reservoir Attention Network: Cross-Pass State in Pretrained Transformers via Content-Addressable Reservoir Injection

A feasibility and dynamics study of the Reservoir Attention Network (RAN), an architecture that injects a fixed, randomly-initialized reservoir into the mid-layer attention of a pretrained transformer to carry state across forward passes. Experiments span GPT-2 (124M, 355M) to Qwen2.5 (0.5B, 1.5B) on a single consumer GPU. The tasks are minimal probes chosen to isolate individual mechanisms; the broader always-alive agent vision is treated throughout as compute-limited future work, not a claim of this paper. The reservoir is left untrained (fixed random) by design: this isolates whether untra...

Emma Leonhart·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Stochastic trace estimation with tensor train random vectors

Stochastic trace estimation is a standard tool for approximating the trace of a large-scale matrix available only through matrix-vector products. However, in tensor-structured settings, unstructured Gaussian or Rademacher test vectors may be prohibitively expensive to store and compute with, while cheaper rank-one tensor-product vectors can require sample complexities that grow exponentially with the tensor order. This work studies Gaussian random tensor train vectors as a structured alternative for stochastic trace estimation. We show that, with a suitable choice of the tensor train rank, ra...

Zvonimir Bujanović·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Z-Plane Neural Networks: Bounded Geometric Activation Replaces ReLU and LayerNorm

Modern deep neural networks rely on Euclidean scalar activations (e.g., ReLU) and global normalization techniques (e.g., LayerNorm) to prevent gradient instability in deep architectures. However, these mechanisms inherently cause dead neurons, discard critical directional information, and destroy the orthogonality of feature representations. Inspired by the frequency-modulation transmission of biological axons, we propose the Z-Plane Neural Network, which maps hidden states into 2D phasor bundles on a hypersphere. We introduce a novel geometric activation function, Radial Bounding($\mathbf{x}...

Sungwoo Goo·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Information Gap and Feasibility-Aware Inference in Binomial Logistic Mixtures

This paper studies the information gap between mixture detection and label recovery in binomial logistic mixtures. Standard likelihood-based criteria such as the Bayesian information criterion (BIC) can detect the presence of two components, but this does not guarantee that the corresponding labels are recoverable. We show that this gap is intrinsic to binomial logistic mixtures with a fixed number of trials: observed-data evidence for mixture structure and per-observation information for label recovery have different local orders in the component separation, and only the former accumulates w...

Yuta Hayashida·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PO-PDDL: Learning Symbolic POMDPs from Visual Demonstrations for Robot Planning Under Uncertainty

Real-world robot task planning must operate under both stochastic action execution and partial observability, yet constructing Partially Observable Markov Decision Process (POMDP) models for real robotics domains remains difficult and labor-intensive. We introduce PO-PDDL, a symbolic formulation of POMDPs that preserves the relational structure and LLM-friendly syntax of the Planning Domain Definition Language (PDDL), while explicitly modeling partial observability, stochasticity, and beliefs. Building on this formulation, we propose a demonstration-driven pipeline for learning PO-PDDL models...

Wenjing Tang·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MosaicQuant: Inlier-Outlier Disaggregation for Unified 4-Bit LLM Quantization

4-bit quantization significantly reduces the memory footprint and accelerates the inference of large language models (LLMs). However, its limited bit-width representation struggles to faithfully capture both dense common values (\emph{inliers}) and rare large-magnitude values (\emph{outliers}), causing substantial accuracy degradation. Existing mixed-precision methods mitigate this by retaining outliers in high precision, but at the cost of breaking the uniformity of low-bit execution, introducing precision conversion and extra data movement that undermine practical speedup. We propose \textb...

Yangjia Hu·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Extending Item Response Theory for Efficient and Meaningful Multilingual Evaluation

Multilingual benchmarks are central to evaluating large language models (LLMs) across languages, but they suffer from three issues: exhaustive evaluation scales linearly with the number of languages, automatic translation introduces errors that are easily missed at scale, and some items conflate general and culture-specific knowledge. We address all three with a unified statistical framework, Multilingual-IRT, which extends Item Response Theory with per-language difficulty deviations, split discriminability separating content from language effects, and per-language ability residuals. Fitting ...

Gili Lior·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CIWI-CKT: Chaos-Informed Wave Interference Feature Fusion and Cross-City Knowledge Transfer for Traffic Flow Forecasting

Accurate traffic flow prediction remains challenging in cross-city, data-scarce scenarios where limited historical data hinders model generalisation. The chaotic nature of traffic dynamics, complex spatio-temporal dependencies, and heterogeneous urban networks complicate few-shot learning across cities. Existing deep learning approaches either treat traffic as purely deterministic or lack mechanisms to model wave-like interference patterns essential for cross-regime traffic dynamics. To address these limitations, this paper proposes CIWI-CKT, a novel Chaos-Informed Wave Interference Feature F...

Abdul Joseph Fofanah·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Distilling Examples into Task Instructions: Enhanced In-Context Learning for Real-World B2B Conversations

In-context learning (ICL) is the standard method for low-resource classification, yet its efficacy in specialized domains remains largely unexplored. We address the challenge of classifying semantically complex, multi-party B2B conversations, where traditional ICL encounters significant limitations, especially as context length increases due to the concatenation of multiple few-shot examples. We introduce the \texttt{Call Playbook} dataset, featuring five classification tasks derived from real-world B2B conversations targeting core sales concepts. To bridge the gap between performance and pra...

Guy Rotman·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Multi-Agent Framework for Audit Risk Assessment with Explicit Uncertainty and Evidence Conflict Modeling

Audit risk assessment increasingly benefits from combining heterogeneous evidence sources, yet existing approaches typically produce point predictions without quantifying how well different evidence streams agree. We propose UMAR (Uncertainty-Aware Multi-Agent Risk Assessment), a framework that employs three specialized agents: an MD&A Text Agent, a Financial Ratio Agent, and a CAM Agent, each producing independent risk scores with calibrated uncertainty estimates. An Uncertainty Aggregator based on Dempster-Shafer evidence theory fuses these scores while explicitly measuring inter-agent conf...

Yuhan Wang·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

HAPI-EP: Towards Hybrid, Adaptive, and Predictive Digital Twins of Cardiac Electrophysiology

A digital twin (DT) of a patient-specific heart offers significant potential in personalized medicine. However, its rapid and dynamic adaptation to an individual's live data and its predictive capability after adaptation remains central challenges. We examine this challenge from its two building blocks: DT formulation where mechanistic and data-driven models show competing merits and limitations, and DT optimization strategies that are largely driven by a reconstruction objective leading to un-identifiable models. We address both bottlenecks via HAPI -- an AI framework for building hybrid, ad...

Sumeet Vadhavkar·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Formalizing and Mitigating Structural Distortion in LLM Attention for Zero-Shot Graph Reasoning

Large Language Models (LLMs) have shown promise for reasoning over Text-Attributed Graphs (TAGs). However, applying LLMs to graphs requires linearizing their structure into sequences, introducing distortion rooted in the graph bandwidth problem. While this distortion has been shown to degrade performance, it is often attributed to prompt design or model scale, leaving the underlying mechanism unclear. In this work, we show \textit{how} rotary positional embeddings turn graph linearization into bandwidth-dependent attention decay, suppressing attention between graph-adjacent nodes that are for...

Donald Loveland·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Retrieve, Don't Retrain: Extending Vision Language Action Models to New Tasks at Test Time

Extending a vision-language-action (VLA) policy to a new task typically requires task-specific teleoperated demonstrations and per-task fine-tuning, making adaptation costly in both data collection and compute. In this paper, we show that this target-side per-task adaptation cost can be replaced by retrieval. Our retrieval-augmented policy is trained once on paired demonstrations from the target embodiment (query) and a cheaper embodiment (pool, e.g., human-hand video), then frozen. New tasks are added at deployment by appending pool-side demonstrations to a retrieval pool. The frozen policy ...

Jeongeun Park·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Conflict-Aware Federated Fine-Tuning of Large Language Models with Mixture-of-Experts

The continuous scaling of large language models (LLMs) incurs prohibitive computational costs, making Mixture-of-Experts (MoE) a scalable alternative for efficient fine-tuning via sparse activation. While federated learning (FL) emerges as the paradigm for privacy-preserving collaborative optimization, integrating MoE into FL under data heterogeneity may trigger conflicting expert optimizations. Client-specific data distributions force same-indexed experts to optimize under inconsistent or even conflicting feature-label correlations. This mismatch induces destructive interference during aggre...

Yijun Lu·9 days ago

TechCrunch AI· PRESS

As Anthropic suspends access to new models, India debates its AI future

Tech leaders debate whether the Anthropic episode is a wake-up call for India’s AI ambitions.

Jagmeet Singh·9 days ago

TechCrunch AI· PRESS

Meta reportedly moves to unwind $2B Manus deal after Beijing’s demand

Meta starts dismantling its $2 billion Manus acquisition after Beijing ordered the deal reversed.

Kate Park·9 days ago

Simon Willison· ANALYST

Publishing WASM wheels to PyPI for use with Pyodide

The Pyodide 314.0 release announcement (via Hacker News ) includes news I've been looking forward to for a long time: You can now publish Python packages built for Pyodide (or any Python runtime compatible with the PyEmscripten platform defined in PEP 783 ) directly to PyPI and install them at runtime. Previously, the Pyodide maintainers had to maintain, build, and host over 300 packages ourselves. This created a significant burden on our maintainers and became a major bottleneck for the community, as every new package required manual review. Moving forward, package maintainers can simply bui...

Simon Willison·9 days ago

Simon Willison· ANALYST

luau-wasm 0.1a0

Release: luau-wasm 0.1a0 See Publishing WASM wheels to PyPI for use with Pyodide for details. Tags: lua , webassembly , pyodide

Simon Willison·9 days ago

← Front Page30 stories

← Newer Older →