Papers

Fresh arrivals from arXiv cs.AI, cs.CL, and cs.LG. The raw research feed.

Taming Outlier Tokens in Diffusion Transformers

Study identifies outlier tokens in Diffusion Transformers that attract disproportionate attention in image generation, affecting both encoder and denoiser layers.

Xiaoyu Wu·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Implicit Representations of Grammaticality in Language Models

Research shows pretrained language models implicitly distinguish grammaticality from string probability through internal representations, despite surface statistics.

Yingshan Susan Wang·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Grokability in five inequalities

Grok AI model discovered five new mathematical inequalities and bounds in convex geometry and combinatorics, verified by human authors.

Paata Ivanisvili·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Almost-Orthogonality in Lp Spaces: A Case Study with Grok

Mathematical analysis refuting Carbery's triangle inequality conjecture for Lp spaces with counterexample and sharp bounds on exponent.

Ziang Chen·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents

LongSeeker proposes Context-ReAct paradigm for elastic context management in long-horizon search agents, maintaining trajectory at variable detail levels.

Yijun Lu·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

Theoretical analysis establishes sharp capacity thresholds for linear associative memory, showing d²∼n log n scaling for top-1 retrieval via phase transition.

Nicholas Barnfield·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Estimating the expected output of wide random MLPs more efficiently than sampling

Method estimates expected outputs of wide random MLPs without sampling by propagating activation distributions via cumulants and Hermite expansions.

Wilson Wu·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Understanding In-Context Learning for Nonlinear Regression with Transformers: Attention as Featurizer

Theoretical framework explains transformers' in-context learning on nonlinear regression by showing attention mechanisms construct polynomial and spline bases.

Alexander Hsu·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MRI-Eval: A Tiered Benchmark for Evaluating LLM Performance on MRI Physics and GE Scanner Operations Knowledge

MRI-Eval benchmark with 1365 items assesses LLM performance on MRI physics and GE scanner operations with tiered difficulty and diagnostic conditions.

Perry E. Radau·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Life Gives You BC, Make Q-functions: Extracting Q-values from Behavior Cloning for On-Robot Reinforcement Learning

Q2RL algorithm extracts Q-functions from behavior cloning for efficient offline-to-online robot learning, preventing the policy collapse caused by distribution mismatch.

Lakshita Dodeja·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Design Conductor 2.0: An agent builds a TurboQuant inference accelerator in 80 hours

Design Conductor 2.0 autonomous agent builds hardware accelerators (TurboQuant) in 80 hours using frontier April 2026 models, demonstrating 80x capability scaling over prior work.

The Verkor Team·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The First Token Knows: Single-Decode Confidence for Hallucination Detection

First-token confidence (phi_first) from single greedy decode detects LLM hallucinations as effectively as multi-sample semantic self-consistency with lower computational cost.

Mina Gabriel·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Geometry-Aware State Space Model: A New Paradigm for Whole-Slide Image Representation

Geometry-Aware State Space Model applies hyperbolic geometry to whole-slide histopathology image analysis via Multiple Instance Learning, improving patch aggregation for gigapixel resolution.

Enhui Chai·22 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation

SemEval-2026 Task 9 system fine-tunes Gemma 3 (12B/27B) per-language with LoRA and GPT-4o-mini synthetic data augmentation for 22-language polarization detection.

Srikar Kashyap Pulipaka·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Aes3D: Aesthetic Assessment in 3D Gaussian Splatting

Aes3D proposes aesthetic assessment framework for 3D Gaussian Splatting, addressing composition and visual appeal evaluation beyond reconstruction fidelity.

Chuanzhi Xu·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting

Sparse autoencoders reveal PatchTST uses non-superposed, task-specific representations for time-series forecasting, explaining competitiveness against simple linear models.

Alper Yıldırım·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

What Matters in Practical Learned Image Compression

Comprehensive study of learned image compression design choices balancing perceptual quality and runtime, introducing novel techniques for practical human-visual-system-optimized codecs.

Kedar Tatwawadi·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Human-AI Co-Mentorship in Project-Based Learning: A Case Study in Financial Forecasting

Case study of high-school and undergraduate students using AI tools for financial forecasting research, highlighting how human-AI co-mentorship accelerates learning outcomes.

Freyaa Chawla·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Executable World Models for ARC-AGI-3 in the Era of Coding Agents

Coding agent with executable Python world models, verification, and simplicity-bias refactoring solves 25 public ARC-AGI-3 games without task-specific logic.

Sergey Rodionov·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Low-Cost Black-Box Detection of LLM Hallucinations via Dynamical System Prediction

Treats LLM embeddings as a dynamical system and applies Koopman operator theory to enable low-cost black-box hallucination detection without sampling or external retrieval.

Dan Wilson·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Transformed Latent Variable Multi-Output Gaussian Processes

T-LVMOGP framework scales Multi-Output Gaussian Processes to high-dimensional outputs via transformed latent variables.

Xiaoyu Jiang·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Joint Treatment Effect Estimation from Incomplete Healthcare Data: Temporal Causal Normalizing Flows with LLM-driven Evolutionary MNAR Imputation

CausalFlow-T applies DAG-constrained normalizing flows and LLM-driven imputation for treatment effect estimation in incomplete EHR data.

Olivia Jullian Parra·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Conditional outlier detection for clinical alerting

Data-driven anomaly detection flags unusual patient-management actions in EHR systems to reduce clinical errors.

Milos Hauskrecht·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning

Adaptive policy selection method improves offline-to-online RL by combining off-policy and online evaluation under interaction budgets.

Alper Kamil Bozkurt·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Semantics: An Evidential Reasoning-Aware Multi-View Learning Framework for Trustworthy Mental Health Prediction

Multi-view evidential reasoning framework for mental health prediction from text with calibrated uncertainty estimation.

Yucheng Ruan·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Physiologically Grounded Driver Behavior Classification: SHAP-Driven Elite Feature Selection and Hybrid Gradient Boosting for Multimodal Physiological Signals

SHAP-based feature selection and hybrid boosting classify driving behaviors from multimodal physiological signals (EEG, EMG, GSR).

Sahar Askari·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On the Wasserstein Gradient Flow Interpretation of Drifting Models

Wasserstein Gradient Flow analysis characterizes Generative Modeling via Drifting (GMD) as fixed-point optimization in probability measure space.

Arthur Gretton·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On the Hardness of Junking LLMs

Analysis of LLM jailbreak vulnerability without structured prompts reveals robustness gaps in current safety defenses.

Marco Rando·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

Manifold steering interventions causally link neural activation geometry to model behavior via structured representation space.

Daniel Wurgaft·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

How Long Does Infinite Width Last? Signal Propagation in Long-Range Linear Recurrences

Finite-width signal propagation analysis shows when infinite-width approximation breaks down in long linear recurrences.

Mariia Seleznova·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Rollout Pass-Rate Control: Steering Binary-Reward RL Toward Its Most Informative Regime

Proposes Prefix Sampling to improve RL training efficiency by holding the rollout pass rate near 50%, the regime that maximizes reward signal and entropy in agentic tasks like SWE-bench.

Tianshu Zhu·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LineRides: Line-Guided Reinforcement Learning for Bicycle Robot Stunts

LineRides framework enables bicycle robot to learn complex stunts via line-guided RL without demonstrations, using spatial guidelines and sparse keyframe constraints.

Seungeun Rho·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Building informative materials datasets beyond targeted objectives

Framework for materials science dataset construction balancing targeted property optimization against preservation of untargeted outcomes via diversity-aware selection.

Rafael Espinosa Castañeda·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Text Corpora as Concept Fields: Black-Box Hallucination and Novelty Measurement

Introduces Concept Field method to detect hallucination and measure novelty in LLM outputs by modeling semantic drift in text corpora using sentence embeddings.

Nicholas S. Kersting·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Unified Framework of Distributional Regret in Multi-Armed Bandits and Reinforcement Learning

Unified theoretical framework for distributional regret bounds in bandits and episodic RL, with UCBVI-style algorithm achieving gap-independent guarantees.

Harin Lee·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Continual Knowledge Updating in LLM Systems: Learning Through Multi-Timescale Memory Dynamics

Memini: associative memory system with multi-timescale dynamics for continual knowledge updating in deployed LLMs without explicit management.

Andreas Pattichis·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Bayesian Approach for Task-Specific Next-Best-View Selection with Uncertain Geometry

Bayesian framework for active view selection in 3D reconstruction using posterior inference over implicit surfaces.

Jingsen Zhu·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Proximal Projection for Doubly Sparse Regularized Models

Doubly sparse regularization exploiting Gaussian graphical model structure for high-dimensional regression.

Jia Wei He·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Driver-WM: A Driver-Centric Traffic-Conditioned Latent World Model for In-Cabin Dynamics Rollout

Driver-WM: latent world model for predicting driver reactions during L2/L3 automation transitions using in-cabin behavioral dynamics.

Haozhuang Chi·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Think-Aloud Reshapes Automated Cognitive Model Discovery Beyond Behavior

Think-aloud traces improve automated cognitive model discovery beyond behavior-only constraints in risky decision-making tasks.

Hanbo Xie·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Automatically Finding and Validating Unexpected Side-Effects of Interventions on Language Models

Automated pipeline for auditing unexpected behavioral side-effects of LLM interventions through contrastive multi-token generation analysis.

Quintin Pope·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Gated Multimodal Learning for Interpretable Property Energy Performance Prediction and Retrofit Scenario Analysis

Gated multimodal model combining EPC tabular data and assessor text to predict building energy efficiency scores.

Yunfei Bai·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Order Matters: Improving Domain Adaptation by Reordering Data

ORDERED: variance reduction for unsupervised domain adaptation via optimal data reordering during training.

Andrea Napoli·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Provable imitation learning for control of instability in partially-observed Vlasov-Poisson equations

Imitation learning for stabilizing Vlasov-Poisson plasma control using sparse macroscopic diagnostics with stability guarantees.

Xiaofan Xia·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Pinocchio Dimension: Phenomenality of Experience as the Primary Axis of LLM Psychometric Differences

Psychometric analysis of 50 LLMs identifies phenomenality of experience as the primary axis of variance, termed the Pinocchio dimension.

Hubert Plisiecki·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Look Once, Beam Twice: Camera-Primed Real-Time Double-Directional mmWave Beam Management for Vehicular Connectivity

Vision-based mmWave beam management system for V2X vehicular connectivity using camera sensing and closed-loop learning.

Avhishek Biswas·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Impossibility Triangle of Long-Context Modeling

Theoretical proof that long-context models cannot simultaneously optimize efficiency, compactness, and recall, a fundamental trade-off affecting both Transformers and SSMs.

Yan Zhou·24 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Full-chip CMP modelling based on Fully Convolutional Network leveraging White Light Interferometry

Fully convolutional neural network for chemical-mechanical polishing modeling in IC manufacturing using white light interferometry.

Jules Exbrayat·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SoK: Robustness in Large Language Models against Jailbreak Attacks

Systematic review of jailbreak attack and defense methods for LLMs with critique of narrow evaluation metrics like attack success rate.

Feiyue Xu·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adaptive Learning Strategies for AoA-Based Outdoor Localization: A Comprehensive Framework

Adaptive deep learning framework for angle-of-arrival based outdoor localization in 5G/6G networks with flexible training strategies.

Bac Trinh-Nguyen·1 day ago

50 stories