The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Can Coding Agents Reproduce Findings in Computational Materials Science?

AutoMat benchmark evaluates LLM agents on reproducing computational materials science findings, requiring domain knowledge and result interpretation beyond code quality.

Ziyang Huang·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Generating Statistical Charts with Validation-Driven LLM Workflows

Workflow decomposes statistical chart generation into screening, synthesis, rendering, and validation-driven refinement with aligned artifacts for LLM training.

Pavlin G. Poličar·7 days ago

Ars Technica AI· PRESS

Minnesota passes ban on fake AI nudes; app makers risk $500K fines

More evidence of Grok CSAM seen as Minnesota passes nudifying app ban.

Ashley Belanger·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

RunAgent: Interpreting Natural-Language Plans with Constraint-Guided Execution

RunAgent multi-agent platform executes natural-language plans with constraint-guided execution and explicit control constructs (IF, GOTO, FORALL) for structured workflows.

Arunabh Srivastava·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When RAG Chatbots Expose Their Backend: An Anonymized Case Study of Privacy and Security Risks in Patient-Facing Medical AI

Security assessment of patient-facing medical RAG chatbot reveals backend exposure risks, with governance lessons for safe clinical AI deployment.

Alfredo Madrid-García·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Unsupervised Denoising of Real Clinical Low Dose Liver CT with Perceptual Attention Networks

Unsupervised low-dose CT denoising framework using Cycle-GAN-inspired deep learning to reduce noise while minimizing radiation exposure.

Jingxi Pu·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Make Your LVLM KV Cache More Lightweight

LightKV reduces LVLM KV cache memory overhead by exploiting vision-token embedding redundancy via cross-modality message passing during prefill.

Xihao Chen·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control

SAVGO RL algorithm embeds state-action pairs with cosine similarity to shape policy updates, improving sample efficiency in continuous control.

Stavros Orfanoudakis·7 days ago

Stratechery· ANALYST

2026.18: Long-term, Peripheral & Myopic Visions

Stratechery weekly commentary on Amazon, AI strategy, AR device futures, and Beijing tech policy (April 2026).

Ben Thompson·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GeoContra: From Fluent GIS Code to Verifiable Spatial Analysis with Geography-Grounded Repair

GeoContra framework verifies and repairs LLM-generated GIS Python code by enforcing geographic contracts including coordinate semantics, topology, and spatial predicates.

Yinhao Xiao·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Observable Performance Does Not Fully Reflect System Organization: A Multi-Level Analysis of Gait Dynamics Under Occlusal Constraint

Biomechanical case study analyzes gait dynamics under occlusal constraint in a Parkinson's patient, showing that performance metrics do not fully reflect system organization.

Jacques Raynal·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LASE: Language-Adversarial Speaker Encoding for Indic Cross-Script Identity Preservation

LASE framework improves multilingual voice cloning speaker encoders for cross-script identity preservation in Indic languages using language-adversarial training.

Venkata Pushpak Teja Menta·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Directed Social Regard: Surfacing Targeted Advocacy, Opposition, Aid, Harms, and Victimization in Online Media

Directed Social Regard (DSR) NLP method detects mixed pro-social and anti-social sentiments with fine-grained targets in online text, improving on binary sentiment tools.

Scott Friedman·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Characterizing the Expressivity of Local Attention in Transformers

Theoretical analysis characterizes the expressivity of local attention in transformers and the trade-off between computational cost and model quality relative to global attention.

Jiaoda Li·7 days ago

r/Anthropic· COMMUNITY

Claude Pro and $100 Plan

Reddit user complains about Claude Pro $20 tier rate limits and service degradation, considering upgrade to $100 plan.

u/Glittering_Pea_7226·7 days ago·10 pts / 29 comm

r/MachineLearning· COMMUNITY

Why ML conference reviews sometimes feel like a “lottery” [D]

Reddit discussion on variability in ML conference peer review: strong papers are consistently accepted and weak papers rejected, while middle-tier papers are vulnerable to reviewer mismatch and capacity constraints.

u/Hope999991·7 days ago·30 pts / 20 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Meritocratic Fairness in Budgeted Combinatorial Multi-armed Bandits via Shapley Values

K-Shapley value extension for meritocratic fairness in budgeted combinatorial bandits with full-bandit feedback, applied to arm contribution attribution.

Shradha Sharma·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning the Helmholtz equation operator with DeepONet for non-parametric 2D geometries

DeepONet-based neural operator learns solution to 2D Helmholtz equation on non-parametric domains with arbitrary scatterer geometries using signed distance encoding.

Rodolphe Barlogis·7 days ago

r/OpenAI· COMMUNITY

OpenAI just quietly killed the AGI clause in their Microsoft deal

OpenAI removed the AGI trigger clause from its Microsoft deal, replacing it with a 2032 date limit; the change enables multi-cloud licensing but signals a shift away from a founding governance principle.

u/Single-Jack8·7 days ago·51 pts / 26 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

Themis introduces multilingual code reward model framework and benchmark (Themis-CodeRewardBench) for multi-criteria code generation scoring beyond execution feedback.

Indraneil Paul·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

NonZero: Interaction-Guided Exploration for Multi-Agent Monte Carlo Tree Search

NonZero scales cooperative multi-agent MCTS via interaction-guided exploration over low-dimensional representation instead of full joint-action space expansion.

Sizhe Tang·7 days ago

TechCrunch AI· PRESS

Pentagon inks deals with Nvidia, Microsoft, and AWS to deploy AI on classified networks

The deals come as the DOD has doubled down on diversifying its exposure to AI vendors in the wake of its controversial dispute with Anthropic over usage terms of its AI models.

Ram Iyer·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Quantum Interval Bound Propagation for Certified Training of Quantum Neural Networks

Quantum interval bound propagation (QIBP) method adapts certified adversarial training from classical ML to quantum neural networks using bound tracking.

Emma Andrews·7 days ago

r/Anthropic· COMMUNITY

Time amnesia and “You’re tired” logic

I’ve noticed 2 things recently (even on 4.6). Time amnesia: 1. It used to be so good at understanding what the current day is and how far away a certain upcoming event is. Now even after a specific meeting it had in memory has passed it says it is upcoming and tries to get me prepared. And if I start a chat on a day of travel or at night or anything it has context on, it will forever think it is still that night or day. Pushing user to rest or not “spiral”: 2. The push to “rest”, refusing to give information more than once, or “it’s late, you’re spiraling” (even when it is incorrect and...

u/rowrow17·7 days ago·10 pts / 5 comm

MIT Tech Review· PRESS

Cyber-Insecurity in the AI Era

Cybersecurity was already under strain before AI entered the stack. Now, as AI expands the attack surface and adds new complexity, the limits of legacy approaches are becoming harder to ignore. This session from MIT Technology Review’s EmTech AI conference explores why security must be rethought with AI at its core, not layered on after…

MIT Technology Review Events·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Position: agentic AI orchestration should be Bayes-consistent

Position paper argues agentic AI orchestration layers should use Bayesian decision theory for handling uncertainty in tool selection and resource allocation.

Theodore Papamarkou·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Randomized Subspace Nesterov Accelerated Gradient

Technical paper on randomized-subspace Nesterov acceleration for first-order optimization with low-dimensional projected gradients.

Gaku Omiya·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Temporal Data Requirement for Predicting Unplanned Hospital Readmissions

Empirical study on EHR data windows for predicting hospital readmissions after joint arthroplasty using structured and unstructured clinical notes.

Ramin Mohammadi·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Framework to assess when LLMs should call external tools via decision theory, focusing on web search tool use and knowledge integration.

Qinyuan Wu·7 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure

Federated multimodal unlearning approach (EASE) addressing cross-modal entanglement in decentralized image-text model training.

Zihao Ding·7 days ago

← Front Page · 30 stories
