The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

From Shield to Target: Denial-of-Service Attacks on LLM-Based Agent Guardrails

LLM-based guardrails have emerged as a highly effective defense against prompt injection and jailbreak attacks in autonomous agents. However, we reveal that the very reasoning and task-following capabilities enabling this protection introduce a novel vulnerability: attackers can inject crafted data to trap the guardrail in extended reasoning loops, effectuating a systematic denial-of-service (DoS) attack. To systematically expose this threat, we design a beam-search optimization framework that crafts natural-language payloads to maximize guardrail reasoning length, utilizing an LLM proposer g...

Yuguang Zhou·11 days ago

Every Eval Ever: A Unifying Schema and Community Repository for AI Evaluation Results

Securing the Future of IoMT in the Post-Quantum Era: An Edge-Native Federated Learning Approach

Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure

Fodor and Pylyshyn's Systematicity Challenge Still Stands

PepALD: Macrocyclic Peptide Generation via Autoregressive Latent Diffusion

Dense Coordinate-List Fine-Tuning Induces a Controllable Interference Surface in Vision-Language Models

Beyond the Training Distribution: Evaluating Predictions Under Distribution Shift and Selection Bias

SpaceX’s massive IPO: all the latest news

From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI

A Fixed-Point Neural Operator for Size- and Functional-Transferable Hamiltonian Prediction

Recipe-Controlled Decoder Audit for Structural Knowledge-Graph Completion

Nonlinear Two-Time-Scale Stochastic Approximation: A Sharp Phase Transition and How to Beat It

When the Tool Decides: LLM Agents Defer Blindly to Graph Neural Network Tools, and Stronger Backbones Defer More

SpaceX IPO: Everything you need to know

Jeff Bezos’ AI startup aims to build an ‘artificial general engineer’

GitOfThoughts: Version-Controlled Reasoning and Agent Memory You Can Replay, Diff, and Merge

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

EM-NeSy: Expectation Maximization for Neurosymbolic Learning

A Computational Audit of Demographic Association Encoding in ClinicalBERT Language Predictions

MoDiCoL: A Modular Diagnostic Continual Learning Dataset for Robust Speech Recognition

tap: A File-Based Protocol for Heterogeneous LLM Agent Collaboration

CADET: Physics-Grounded Causal Auditing and Training-Free Deconfounding of End-to-End Driving Planners

Coping in Crisis: Computational Modeling of Coping Styles in Digital Crisis Discourse During the 2023 Turkiye Earthquake

Causal Object-Centric Models for Planning with Monte Carlo Tree Search

Federated Learning for Feature Generalization with Convex Constraints

CSPO: Constraint-Sensitive Policy Optimization for Safe Reinforcement Learning

Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack

A theoretical model for task routing in mixture-of-expert transformers

Running the Gauntlet: Re-evaluating the Capabilities of Agents Beyond Familiar Environments