The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

DualFact+: A Multimodal Fact Verification Framework for Procedural Video Understanding

DualFact multimodal framework separates factual verification in procedural video captioning into conceptual and contextual facts.

Cennet Oguz·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Bye Bye Perspective API: Lessons for Measurement Infrastructure in NLP, CSS and LLM Evaluation

Analysis of Perspective API shutdown exposes structural dependence of NLP/LLM evaluation on single proprietary toxicity measurement tool.

David Hartmann·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Marco-MoE: Open Multilingual Mixture-of-Expert Language Models with Efficient Upcycling

Marco-MoE open-weight multilingual sparse MoE models with 5% parameter activation and best-in-class performance-to-compute ratio.

Fan Jiang·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Dictionary learning for Kernel EDMD

Dictionary learning method for Kernel EDMD approximation of nonlinear dynamical systems via Koopman operators.

Erik Lien Bolager·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Benchmarking bandgap prediction in semiconductors under experimental and realistic evaluation settings

RealMat-BaG benchmark for semiconductor bandgap prediction under experimental conditions using GNNs; addresses domain generalization challenges.

Haolin Wang·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents

SnapGuard detects prompt injection attacks on screenshot-based web agents using lightweight multimodal methods instead of large VLMs.

Mengyao Du·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native Enterprise Systems

Semantic Gateway framework applies formal validation and zero-trust security to LLM-orchestrated enterprise APIs using Model Context Protocol.

Ignacio Peyrano·11 days ago

r/Anthropic· COMMUNITY

Why is this happening?

https://preview.redd.it/o43dx9b6cxxg1.png?width=796&format=png&auto=webp&s=64ea0822d0090847bf468e2efedc6179fb553e99

u/Objective_Frame_412·11 days ago·10 pts / 4 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Egocentric Tactile and Proximity Sensors as Observation Priors for Humanoid Collision Avoidance

RL-based study of tactile and proximity sensor properties for humanoid collision avoidance on H1-2 robot using dodgeball task.

Carson Kohlbrenner·11 days ago

r/LocalLLaMA· COMMUNITY

Qwen 3.6 27B BF16 vs Q4_K_M vs Q8_0 GGUF evaluation

Qwen 3.6 27B quantization benchmark: BF16 (69.78% avg accuracy, 54GB RAM) vs Q4_K_M (66.54%, 15GB) vs Q8_0 across HumanEval, HellaSwag, BFCL.

u/gvij·11 days ago·202 pts / 65 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

On Halting vs Converging in Recurrent Graph Neural Networks

Theoretical analysis of expressiveness tradeoffs between converging, output-converging, and halting variants of Recurrent Graph Neural Networks.

Jeroen Bollen·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Enhancing SignSGD: Small-Batch Convergence Analysis and a Hybrid Switching Strategy

SignSGD improvements via small-batch convergence analysis and hybrid dithering strategy for 1-bit gradient compression.

Haoran Chen·11 days ago

r/ClaudeAI· COMMUNITY

PullMD - gave Claude Code an MCP server so it stops burning tokens parsing HTML

PullMD MCP server extracts article content for Claude Code, reducing token waste on HTML parsing by filtering boilerplate.

u/SYSWAVE·11 days ago·21 pts / 13 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Medoid Prototype Alignment for Cross-Plant Unknown Attack Detection in Industrial Control Systems

Medoid prototype alignment framework for cross-plant ICS intrusion detection without labeled data for unseen attacks.

Luyao Wang·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Sample-efficient Neuro-symbolic Proximal Policy Optimization

Neuro-symbolic PPO extension transfers logical policy specs from easier tasks to improve sample efficiency in sparse-reward RL.

Simone Murari·11 days ago

TechCrunch AI· PRESS

Otter’s new feature lets users search across their enterprise tools

Otter is also releasing a new Windows app that can capture meeting notes without joining one

Ivan Mehta·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Surprising Effectiveness of Canonical Knowledge Distillation for Semantic Segmentation

Wall-clock analysis shows canonical logit/feature-based knowledge distillation outperforms recent methods for semantic segmentation.

Muhammad Ali·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AI as Consumer and Participant: A Co-Design Agenda for MBSE Substrates and Methodology

Opinion piece: MBSE models used with LLMs function as prompts rather than knowledge bases; co-design methodology needed.

Siyuan Ji·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

From Chatbots to Confidants: A Cross-Cultural Study of LLM Adoption for Emotional Support

Cross-cultural survey of 4,641 users finds LLM emotional support adoption varies 20–59% across seven countries.

Natalia Amat-Lefort·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Automated Adversarial Collaboration for Advancing Theory Building in the Cognitive Sciences

Automated adversarial collaboration framework using LLM agents and program synthesis to adjudicate competing cognitive science theories.

Suyog Chandramouli·11 days ago

r/singularity· COMMUNITY

OpenAI ends its exclusive partnership with Microsoft

OpenAI terminates exclusive partnership with Microsoft, reshaping enterprise AI licensing and competitive dynamics in foundation model distribution.

u/JackFisherBooks·11 days ago·100 pts / 29 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

PHISHREV: A Hybrid Machine Learning and Post-Hoc Non-monotonic Reasoning Framework for Context-Aware Phishing Website Classification

Hybrid ML and Answer Set Programming framework for phishing detection with post-hoc belief revision layer.

Mainak Sen·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Dyna-Style Safety Augmented Reinforcement Learning: Staying Safe in the Face of Uncertainty

Dyna-SAuR algorithm for safe reinforcement learning combining uncertainty-aware dynamics models with scalable safety filters.

Artur Eisele·11 days ago

The Verge AI· PRESS

Google and Pentagon reportedly agree deal for ‘any lawful’ use of AI

Google has signed a classified deal that allows the US Department of Defense to use its AI models for "any lawful government purpose," The Information reports. The agreement was reported less than a day after Google employees demanded CEO Sundar Pichai block the Pentagon from using its AI amid concerns that it would be used in "inhumane or extremely harmful ways." If the agreement is confirmed, it would place Google alongside OpenAI and xAI, which have also made classified AI deals with the US government. Anthropic was also among that list until it was blacklisted by the Pentagon for refusing...

Jess Weatherbed·11 days ago·+ covered by others

arXiv (cs.AI/CL/LG)· ACADEMIA

Assistants, Not Architects: The Role of LLMs in Networked Systems Design

Empirical study showing LLMs lack reliable architectural reasoning for networked systems design despite architectural complexity.

Pratyush Sahu·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EvoTSC: Evolving Feature Learning Models for Time Series Classification via Genetic Programming

EvoTSC genetic programming method for evolving lightweight feature learning models for time series classification.

Xuanhao Yang·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SymphonyGen: 3D Hierarchical Orchestral Generation with Controllable Harmony Skeleton

SymphonyGen hierarchical framework for 3D orchestral music generation with cascading decoder architecture.

Xuzheng He·11 days ago

The Verge AI· PRESS

Attack of the killer script kiddies

Last August, some of the best cybersecurity teams in the business gathered in Las Vegas to demonstrate the strength of their AI bug-finding systems at DARPA's Artificial Intelligence Cyber Challenge (AIxCC). The tools had scanned 54 million lines of actual software code that DARPA had injected with artificial flaws. The teams were capable enough to identify most of the artificial bugs, but their automated tools went beyond that - they found more than a dozen bugs that DARPA hadn't inserted at all. Even before the security earthquake that Anthropic delivered this month with Claude Mythos - the...

Yael Grauer·11 days ago

r/LocalLLaMA· COMMUNITY

Deepseek Vision Coming

Deepseek Vision model announced or coming soon per Xiaokang Chen social media post.

u/Nunki08·11 days ago·129 pts / 25 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Improving Zero-Shot Offline RL via Behavioral Task Sampling

Behavioral task sampling method improving zero-shot offline RL generalization via learned task vector distributions.

Nazim Bendib·11 days ago
