The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Skill Neologisms: Towards Skill-based Continual Learning

Skill neologisms—soft tokens optimized for new capabilities—enable selective LLM skill extension without catastrophic forgetting or context limits.

Antonin Berthon·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Reliable Modeling of Distribution Shifts via Displacement-Reshaped Optimal Transport

ReshapeOT improves optimal transport for distribution shifts by reshaping ground metrics using observed sample displacements.

Philip Naumann·1 day ago

Simon Willison· ANALYST

Vibe coding and agentic engineering are getting closer than I'd like

Simon Willison observes convergence between vibe coding and agentic engineering in practical AI-assisted development workflows.

Simon Willison·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

TabEmbed introduces the first generalist embedding model for tabular data, along with TabBench, a comprehensive benchmark for evaluating tabular understanding.

Minjie Qiang·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EP-GRPO: Entropy-Progress Aligned Group Relative Policy Optimization with Implicit Process Guidance

EP-GRPO fixes credit assignment failures in GRPO-based LLM reasoning via token-level entropy, polarity-aware rewards, and zero-variance collapse mitigation.

Song Yu·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Delving into Non-Exchangeability for Conformal Prediction in Graph-Structured Multivariate Time Series

Conformal prediction applied to graph-structured time series; addresses non-exchangeability via spectral graph theory for rigorous uncertainty quantification.

Ruichao Guo·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

KernelBench-X: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

KernelBench-X evaluates LLM-generated Triton GPU kernels across 176 tasks; finds task structure explains 3x more correctness variance than method design.

Han Wang·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Order-based Rehearsal Learning

First order-based rehearsal learning method for avoiding undesired futures; uses ordinal structures instead of graph estimation.

Yu-Xuan Tao·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On the Influence of the Feature Computation Budget on Per-Instance Algorithm Selection for Black-Box Optimization

Study determines optimal feature computation budget fraction for per-instance algorithm selection in black-box optimization.

Koen van der Blom·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adaptive Inverted-Index Routing for Granular Mixtures-of-Experts

AIR-MoE uses vector quantization for efficient routing in granular mixture-of-experts, reducing computational overhead of token-to-expert assignment.

Klaus-Rudolf Kladny·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adapting Large Language Models to a Low-Resource Agglutinative Language: A Comparative Study of LoRA and QLoRA for Bashkir

Comparative study of LoRA and QLoRA fine-tuning on Bashkir, a low-resource Turkic language, using models from DistilGPT2 to Qwen2.5-7B.

Mullosharaf K. Arabov·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Training-Time Batch Normalization Reshapes Local Partition Geometry in Piecewise-Affine Networks

Theoretical analysis of batch normalization's effect on geometry of piecewise-affine networks during training via hyperplane switching.

Xuan Qi·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DART: A Vision-Language Foundation Model for Comprehensive Rope Condition Monitoring

DART, a vision-language foundation model for synthetic fiber rope condition monitoring, provides severity estimates, maintenance recommendations, and automated reports.

Anju Rani·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

UFAL-CUNI at SemEval-2026 Task 11: An Efficient Modular Neuro-symbolic Method for Syllogistic Reasoning

Neuro-symbolic system combining LLM parser with automated theorem prover for syllogistic reasoning in SemEval-2026 Task 11.

Ivan Kartáč·1 day ago

r/LocalLLaMA· COMMUNITY

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM

User demonstrates Qwen3.6 27B running 200k context on single RTX 5090 with NVFP4 quantization in vLLM, sharing exact configuration and parameters.

u/Maheidem·1 day ago·41 pts / 11 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Modular Reinforcement Learning For Cooperative Swarms

Modular multi-agent reinforcement learning approach for cooperative robot swarms with limited communication and local interaction.

Erel Shtossel·1 day ago

r/singularity· COMMUNITY

Religious robots are coming: South Korea's first autonomous humanoid robot converts to Buddhism

South Korean humanoid robot programmed with Buddhist practices; novelty claim lacks technical substance or robotics advancement details.

u/GeneReddit123·1 day ago·127 pts / 62 comm·+ covered by others

TechCrunch AI· PRESS

3 days left to lock in 50% off a second ticket to TechCrunch Disrupt 2026

Three days remain to buy one TechCrunch Disrupt 2026 ticket and get a second at 50% off. Gain more visibility in the tech industry. Offer ends May 8 at 11:59 p.m. PT.

TechCrunch Events·1 day ago·+ covered by others

arXiv (cs.AI/CL/LG)· ACADEMIA

Jacobian-Velocity Bounds for Deployment Risk Under Covariate Drift

Drift-aligned tangent regularization (DTR) bounds deployment risk under covariate shift using Jacobian-velocity theorem and Poincaré inequalities.

Jonathan R. Landers·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Does Gene Regulatory Network Inference Break? A Controlled Diagnostic Study of Causal and Correlational Methods on Single-Cell Data

Controlled benchmark study diagnosing when causal vs. correlational methods fail for gene regulatory network inference from single-cell RNA-seq.

Miguel Fernandez-de-Retana·1 day ago

TechCrunch AI· PRESS

AI boom pushes Samsung to $1T

Samsung crossed the $1 trillion valuation mark after shares surged on AI-driven chip demand, making it only the second Asian company after TSMC to hit the milestone.

Kate Park·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Unintended Negative Impacts of Promotional Language in Patent Evaluation

Large-scale USPTO study finds promotional language in patents negatively correlates with approval probability, contrary to science communication norms.

Bingkun Zhao·1 day ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Evolving Idea Graphs with Learnable Edits-and-Commits for Multi-Agent Scientific Ideation

Evolving Idea Graphs (EIG), a multi-agent LLM framework using learnable graph edits for scientific ideation with novelty, feasibility, clarity metrics.

Jiangwen Dong·1 day ago

r/ClaudeAI· COMMUNITY

Kindergarten-grade nouns

Reddit user reports that Claude Opus struggles to distinguish obscurity measured by corpus frequency from obscurity judged by human familiarity.

u/babelphishy·1 day ago·58 pts / 5 comm

r/OpenAI· COMMUNITY

Anyone else hate reading AI generated text?

Reddit user expresses frustration with detectability and stylistic uniformity of AI-generated text across news and government documents.

u/Connect-Painter-4270·1 day ago·52 pts / 76 comm

r/singularity· COMMUNITY

The Blue Collar Delusion: Why the machines don’t have to climb up to where we are, because the work will descend to meet them

Mechanic argues blue-collar work faces AI displacement risk through task simplification rather than machine capability escalation, challenging consensus on trade job resilience.

u/_noise-complaint·1 day ago·164 pts / 53 comm

The Verge AI· PRESS

Google’s AI search summaries will now quote Reddit

Want real human feedback related to your search results? Google’s AI now fetches it for you. Google is updating its AI Search features to make it easier for users to find information from sources they know and trust. One of the more notable changes introduces "a preview of perspectives" from firsthand sources like social media, Reddit, and other web forums, effectively linking your search queries with online conversations around similar topics. Google says this update aims to address that "people are increasingly seeking out advice from others" when searching for...

Jess Weatherbed·1 day ago

r/LocalLLaMA· COMMUNITY

An Open Benchmark for Testing RAG on Realistic Company-Internal Data

EnterpriseRAG-Bench: 500k-document synthetic dataset benchmarking RAG systems on realistic internal company data (Slack, email, tickets, PRs) vs. public corpora.

u/Weves11·1 day ago·41 pts / 14 comm

r/ClaudeAI· COMMUNITY

Voice + Claude my daily workflow for building stuff

Developer describes workflow using Claude voice for brainstorming during walks, then Claude Code for implementation.

u/dspv·1 day ago·24 pts / 31 comm

r/ClaudeAI· COMMUNITY

Dictation is the fastest way to work now, but how do you deal with the awkwardness of using it in an open office?

I'm a fast typer, but I find my projects go a lot better when I'm able to really dictate with Claude. I appreciate this won't be the case for all of you. At the moment I'm much more productive if I'm working from home or in a quiet space. There is a sensitivity setting on FluidVoice so I try to whisper, but so far it just ends up feeling too awkward and I go immediately back to typing. Also someone inevitably starts talking louder somewhere else in the office and the acoustics can impact what I'm saying. You can't express your questions and theories as freely as you'd like, because you'...

u/snowliondev·1 day ago·21 pts / 58 comm
