The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


How do I get support from Anthropic?!?!?!?!?!?!?!?

Reddit user reports billing issues and unresponsive support from Anthropic; anecdotal customer service complaint.

u/letmeinfornow·1 month ago·12 pts / 28 comm

Goal-driven Bayesian Optimal Experimental Design for Robust Decision-Making Under Model Uncertainty

GoBOED: goal-driven Bayesian experimental design framework optimizing information gathering for decision-critical objectives under model uncertainty.

Jinwoo Go·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

OrpQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization

OrpQuant: geometric orthogonal residual projection for power-of-two quantization enabling multiplier-free LLM/ViT deployment on edge devices.

Maoyang Xiang·1 month ago

r/LocalLLaMA· COMMUNITY

AI content detector based on Qwen 0.8b fine-tuned on Pangram dataset

I've fine-tuned Qwen 3.5 0.8B on the dataset provided by Pangram with their EditLens paper. It's available via a [Chrome extension](https://chromewebstore.google.com/detail/slop-hammer/gfjdmhfokmhedlgfggmmgchpppmhkdgg); you can just click selected text and it's going to give you the probability distribution of how likely it is AI-generated. It takes under 1s on my M1 MacBook Pro. Pangram did release Llama 3.2 3B trained on their dataset, but I found this model slightly too legacy (too big for the capabilities). Qwen 0.8B (base) ended up being as good after roughly 20h of fine-tuning on a sin...

u/jslominski·1 month ago·48 pts / 55 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Channel-wise Vector Quantization

Channel-wise Vector Quantization replaces patch-wise image tokenization with channel-wise quantization for autoregressive visual models.

Wei Song·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DiscoverPhysics: Benchmarking LLMs for Out-of-the-Box Scientific Thinking

DiscoverPhysics benchmark tests LLM reasoning by having agents discover physics laws in simulated worlds with non-standard dynamics.

Matt L. Wiemann·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

Claw-Anything benchmark evaluates LLM agents as always-on assistants with access to long-horizon histories and interdependent backend services.

Yusong Lin·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

VeriTrace: Evolving Mental Models for Deep Research Agents

VeriTrace proposes explicit feedback mechanisms for research agents to evolve mental models and prevent error propagation in uncertain information.

Haolang Zhao·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Automated Benchmark Auditing for AI Agents and Large Language Models

Auto Benchmark Audit uses an agentic framework to systematically audit AI benchmarks for hidden dependencies, specification gaps, and grading flaws.

Junlin Wang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Global Convergence of Wasserstein Policy Gradient for Entropy-Regularized Reinforcement Learning

Wasserstein Policy Gradient for entropy-regularized RL proves global convergence using optimal-transport geometry for continuous control.

Zhaoyu Zhu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

StakeBench: Evaluating Language Understanding Grounded in Market Commitment

StakeBench evaluates financial language understanding using real market commitments from 560K+ Polymarket and Manifold comments instead of human labels.

Yunhua Pei·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Active Query Synthesis for Preference Learning

Active Query Synthesis for preference learning uses a confidence-aware response model to reduce labeling cost for pairwise comparisons.

Namrata Nadagouda·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

WhoSaidIt: Human-LLM Collaborative Annotation for Text-Based Multilingual Speaker-Attribute Classification

WhoSaidIt applies human-LLM collaborative re-annotation to stabilize multilingual speaker-attribute labels via disagreement-focused sampling.

Lingyu Gao·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Rethinking Weak Supervision in Anomaly Detection: A Comprehensive Benchmark

WSADBench unifies weakly supervised anomaly detection evaluation across incomplete, inexact, and inaccurate supervision for 36 algorithms and 4 modalities.

Xu Yao·1 month ago

r/ClaudeAI· COMMUNITY

Stop letting Claude glaze your bad product ideas

Developer shares prompt-based skills, applied via custom instructions, that make Claude challenge product ideas rather than automatically validate them.

u/Global-Tradition-318·1 month ago·29 pts / 33 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Conditional KRR: Injecting Unpenalized Features into Kernel Methods with Applications to Kernel Thresholding

Conditional kernel ridge regression with unpenalized features via conditionally positive definite kernels; primarily theoretical ML contribution.

Rustem Takhanov·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Paris 2.0: A Decentralized Diffusion Model for Video Generation

Paris 2.0: first decentralized video generation model trained without GPU clusters, extending prior Paris 1.0 image work.

Ali Rouzbayani·1 month ago

r/LocalLLaMA· COMMUNITY

CUDA: add fast walsh-hadamard transform by am17an · Pull Request #23615 · ggml-org/llama.cpp

llama.cpp adds fast Walsh-Hadamard transform (FWHT) for CUDA, yielding 1–2% prompt-processing and 7–9% token-generation speedups with quantized KV-cache.

u/pmttyji·1 month ago·40 pts / 10 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Neuronal Stochastic Attention Circuit (NSAC) for Probabilistic Representation Learning

Neuronal Stochastic Attention Circuit for uncertainty in continuous-time representation learning using C. elegans-inspired circuits.

Waleed Razzaq·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Accelerating Bayesian inverse design in computational fluid dynamics using neural operators

Neural operator surrogates for Bayesian inverse design in CFD to accelerate MCMC sampling for aerodynamic geometry inference.

Bipin Tiwari·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Retrying vs Resampling in AI Control

Retrying vs resampling in AI control: retrying leaks exploitable monitor rationale; resampling preserves safety without information leakage.

James Lucassen·1 month ago

Anthropic· FRONTIER

Anthropic co-founder Chris Olah's remarks on Pope Leo XIV's encyclical "Magnifica humanitas"

Chris Olah comments on papal encyclical; off-topic for AI industry professionals unless directly addressing AI ethics.

Anthropic·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges

Multi-objective textual gradient optimization for LLM judges: gradient conflicts cause failure modes; six decomposition modes tested.

Parth Darshan·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals

Uncertainty quantification for activation oracles interpreting LLM internals; bootstrap frequency is best-calibrated confidence method.

Federico Torrielli·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

L2IR: Revealing Latent Intent in Graph Fraud Detection

L2IR: graph fraud detection combining GNNs and LLMs to infer fraudster intent behind suspicious connections.

Jinsheng Guo·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DRScaffold: Boosting Dense-Scene Reasoning in Lightweight Vision Language Models

DRScaffold: training lightweight vision-language models for dense-scene reasoning with explicit grounding between inference steps.

Xinrui Shi·1 month ago

r/singularity· COMMUNITY

One of the authors of "Attention is All You Need" just argued we should move past it. Pathway’s Post-Transformer debate is worth watching

Transformer paper co-author argues for moving beyond transformer architecture; Pathway's post-transformer research gaining attention from technical community.

u/_donothaveone_·1 month ago·177 pts / 57 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Peak-Then-Collapse and the Four Interface Channels of Knowledge-Graph Tool Use

Peak-then-collapse failure in GRPO tool-use training on knowledge graphs: four recurring failure modes shift rather than resolve.

Tianda Sun·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CITYREP: A Unified Benchmark for Urban Representations Across Cities, Tasks, and Modalities

CityRep benchmark evaluates urban representation learning across cities and modalities using spatially structured splits to prevent data leakage.

Junyuan Liu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Length Generalization with Log-Depth Recurrent Units

MLP-LDRU architecture improves length generalization in neural networks via log-depth recurrent units with associativity-biased operators.

Charles Pert·1 month ago

30 stories
